Papers
PILOT-Bench: A Benchmark for Legal Reasoning in the Patent Domain with IRAC-Aligned Classification Tasks
Yehoon Jang, Chaewon Lee, Hyun-seok Min et al.
PIP: Perturbation-based Iterative Pruning for Large Language Models
Yi Cao, Wei-Jie Xu, Yucheng Shen et al.
Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages
Yongdong Chi, Hanqing Wang, Yun Chen et al.
Pixels Versus Priors: Controlling Knowledge Priors in Vision-Language Models through Visual Counterfacts
Michal Golovanevsky, William Rudman, Michael A. Lepori et al.
Plan Dynamically, Express Rhetorically: A Debate-Driven Rhetorical Framework for Argumentative Writing
Xueguan Zhao, Wenpeng Lu, Chaoqun Zheng et al.
PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving
Mihir Parmar, Xin Liu, Palash Goyal et al.
PlanGPT-VL: Enhancing Urban Planning with Domain-Specific Vision-Language Models
He Zhu, Junyou Su, Minxin Chen et al.
Planning-Aware Code Infilling via Horizon-Length Prediction
Yifeng Ding, Hantian Ding, Shiqi Wang et al.
PLAN-TUNING: Post-Training Language Models to Learn Step-by-Step Planning for Complex Problem Solving
Mihir Parmar, Palash Goyal, Xin Liu et al.
Playpen: An Environment for Exploring Learning From Dialogue Game Feedback
Nicola Horst, Davide Mazzaccara, Antonia Schmidt et al.
Please Translate Again: Two Simple Experiments on Whether Human-Like Reasoning Helps Translation
Di Wu, Seth Aycock, Christof Monz
PledgeTracker: A System for Monitoring the Fulfilment of Pledges
Yulong Chen, Michael Sejr Schlichtkrull, Zhenyun Deng et al.
PLLuM-Align: Polish Preference Dataset for Large Language Model Alignment
Karolina Seweryn, Anna Kołos, Agnieszka Karlińska et al.
Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance
Xixi Wang, Miguel Costa, Jordanka Kovaceva et al.
Pluralistic Alignment for Healthcare: A Role-Driven Framework
Jiayou Zhong, Anudeex Shetty, Chao Jia et al.
Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance
Xueqing Peng, Triantafillos Papadopoulos, Efstathia Soufleri et al.
P-MMEval: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs
Yidan Zhang, Yu Wan, Boyi Deng et al.
PMPO: Probabilistic Metric Prompt Optimization for Small and Large Language Models
ChenZhuo Zhao, Ziqian Liu, Xinda Wang et al.
Pointing to a Llama and Call it a Camel: On the Sycophancy of Multimodal Large Language Models
Renjie Pi, Kehao Miao, Li Peihang et al.
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion
Yuan Liu, Zhongyin Zhao, Le Tian et al.
PolBiX: Detecting LLMs’ Political Bias in Fact-Checking through X-phemisms
Charlott Jakob, David Harbecke, Patrick Parschan et al.
Polish-English medical knowledge transfer: A new benchmark and results
Łukasz Grzybowski, Jakub Pokrywka, Michał Ciesiółka et al.
PolitiSky24: U.S. Political Bluesky Dataset with User Stance Labels
Peyman Rostami, Vahid Rahimzadeh, Ali Adibi et al.
polyBART: A Chemical Linguist for Polymer Property Prediction and Generative Design
Anagha Savit, Harikrishna Sahu, Shivank S. Shukla et al.
PolyNorm: Few-Shot LLM-Based Text Normalization for Text-to-Speech
Michel Wong, Ali Alshehri, Sophia Kao et al.