Papers
Preference Curriculum: LLMs Should Always Be Pretrained on Their Preferred Data
Xuemiao Zhang, Xu Liangyu, Feiyu Duan et al.
PreP-OCR: A Complete Pipeline for Document Image Restoration and Enhanced OCR Accuracy
Shuhao Guan, Moule Lin, Cheng Xu et al.
PreSumm: Predicting Summarization Performance Without Summarizing
Steven Koniaev, Ori Ernst, Jackie CK Cheung
Pretraining Context Compressor for Large Language Models with Embedding-Based Memory
Yuhong Dai, Jianxun Lian, Yitian Huang et al.
Pre-Training Curriculum for Multi-Token Prediction in Language Models
Ansar Aynetdinov, Alan Akbik
Pre-training Distillation for Large Language Models: A Design Space Exploration
Hao Peng, Xin Lv, Yushi Bai et al.
Pretraining Strategies using Monolingual and Parallel Data for Low-Resource Machine Translation
Idriss Nguepi Nguefack, Mara Finkelstein, Toadoum Sari Sakayo
Preventing Rogue Agents Improves Multi-Agent Collaboration
Ohav Barbi, Ori Yoran, Mor Geva
Principal Parts Detection for Computational Morphology: Task, Models and Benchmark
Dorin Keshales, Omer Goldman, Reut Tsarfaty
Principled Content Selection to Generate Diverse and Personalized Multi-Document Summaries
Vishakh Padmakumar, Zichao Wang, David Arbour et al.
Principled Understanding of Generalization for Generative Transformer Models in Arithmetic Reasoning Tasks
Xingcheng Xu, Zibo Zhao, Haipeng Zhang et al.
PRISM: A Framework for Producing Interpretable Political Bias Embeddings with Political-Aware Cross-Encoder
Yiqun Sun, Qiang Huang, Anthony Kum Hoe Tung et al.
PrivaCI-Bench: Evaluating Privacy with Contextual Integrity and Legal Compliance
Haoran Li, Wenbin Hu, Huihao Jing et al.
Privacy Preserving Data Selection for Bias Mitigation in Speech Models
Alkis Koudounas, Eliana Pastor, Vittorio Mazzia et al.
PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration
Ziqian Zeng, Jianwei Wang, Junyao Yang et al.
Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
Jaydeep Borkar, Matthew Jagielski, Katherine Lee et al.
Private Memorization Editing: Turning Memorization into a Defense to Strengthen Data Privacy in Large Language Models
Elena Sofia Ruzzetti, Giancarlo A. Xompero, Davide Venditti et al.
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models
Mingyang Song, Zhaochen Su, Xiaoye Qu et al.
Proactive Guidance of Multi-Turn Conversation in Industrial Search
Xiaoyu Li, Xiao Li, Li Gao et al.
Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models
Chenchen Yuan, Zheyu Zhang, Shuo Yang et al.
Probability-Consistent Preference Optimization for Enhanced LLM Reasoning
Yunqiao Yang, Houxing Ren, Zimu Lu et al.
ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks
Yan Yang, Dongxu Li, Haoning Wu et al.
Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set
Florian Eichin, Yang Janet Liu, Barbara Plank et al.
Probing Relative Interaction and Dynamic Calibration in Multi-modal Entity Alignment
Chenxiao Li, Jingwei Cheng, Qiang Tong et al.
Probing Subphonemes in Morphology Models
Gal Astrach, Yuval Pinter