Papers
Probabilistic Soundness Guarantees in LLM Reasoning Chains
Weiqiu You, Anton Xue, Shreya Havaldar et al.
Probing and Boosting Large Language Models Capabilities via Attention Heads
Dezhi Zhao, Xin Liu, Xiaocheng Feng et al.
Probing for Arithmetic Errors in Language Models
Yucheng Sun, Alessandro Stolfo, Mrinmaya Sachan
Probing Gender Bias in Multilingual LLMs: A Case Study of Stereotypes in Persian
Ghazal Kalhor, Behnam Bahrak
Probing LLM World Models: Enhancing Guesstimation with Wisdom of Crowds Decoding
Yun-Shiuan Chuang, Sameer Narendran, Nikunj Harlalka et al.
Probing Logical Reasoning of MLLMs in Scientific Diagrams
Yufei Wang, Adriana Kovashka
Probing Narrative Morals: A New Character-Focused MFT Framework for Use with Large Language Models
Luca Mitran, Sophie Wu, Andrew Piper
Probing Semantic Routing in Large Mixture-of-Expert Models
Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck et al.
Probing the Limits of Multilingual Language Understanding: Low-Resource Language Proverbs as LLM Benchmark for AI Wisdom
Surendrabikram Thapa, Kritesh Rauniyar, Hariram Veeramani et al.
Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs
Gaye Colakoglu, Gürkan Solmaz, Jonathan Fürst
Procedural Environment Generation for Tool-Use Agents
Michael Sullivan, Mareike Hartmann, Alexander Koller
Process-Supervised Reinforcement Learning for Code Generation
Yufan Ye, Ting Zhang, Wenbin Jiang et al.
Process-Supervised Reward Models for Verifying Clinical Note Generation: A Scalable Approach Guided by Domain Expertise
Hanyin Wang, Chufan Gao, Qiping Xu et al.
ProCut: LLM Prompt Compression via Attribution Estimation
Zhentao Xu, Fengyi Li, Albert C. Chen et al.
ProcVQA: Benchmarking the Effects of Structural Properties in Mined Process Visualizations on Vision–Language Model Performance
Kazi Tasnim Zinat, Saad Mohammad Abrar, Shoumik Saha et al.
ProcWorld: Benchmarking Large Model Planning in Reachability-Constrained Environments
Dong Wang, Xinghang Li, Zhengshen Zhang et al.
ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions
Jingheng Ye, Yong Jiang, Xiaobin Wang et al.
Profiler: Black-box AI-generated Text Origin Detection via Context-aware Inference Pattern Analysis
Hanxi Guo, Siyuan Cheng, Xiaolong Jin et al.
Profiling LLM’s Copyright Infringement Risks under Adversarial Persuasive Prompting
Jikai Long, Ming Liu, Xiusi Chen et al.
Program of Thoughts for Financial Reasoning: Leveraging Dynamic In-Context Examples and Generative Retrieval
Subhendu Khatuya, Shashwat Naidu, Pawan Goyal et al.
Progressive Facial Granularity Aggregation with Bilateral Attribute-based Enhancement for Face-to-Speech Synthesis
Yejin Jeon, Youngjae Kim, Jihyun Lee et al.
ProLongVid: A Simple but Strong Baseline for Long-context Video Instruction Tuning
Rui Wang, Bohao Li, Xiyang Dai et al.
Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries
Tianyi Lorena Yan, Robin Jia