Papers
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
Saleh Ashkboos, Amirkeivan Mohtashami, Maximilian L. Croci et al.
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making
Manling Li, Shiyu Zhao, Qineng Wang et al.
ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLMs
Zhaochen Su, Jun Zhang, Xiaoye Qu et al.
Distributional Preference Alignment of LLMs via Optimal Transport
Igor Melnyk, Youssef Mroueh, Brian Belgodere et al.
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Yuri Kuratov, Aydar Bulatov, Petr Anokhin et al.
WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia
Yufang Hou, Alessandra Pascale, Javier Carnerero-Cano et al.
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
Sukmin Yun, Haokun Lin, Rusiru Thushara et al.
When LLMs Meet Cunning Texts: A Fallacy Understanding Benchmark for Large Language Models
Yinghui Li, Qingyu Zhou, Yuanzhen Luo et al.
Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context
Jingru Jia, Zehua Yuan, Junhao Pan et al.
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
Zirui Wang, Mengzhou Xia, Luxi He et al.
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design
Ruisi Cai, Yeonju Ro, Geon-Woo Kim et al.
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff
Hao Tang, Keya Hu, Jin Peng Zhou et al.
Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates
Kaifeng Lyu, Haoyu Zhao, Xinran Gu et al.
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs
Zhongshen Zeng, Yinhong Liu, Yingjia Wan et al.
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
Chaojun Xiao, Pengle Zhang, Xu Han et al.
RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
Yue Yu, Wei Ping, Zihan Liu et al.
Crafting Interpretable Embeddings for Language Neuroscience by Asking LLMs Questions
Vinamra Benara, Chandan Singh, John X. Morris et al.
Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs
Mustafa Shukor, Matthieu Cord
Can LLMs Solve Molecule Puzzles? A Multimodal Benchmark for Molecular Structure Elucidation
Kehan Guo, Bozhao Nan, Yujun Zhou et al.
Truth is Universal: Robust Detection of Lies in LLMs
Lennart Bürger, Fred A. Hamprecht, Boaz Nadler
Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data
Johannes Treutlein, Dami Choi, Jan Betley et al.
CLUES: Collaborative Private-domain High-quality Data Selection for LLMs via Training Dynamics
Wanru Zhao, Hongxiang Fan, Shell Xu Hu et al.
Repair Is Nearly Generation: Multilingual Program Repair with LLMs
Harshit Joshi, José Cambronero Sanchez, Sumit Gulwani et al.
Generating Novel Leads for Drug Discovery Using LLMs with Logical Feedback
Shreyas Bhat Brahmavar, Ashwin Srinivasan, Tirtharaj Dash et al.
Omnipotent Distillation with LLMs for Weakly-Supervised Natural Language Video Localization: When Divergence Meets Consistency
Peijun Bao, Zihao Shao, Wenhan Yang et al.