Papers
HARMONIC: Harnessing LLMs for Tabular Data Synthesis and Privacy Protection
Yuxin Wang, Duanyu Feng, Yongfu Dai et al.
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
Saleh Ashkboos, Amirkeivan Mohtashami, Maximilian L. Croci et al.
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making
Manling Li, Shiyu Zhao, Qineng Wang et al.
HYDRA: Model Factorization Framework for Black-Box LLM Personalization
Yuchen Zhuang, Haotian Sun, Yue Yu et al.
Transfer Q-star: Principled Decoding for LLM Alignment
Souradip Chakraborty, Soumya Suvra Ghosal, Ming Yin et al.
UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation
Hanzhang Zhou, Zijian Feng, Zixiao Zhu et al.
HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection
Xuefeng Du, Chaowei Xiao, Yixuan Li
ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLMs
Zhaochen Su, Jun Zhang, Xiaoye Qu et al.
Reinforcing LLM Agents via Policy Optimization with Action Decomposition
Muning Wen, Ziyu Wan, Jun Wang et al.
Distributional Preference Alignment of LLMs via Optimal Transport
Igor Melnyk, Youssef Mroueh, Brian Belgodere et al.
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Yuri Kuratov, Aydar Bulatov, Petr Anokhin et al.
LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language
James Requeima, John Bronskill, Dami Choi et al.
WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia
Yufang Hou, Alessandra Pascale, Javier Carnerero-Cano et al.
Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents
Giorgio Piatti, Zhijing Jin, Max Kleiman-Weiner et al.
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
Sukmin Yun, Haokun Lin, Rusiru Thushara et al.
When LLMs Meet Cunning Texts: A Fallacy Understanding Benchmark for Large Language Models
Yinghui Li, Qingyu Zhou, Yuanzhen Luo et al.
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention
Tianyi Zhang, Jonah Yi, Bowen Yao et al.
ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction
Renze Chen, Zhuofeng Wang, Beiquan Cao et al.
Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context
Jingru Jia, Zehua Yuan, Junhao Pan et al.
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
Zirui Wang, Mengzhou Xia, Luxi He et al.
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design
Ruisi Cai, Yeonju Ro, Geon-Woo Kim et al.
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff
Hao Tang, Keya Hu, Jin Peng Zhou et al.
Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates
Kaifeng Lyu, Haoyu Zhao, Xinran Gu et al.
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs
Zhongshen Zeng, Yinhong Liu, Yingjia Wan et al.
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
Chaojun Xiao, Pengle Zhang, Xu Han et al.