Papers
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
Zhenhong Zhou, Haiyang Yu, Xinghua Zhang et al.
Enhancing Healthcare LLM Trust with Atypical Presentations Recalibration
Jeremy Qin, Bang Liu, Quoc Dinh Nguyen
Divide-or-Conquer? Which Part Should You Distill Your LLM?
Zhuofeng Wu, Richard He Bai, Aonan Zhang et al.
PSLM: Parallel Generation of Text and Speech with LLMs for Low-Latency Spoken Dialogue Systems
Kentaro Mitsui, Koh Mitsuda, Toshiaki Wakatsuki et al.
Are Large Language Models (LLMs) Good Social Predictors?
Kaiqi Yang, Hang Li, Hongzhi Wen et al.
Enhancing Temporal Modeling of Video LLMs via Time Gating
Zi-Yuan Hu, Yiwu Zhong, Shijia Huang et al.
On the Empirical Complexity of Reasoning and Planning in LLMs
Liwei Kang, Zirui Zhao, David Hsu et al.
Characterizing LLM Abstention Behavior in Science QA with Context Perturbations
Bingbing Wen, Bill Howe, Lucy Lu Wang
NormTab: Improving Symbolic Reasoning in LLMs Through Tabular Data Normalization
Md Mahadi Hasan Nahid, Davood Rafiei
Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification
Anisha Gunjal, Greg Durrett
UniSumEval: Towards Unified, Fine-grained, Multi-dimensional Summarization Evaluation for LLMs
Yuho Lee, Taewon Yun, Jason Cai et al.
The Fall of ROME: Understanding the Collapse of LLMs in Model Editing
Wanli Yang, Fei Sun, Jiajun Tan et al.
OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs
Jintian Zhang, Cheng Peng, Mengshu Sun et al.
Evaluating Moral Beliefs across LLMs through a Pluralistic Framework
Xuelin Liu, Yanfei Zhu, Shucheng Zhu et al.
In Defense of Structural Sparse Adapters for Concurrent LLM Serving
Junda Su, Zirui Liu, Zeju Qiu et al.
Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information Gain
Davide Mazzaccara, Alberto Testoni, Raffaella Bernardi
Shall We Team Up: Exploring Spontaneous Cooperation of Competing LLM Agents
Zengqing Wu, Run Peng, Shuyuan Zheng et al.
From Test-Taking to Test-Making: Examining LLM Authoring of Commonsense Assessment Items
Melissa Roemmele, Andrew Gordon
Using RL to Identify Divisive Perspectives Improves LLMs Abilities to Identify Communities on Social Media
Nikhil Mehta, Dan Goldwasser
Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring
Jiazheng Li, Hainiu Xu, Zhaoyue Sun et al.
Toolken+: Improving LLM Tool Usage with Reranking and a Reject Option
Konstantin Yakovlev, Sergey Nikolenko, Andrey Bout
Can LLMs Recognize Toxicity? A Structured Investigation Framework and Toxicity Metric
Hyukhun Koh, Dohyung Kim, Minwoo Lee et al.
How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs?
Ehsan Doostmohammadi, Oskar Holmström, Marco Kuhlmann
VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs
Ruotong Liao, Max Erler, Huiyu Wang et al.
CEAMC: Corpus and Empirical Study of Argument Analysis in Education via LLMs
Yupei Ren, Hongyi Wu, Zhaoguang Long et al.