Papers
Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging
Hua Farn, Hsuan Su, Shachi H. Kumar et al.
Butterfly Effects in Toolchains: A Comprehensive Analysis of Failed Parameter Filling in LLM Tool-Agent Systems
Qian Xiong, Yuekai Huang, Ziyou Jiang et al.
FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question Answering
Yitao Long, Tiansheng Hu, Yilun Zhao et al.
Zero-shot Graph Reasoning via Retrieval Augmented Framework with LLMs
Hanqing Li, Sharika Mahadevan, Kiran Jyothi Sheena et al.
Dissecting Logical Reasoning in LLMs: A Fine-Grained Evaluation and Supervision Study
Yujun Zhou, Jiayi Ye, Zipeng Ling et al.
Faster and Better LLMs via Latency-Aware Test-Time Scaling
Zili Wang, Tianyu Zhang, Haoli Bai et al.
PolBiX: Detecting LLMs’ Political Bias in Fact-Checking through X-phemisms
Charlott Jakob, David Harbecke, Patrick Parschan et al.
Low-Hallucination and Efficient Coreference Resolution with LLMs
Yujian Gan, Yuan Liang, Jinxia Xie et al.
Your Mileage May Vary: How Empathy and Demographics Shape Human Preferences in LLM Responses
Yishan Wang, Amanda Cercas Curry, Flor Miriam Plaza-del-Arco
Choosing a Model, Shaping a Future: Comparing LLM Perspectives on Sustainability and its Relationship with AI
Annika Bush, Meltem Aksoy, Markus Pauly et al.
KurTail: Kurtosis-based LLM Quantization
Mohammad Sadegh Akhondzadeh, Aleksandar Bojchevski, Evangelos Eleftheriou et al.
LLMs Reproduce Stereotypes of Sexual and Gender Minorities
Ruby Ostrow, Adam Lopez
Accept or Deny? Evaluating LLM Fairness and Performance in Loan Approval across Table-to-Text Serialization Approaches
Israel Abebe Azime, Deborah D. Kanubala, Tejumade Afonja et al.
Understanding and Improving Information Preservation in Prompt Compression for LLMs
Weronika Łajewska, Momchil Hardalov, Laura Aina et al.
Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction
Yuanbo Xie, Yingjie Zhang, Tianyun Liu et al.
Distributed LLM Serving on Consumer-Grade GPUs by Reconciling Computation and Communication
Lewei Jin, Kui Zhang, Yongqi Chen et al.
SafeToolBench: Pioneering a Prospective Benchmark to Evaluating Tool Utilization Safety in LLMs
Hongfei Xia, Hongru Wang, Zeming Liu et al.
Beneath the Facade: Probing Safety Vulnerabilities in LLMs via Auto-Generated Jailbreak Prompts
Heehyeon Kim, Kyeongryul Lee, Joyce Jiyoung Whang
Can Role Vectors Affect LLM Behaviour?
Daniele Potertì, Andrea Seveso, Fabio Mercorio
Layer Duplication in LLMs
Neo Eyal, Nachum Dershowitz, Kfir Bar
InFact: Informativeness Alignment for Improved LLM Factuality
Roi Cohen, Russa Biswas, Gerard de Melo
Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs
Gaye Colakoglu, Gürkan Solmaz, Jonathan Fürst
Following Occam’s Razor: Dynamic Combination of Structured Knowledge for Multi-Hop Question Answering using LLMs
Wei Chen, Zhi Zheng, Lili Zhao et al.
AssistedDS: Benchmarking How External Domain Knowledge Assists LLMs in Automated Data Science
An Luo, Xun Xian, Jin Du et al.
No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users
Mengxuan Hu, Hongyi Wu, Ronghang Zhu et al.