Papers
Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation
Stefan Vasilev, Christian Herold, Baohao Liao et al.
Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation
Tharindu Kumarage, Ninareh Mehrabi, Anil Ramakrishna et al.
Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from LLMs
Puxuan Yu, Daniel Cohen, Hemank Lamba et al.
Evaluating LLMs’ Mathematical and Coding Competency through Ontology-guided Interventions
Pengfei Hong, Navonil Majumder, Deepanway Ghosal et al.
GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
Sifan Zhou, Shuo Wang, Zhihang Yuan et al.
Evaluation of LLMs in Medical Text Summarization: The Role of Vocabulary Adaptation in High OOV Settings
Gunjan Balde, Soumyadeep Roy, Mainack Mondal et al.
Dynamic Personality in LLM Agents: A Framework for Evolutionary Modeling and Behavioral Analysis in the Prisoner’s Dilemma
Weiqi Zeng, Bo Wang, Dongming Zhao et al.
Are the Values of LLMs Structurally Aligned with Humans? A Causal Perspective
Yipeng Kang, Junqi Wang, Yexin Li et al.
LLMs Can Also Do Well! Breaking Barriers in Semantic Role Labeling via Large Language Models
Xinxin Li, Huiyao Chen, Chengjun Liu et al.
Edit Once, Update Everywhere: A Simple Framework for Cross-Lingual Knowledge Synchronization in LLMs
Yuchen Wu, Liang Ding, Li Shen et al.
The Law of Knowledge Overshadowing: Towards Understanding, Predicting and Preventing LLM Hallucination
Yuji Zhang, Sha Li, Cheng Qian et al.
Un-considering Contextual Information: Assessing LLMs’ Understanding of Indexical Elements
Metehan Oğuz, Yavuz Faruk Bakman, Duygu Nur Yaldiz
The Behavior Gap: Evaluating Zero-shot LLM Agents in Complex Task-Oriented Dialogs
Avinash Baidya, Kamalika Das, Xiang Gao
LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding
Junlong Tong, Jinlan Fu, Zixuan Lin et al.
Revisiting 3D LLM Benchmarks: Are We Really Testing 3D Capabilities?
Jiahe Jin, Yanheng He, Mingyan Yang
Relevant or Random: Can LLMs Truly Perform Analogical Reasoning?
Chengwei Qin, Wenhan Xia, Tan Wang et al.
KaFT: Knowledge-aware Fine-tuning for Boosting LLMs’ Domain-specific Question-Answering Performance
Qihuang Zhong, Liang Ding, Xiantao Cai et al.
A rebuttal of two common deflationary stances against LLM cognition
Zak Hussain, Rui Mata, Dirk U. Wulff
COVER: Context-Driven Over-Refusal Verification in LLMs
Giovanni Sullutrone, Riccardo A. Vigliermo, Sonia Bergamaschi et al.
Missing the Margins: A Systematic Literature Review on the Demographic Representativeness of LLMs
Indira Sen, Marlene Lutz, Elisa Rogers et al.
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs
Omkar Thawakar, Dinura Dissanayake, Ketan Pravin More et al.
When Benchmarks Talk: Re-Evaluating Code LLMs with Interactive Feedback
Jane Pan, Ryan Shar, Jacob Pfau et al.
Token-Budget-Aware LLM Reasoning
Tingxu Han, Zhenting Wang, Chunrong Fang et al.
TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking
Shahriar Kabir Nahin, Rabindra Nath Nandi, Sagor Sarker et al.
Let’s Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Robust and Instruction-Aware ASR and OCR
Chan-Jan Hsu, Yi-Chang Chen, Feng-Ting Liao et al.