Papers
SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs
Zhiqiang Liu, Enpei Niu, Yin Hua et al.
From Implicit Exploration to Structured Reasoning: Guideline and Refinement for LLMs
Jiaxiang Chen, Zhuo Wang, Mingxi Zou et al.
MPO: Boosting LLM Agents with Meta Plan Optimization
Weimin Xiong, Yifan Song, Qingxiu Dong et al.
FlexQuant: A Flexible and Efficient Dynamic Precision Switching Framework for LLM Quantization
Fangxin Liu, Zongwu Wang, Jinhong Xia et al.
Benchmarking Uncertainty Metrics for LLM Target-Aware Search
Pei-Fu Guo, Yun-Da Tsai, Shou-De Lin
Recipe2Plan: Evaluating Planning Abilities of LLMs for Efficient and Feasible Multitasking with Time Constraints Between Actions
Zirui Wu, Xiao Liu, Jiayi Li et al.
Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization
Zhengzhao Lai, Youbin Zheng, Zhenyang Cai et al.
Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab World
Saeed Almheiri, Rania Elbadry, Mena Attia et al.
Forewarned is Forearmed: Pre-Synthesizing Jailbreak-like Instructions to Enhance LLM Safety Guardrail to Potential Attacks
Sheng Liu, Qiang Sheng, Danding Wang et al.
GenPoE: Generative Passage-level Mixture of Experts for Knowledge Enhancement of LLMs
Xuebing Liu, Shanbao Qiao, Seung-Hoon Na
X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Jailbreak Attacks without Compromising Usability
Xiaoya Lu, Dongrui Liu, Yi Yu et al.
The “r” in “woman” stands for rights. Auditing LLMs in Uncovering Social Dynamics in Implicit Misogyny
Arianna Muti, Chris Emmery, Debora Nozza et al.
LLMs are Privacy Erasable
Zipeng Ye, Wenjian Luo
DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation
Abdelrahman Abdallah, Jamshid Mozafari, Bhawna Piryani et al.
CANDY: Benchmarking LLMs’ Limitations and Assistive Potential in Chinese Misinformation Fact-Checking
Ruiling Guo, Xinwei Yang, Chen Huang et al.
LLM Jailbreak Detection for (Almost) Free!
Guorui Chen, Yifan Xia, Xiaojun Jia et al.
Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance
Xixi Wang, Miguel Costa, Jordanka Kovaceva et al.
Constructing Your Model’s Value Distinction: Towards LLM Alignment with Anchor Words Tuning
Zhen Yang, Ping Jian, Chengzhi Li et al.
Do LLMs Know and Understand Domain Conceptual Knowledge?
Sijia Shen, Feiyan Jiang, Peiyan Wang et al.
Agent Laboratory: Using LLM Agents as Research Assistants
Samuel Schmidgall, Yusheng Su, Ze Wang et al.
Regularized Contrastive Decoding with Hard Negative Samples for LLM Hallucination Mitigation
Haonan Sheng, Dou Hu, Lingwei Wei et al.
OSC: Cognitive Orchestration through Dynamic Knowledge Alignment in Multi-Agent LLM Collaboration
Jusheng Zhang, Yijia Fan, Kaitong Cai et al.
Can LLMs Find a Needle in a Haystack? A Look at Anomaly Detection Language Modeling
Leslie Barrett, Vikram Sunil Bajaj, Robert John Kingan
SIFT: Grounding LLM Reasoning in Contexts via Stickers
Zihao Zeng, Xuyao Huang, Boxiu Li et al.
LUME: LLM Unlearning with Multitask Evaluations
Anil Ramakrishna, Yixin Wan, Xiaomeng Jin et al.