Papers
Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data
Qiongqiong Wang, Hardik Bhupendra Sailor, Tianchi Liu et al.
ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization
Zhensheng Jin, Xinze Li, Yifan Ji et al.
Training with Fewer Bits: Unlocking Edge LLMs Training with Stochastic Rounding
Taowen Liu, Marta Andronic, Deniz Gunduz et al.
Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare
Hiba Ahsan, Arnab Sen Sharma, Silvio Amir et al.
Trust Me, I’m Wrong: LLMs Hallucinate with Certainty Despite Knowing the Answer
Adi Simhi, Itay Itzhak, Fazl Barez et al.
Evaluating the Creativity of LLMs in Persian Literary Text Generation
Armin Tourajmehr, Mohammad Reza Modarres, Yadollah Yaghoobzadeh
“Going to a trap house” conveys more fear than “Going to a mall”: Benchmarking Emotion Context Sensitivity for LLMs
Eojin Jeon, Mingyu Lee, Sangyun Kim et al.
Neutral Is Not Unbiased: Evaluating Implicit and Intersectional Identity Bias in LLMs Through Structured Narrative Scenarios
Saba Ghanbari Haez, Mauro Dragoni
Can LLMs Be Efficient Predictors of Conversational Derailment?
Kaustubh Olpadkar, Vikram Sunil Bajaj, Leslie Barrett
Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs
Yixiao Zhou, Ziyu Zhao, Dongzhou Cheng et al.
LLMs Can Compensate for Deficiencies in Visual Representations
Sho Takishita, Jay Gala, Abdelrahman Mohamed et al.
Exploring Paraphrasing Strategies for CEFR A1-Level Constraints in LLMs
Eugenio Marzona, Maria Goikhman, Alessio Palmero Aprosio et al.
ULTRABENCH: Benchmarking LLMs under Extreme Fine-grained Text Generation
Longfei Yun, Letian Peng, Jingbo Shang
The Price of Format: Diversity Collapse in LLMs
Longfei Yun, Chenyang An, Zilong Wang et al.
LLMs for Bayesian Optimization in Scientific Domains: Are We There Yet?
Rushil Gupta, Jason Hartford, Bang Liu
MFTCXplain: A Multilingual Benchmark Dataset for Evaluating the Moral Reasoning of LLMs through Multi-hop Hate Speech Explanation
Jackson Trager, Francielle Vargas, Diego Alves et al.
Fine-tuning LLMs with Cross-Attention-based Weight Decay for Bias Mitigation
Farsheed Haque, Zhe Fu, Depeng Xu et al.
Post-hoc Study of Climate Microtargeting on Social Media Ads with LLMs: Thematic Insights and Fairness Evaluation
Tunazzina Islam, Dan Goldwasser
FSTs vs ICL: Generalisation in LLMs for an under-resourced language
Ximena Gutierrez, Mikel Segura Elizalde, Victor Mijangos
SRM-LLM: Semantic Relationship Mining with LLMs for Temporal Knowledge Graph Extrapolation
Fu Zhang, Panfeng Zhang, Jingwei Cheng
Bridging the Creativity Understanding Gap: Small-Scale Human Alignment Enables Expert-Level Humor Ranking in LLMs
Kuan Lok Zhou, Jiayi Chen, Siddharth Suresh et al.
DrKGC: Dynamic Subgraph Retrieval-Augmented LLMs for Knowledge Graph Completion across General and Biomedical Domains
Yongkang Xiao, Sinian Zhang, Yi Dai et al.
Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging
Hua Farn, Hsuan Su, Shachi H. Kumar et al.
FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question Answering
Yitao Long, Tiansheng Hu, Yilun Zhao et al.
Zero-shot Graph Reasoning via Retrieval Augmented Framework with LLMs
Hanqing Li, Sharika Mahadevan, Kiran Jyothi Sheena et al.