Papers
2,781 papers found
Emphasising Structured Information: Integrating Abstract Meaning Representation into LLMs for Enhanced Open-Domain Dialogue Evaluation
Bohao Yang, Kun Zhao, Dong Liu et al.
Crafting Customisable Characters with LLMs: A Persona-Driven Role-Playing Agent Framework
Bohao Yang, Dong Liu, Chenghao Xiao et al.
Can LLMs Express Personality Across Cultures? Introducing CulturalPersonas for Evaluating Trait Alignment
Priyanka Dey, Aayush Bothra, Yugal Khanter et al.
When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs
Mikhail Seleznyov, Mikhail Chaichuk, Gleb Ershov et al.
QA‐LIGN: Aligning LLMs through Constitutionally Decomposed QA
Jacob Dineen, Aswin Rrv, Qin Liu et al.
Pruning Weights but Not Truth: Safeguarding Truthfulness While Pruning LLMs
Yao Fu, Runchao Li, Xianxuan Long et al.
SCoder: Progressive Self-Distillation for Bootstrapping Small-Scale Data Synthesizers to Empower Code LLMs
Xinyu Zhang, Changzhi Zhou, Linmei Hu et al.
Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks
Eileen Pan, Anna Seo Gyeong Choi, Maartje Ter Hoeve et al.
Humanity’s Last Code Exam: Can Advanced LLMs Conquer Human’s Hardest Code Competition?
Xiangyang Li, Xiaopeng Li, Kuicai Dong et al.
Can LLMs Judge Debates? Evaluating Non-Linear Reasoning via Argumentation Theory Semantics
Reza Sanayei, Srdjan Vesic, Eduardo Blanco et al.
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture
Xidong Wang, Dingjie Song, Shunian Chen et al.
Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching
Seoyeon Kim, Huiseo Kim, Chanjun Park et al.
MAKIEval: A Multilingual Automatic WiKidata-based Framework for Cultural Awareness Evaluation for LLMs
Raoyuan Zhao, Beiduo Chen, Barbara Plank et al.
Do We Know What LLMs Don’t Know? A Study of Consistency in Knowledge Probing
Raoyuan Zhao, Abdullatif Köksal, Ali Modarressi et al.
Unequal Scientific Recognition in the Age of LLMs
Yixuan Liu, Abel Elekes, Jianglin Lu et al.
Using tournaments to calculate AUROC for zero-shot classification with LLMs
WonJin Yoon, Ian Bulovic, Timothy A. Miller
FaStFact: Faster, Stronger Long-Form Factuality Evaluations in LLMs
Yingjia Wan, Haochen Tan, Xiao Zhu et al.
PropXplain: Can LLMs Enable Explainable Propaganda Detection?
Maram Hasanain, Md Arid Hasan, Mohamed Bayan Kmainasi et al.
Efficient Latent Semantic Clustering for Scaling Test-Time Computation of LLMs
Sungjae Lee, Hoyoung Kim, Jeongyeon Hwang et al.
Under the Shadow of Babel: How Language Shapes Reasoning in LLMs
Chenxi Wang, Yixuan Zhang, Lang Gao et al.
Exploring Context Strategies in LLMs for Discourse-Aware Machine Translation
Ritvik Choudhary, Rem Hida, Masaki Hamada et al.
DIPLomA: Efficient Adaptation of Instructed LLMs to Low-Resource Languages via Post-Training Delta Merging
Ixak Sarasua, Ander Corral, Xabier Saralegi
Are LLMs Empathetic to All? Investigating the Influence of Multi-Demographic Personas on a Model’s Empathy
Ananya Malik, Nazanin Sabri, Melissa M. Karnaze et al.
SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks
Fenia Christopoulou, Ronald Cardenas, Gerasimos Lampouras et al.
Can We Edit LLMs for Long-Tail Biomedical Knowledge?
Xinhao Yi, Jake Lever, Kevin Bryson et al.