Papers
Evaluating Cultural Knowledge and Reasoning in LLMs Through Persian Allusions
Melika Nobakhtian, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar
Saudi-Alignment Benchmark: Assessing LLMs Alignment with Cultural Norms and Domain Knowledge in the Saudi Context
Manal Alhassoun, Imaan Mohammed Alkhanen, Nouf Alshalawi et al.
AraHalluEval: A Fine-grained Hallucination Evaluation Framework for Arabic LLMs
Aisha Alansari, Hamzah Luqman
Can LLMs Directly Retrieve Passages for Answering Questions from Qur’an?
Sohaila Eltanbouly, Salam Albatarni, Shaimaa Hassanein et al.
Zero-Shot and Fine-Tuned Evaluation of Generative LLMs for Arabic Word Sense Disambiguation
Yossra Noureldien, Abdelrazig Mohamed, Farah Attallah
Bridging Dialectal Gaps in Arabic Medical LLMs through Model Merging
Ahmed Ibrahim, Abdullah Hosseini, Hoda Helmy et al.
Tool Calling for Arabic LLMs: Data Strategies and Instruction Tuning
Asım Ersoy, Enes Altinisik, Kareem Mohamed Darwish et al.
IslamicEval 2025: The First Shared Task of Capturing LLMs Hallucination in Islamic Content
Hamdy Mubarak, Rana Malhas, Watheq Mansour et al.
PalmX 2025: The First Shared Task on Benchmarking LLMs on Arabic and Islamic Culture
Fakhraddin Alwajih, Abdellah El Mekki, Hamdy Mubarak et al.
What did you say? Generating Child-Directed Speech Questions to Train LLMs
Whitney Poh, Michael Tombolini, Libby Barak
Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs
Himanshu Beniwal, Sailesh Panda, Birudugadda Srivibhav et al.
The Lookahead Limitation: Why Multi-Operand Addition is Hard for LLMs
Tanja Baeumel, Josef Van Genabith, Simon Ostermann
Can LLMs Detect Ambiguous Plural Reference? An Analysis of Split-Antecedent and Mereological Reference
Dang Thi Thao Anh, Rick Nouwen, Massimo Poesio
From BERT to LLMs: Comparing and Understanding Chinese Classifier Prediction in Language Models
Ziqi Zhang, Jianfei Ma, Emmanuele Chersoni et al.
What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks
Nathalie Maria Kirch, Constantin Niko Weisser, Severin Field et al.
Zero-Shot Belief: A Hard Problem for LLMs
John Murzaku, Owen Rambow
Mention detection with LLMs in pair-programming dialogue
Cecilia Domingo, Paul Piwek, Svetlana Stoyanchev et al.
Findings of the Fourth Shared Task on Multilingual Coreference Resolution: Can LLMs Dethrone Traditional Approaches?
Michal Novák, Miloslav Konopik, Anna Nedoluzhko et al.
Rethinking Search: A Study of University Students’ Perspectives on Using LLMs and Traditional Search Engines in Academic Problem Solving
Md. Faiyaz Abdullah Sayeedi, Md. Sadman Haque, Zobaer Ibn Razzaque et al.
Culturally-Aware Conversations: A Framework & Benchmark for LLMs
Shreya Havaldar, Young Min Cho, Sunny Rai et al.
MEETING DELEGATE: Benchmarking LLMs on Attending Meetings on Our Behalf
Lingxiang Hu, Shurun Yuan, Xiaoting Qin et al.
Syntactic Blind Spots: How Misalignment Leads to LLMs’ Mathematical Errors
Dane A Williamson, Yangfeng Ji, Matthew B. Dwyer
CoCo-CoLa: Evaluating and Improving Language Adherence in Multilingual LLMs
Elnaz Rahmati, Alireza Salkhordeh Ziabari, Morteza Dehghani
The Unreasonable Effectiveness of Model Merging for Cross-Lingual Transfer in LLMs
Lucas Bandarkar, Nanyun Peng
Reassessing Speech Translation for Low-Resource Languages: Do LLMs Redefine the State-of-the-Art Against Cascaded Models?
Jonah Dauvet, Min Ma, Jessica Ojo et al.