Research Explorer

Exploring the Impact of Language Switching on Personality Traits in LLMs

Jacopo Amidei, Jose Gregorio Ferreira De Sá, Rubén Nieto Luna et al.

2025 COLING

LLMs Know What They Need: Leveraging a Missing Information Guided Framework to Empower Retrieval-Augmented Generation

Keheng Wang, Feiyu Duan, Peiguang Li et al.

2025 COLING

LLM Sensitivity Challenges in Abusive Language Detection: Instruction-Tuned vs. Human Feedback

Yaqi Zhang, Viktor Hangya, Alexander Fraser

2025 COLING

ALYMPICS: LLM Agents Meet Game Theory

Shaoguang Mao, Yuzhe Cai, Yan Xia et al.

2025 COLING

Intention Analysis Makes LLMs A Good Jailbreak Defender

Yuqi Zhang, Liang Ding, Lefei Zhang et al.

2025 COLING

Towards Understanding Multi-Task Learning (Generalization) of LLMs via Detecting and Exploring Task-Specific Neurons

Yongqi Leng, Deyi Xiong

2025 COLING

LLM Sensitivity Evaluation Framework for Clinical Diagnosis

Chenwei Yan, Xiangling Fu, Yuxuan Xiong et al.

2025 COLING

Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation

Siyuan Wang, Zhuohan Long, Zhihao Fan et al.

2025 COLING

Controlling Out-of-Domain Gaps in LLMs for Genre Classification and Generated Text Detection

Dmitri Roussinov, Serge Sharoff, Nadezhda Puchnina

2025 COLING

Finetuning LLMs for Comparative Assessment Tasks

Vatsal Raina, Adian Liusie, Mark Gales

2025 COLING

Hermit Kingdom Through the Lens of Multiple Perspectives: A Case Study of LLM Hallucination on North Korea

Eunjung Cho, Won Ik Cho, Soomin Seo

2025 COLING

HLU: Human Vs LLM Generated Text Detection Dataset for Urdu at Multiple Granularities

Iqra Ali, Jesse Atuhurra, Hidetaka Kamigaito et al.

2025 COLING

ChatCite: LLM Agent with Human Workflow Guidance for Comparative Literature Summary

Yutong Li, Lu Chen, Aiwei Liu et al.

2025 COLING

Revisiting Implicitly Abusive Language Detection: Evaluating LLMs in Zero-Shot and Few-Shot Settings

Julia Jaremko, Dagmar Gromann, Michael Wiegand

2025 COLING

Can LLMs Clarify? Investigation and Enhancement of Large Language Models on Argument Claim Optimization

Yiran Wang, Ben He, Xuanang Chen et al.

2025 COLING

AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs

Basel Mousi, Nadir Durrani, Fatema Ahmad et al.

2025 COLING

How Credible Is an Answer From Retrieval-Augmented LLMs? Investigation and Evaluation With Multi-Hop QA

Yujia Zhou, Zheng Liu, Zhicheng Dou

2025 COLING

Is Parameter Collision Hindering Continual Learning in LLMs?

Shuo Yang, Kun-Peng Ning, Yu-Yang Liu et al.

2025 COLING

Large Language Models are good multi-lingual learners: When LLMs meet cross-lingual prompts

Teng Wang, Zhenqi He, Wing-Yin Yu et al.

2025 COLING

QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs

Mohammad Aflah Khan, Neemesh Yadav, Sarah Masud et al.

2025 COLING

What’s the most important value? INVP: INvestigating the Value Priorities of LLMs through Decision-making in Social Scenarios

Xuelin Liu, Pengyuan Liu, Dong Yu

2025 COLING

BasqBBQ: A QA Benchmark for Assessing Social Biases in LLMs for Basque, a Low-Resource Language

Muitze Zulaika, Xabier Saralegi

2025 COLING

Interactive Evaluation for Medical LLMs via Task-oriented Dialogue System

Ruoyu Liu, Kui Xue, Xiaofan Zhang et al.

2025 COLING

Extracting structure from an LLM - how to improve on surprisal-based models of Human Language Processing

Daphne P. Wang, Mehrnoosh Sadrzadeh, Miloš Stanojević et al.

2025 COLING

What Makes Cryptic Crosswords Challenging for LLMs?

Abdelrahman Sadallah, Daria Kotova, Ekaterina Kochmar

2025 COLING
