Papers

5,479 papers found · 435 more without abstracts hidden

Humanity’s Last Code Exam: Can Advanced LLMs Conquer Human’s Hardest Code Competition?

Xiangyang Li, Xiaopeng Li, Kuicai Dong et al.

2025 EMNLP

Can LLMs Judge Debates? Evaluating Non-Linear Reasoning via Argumentation Theory Semantics

Reza Sanayei, Srdjan Vesic, Eduardo Blanco et al.

2025 EMNLP

LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture

Xidong Wang, Dingjie Song, Shunian Chen et al.

2025 EMNLP

CLAIMCHECK: How Grounded are LLM Critiques of Scientific Papers?

Jiefu Ou, William Walden, Kate Sanders et al.

2025 EMNLP

KoACD: The First Korean Adolescent Dataset for Cognitive Distortion Analysis via Role-Switching Multi-LLM Negotiation

Jun Seo Kim, Hye Hyeon Kim

2025 EMNLP

Temporal Consistency for LLM Reasoning Process Error Identification

Jiacheng Guo, Yue Wu, Jiahao Qiu et al.

2025 EMNLP

Presumed Cultural Identity: How Names Shape LLM Responses

Siddhesh Milind Pawar, Arnav Arora, Lucie-Aimée Kaffee et al.

2025 EMNLP

Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching

Seoyeon Kim, Huiseo Kim, Chanjun Park et al.

2025 EMNLP

Challenging the Evaluator: LLM Sycophancy Under User Rebuttal

Sung Won Kim, Daniel Khashabi

2025 EMNLP

Quantifying the Risks of LLM- and Tool-assisted Rephrasing to Linguistic Diversity

Mengying Wang, Andreas Spitz

2025 EMNLP

DORM: Preference Data Weights Optimization for Reward Modeling in LLM Alignment

Rongzhi Zhang, Chenwei Zhang, Xinyang Zhang et al.

2025 EMNLP

From Insight to Exploit: Leveraging LLM Collaboration for Adaptive Adversarial Text Generation

Najrin Sultana, Md Rafi Ur Rashid, Kang Gu et al.

2025 EMNLP

Instability in Downstream Task Performance During LLM Pretraining

Yuto Nishida, Masaru Isonuma, Yusuke Oda

2025 EMNLP

MAKIEval: A Multilingual Automatic WiKidata-based Framework for Cultural Awareness Evaluation for LLMs

Raoyuan Zhao, Beiduo Chen, Barbara Plank et al.

2025 EMNLP

AGENTVIGIL: Automatic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents

Zhun Wang, Vincent Siu, Zhe Ye et al.

2025 EMNLP

Do We Know What LLMs Don’t Know? A Study of Consistency in Knowledge Probing

Raoyuan Zhao, Abdullatif Köksal, Ali Modarressi et al.

2025 EMNLP

Context Length Alone Hurts LLM Performance Despite Perfect Retrieval

Yufeng Du, Minyang Tian, Srikanth Ronanki et al.

2025 EMNLP

PROOD: A Simple LLM Out-of-Distribution Guardrail Leveraging Response Semantics

Joshua Tint

2025 EMNLP

ICL-Bandit: Relevance Labeling in Advertisement Recommendation Systems via LLM

Lu Wang, Chiming Duan, Pu Zhao et al.

2025 EMNLP

Unequal Scientific Recognition in the Age of LLMs

Yixuan Liu, Abel Elekes, Jianglin Lu et al.

2025 EMNLP

Using tournaments to calculate AUROC for zero-shot classification with LLMs

WonJin Yoon, Ian Bulovic, Timothy A. Miller

2025 EMNLP

D2CS - Documents Graph Clustering using LLM supervision

Yoel Ashkenazi, Etzion Harari, Regev Yehezkel Imra et al.

2025 EMNLP

FaStFact: Faster, Stronger Long-Form Factuality Evaluations in LLMs

Yingjia Wan, Haochen Tan, Xiao Zhu et al.

2025 EMNLP

PropXplain: Can LLMs Enable Explainable Propaganda Detection?

Maram Hasanain, Md Arid Hasan, Mohamed Bayan Kmainasi et al.

2025 EMNLP

Reveal and Release: Iterative LLM Unlearning with Self-generated Data

Linxi Xie, Xin Teng, Shichang Ke et al.

2025 EMNLP