Research Explorer

Can LLMs Find a Needle in a Haystack? A Look at Anomaly Detection Language Modeling

Leslie Barrett, Vikram Sunil Bajaj, Robert John Kingan

2025 EMNLP

Can LLMs Generate and Solve Linguistic Olympiad Puzzles?

Neh Majmudar, Elena Filatova

2025 EMNLP

Can LLMs Help You at Work? A Sandbox for Evaluating LLM Agents in Enterprise Environments

Harsh Vishwakarma, Ankush Agarwal, Ojas Patil et al.

2025 EMNLP

Can LLMs Judge Debates? Evaluating Non-Linear Reasoning via Argumentation Theory Semantics

Reza Sanayei, Srdjan Vesic, Eduardo Blanco et al.

2025 EMNLP

Can LLMs Narrate Tabular Data? An Evaluation Framework for Natural Language Representations of Text-to-SQL System Outputs

Jyotika Singh, Weiyi Sun, Amit Agarwal et al.

2025 EMNLP

Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation

Ziling Cheng, Meng Cao, Leila Pishdad et al.

2025 EMNLP

Can LLMs simulate the same correct solutions to free-response math problems as real students?

Yuya Asano, Diane Litman, Erin Walker

2025 EMNLP

Can LLMs Truly Plan? A Comprehensive Evaluation of Planning Capabilities

Gayeon Jung, HyeonSeok Lim, Minjun Kim et al.

2025 EMNLP

Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization

Zhengzhao Lai, Youbin Zheng, Zhenyang Cai et al.

2025 EMNLP

Can Multiple Responses from an LLM Reveal the Sources of Its Uncertainty?

Yang Nan, Pengfei He, Ravi Tandon et al.

2025 EMNLP

Can Out-of-Distribution Evaluations Uncover Reliance on Prediction Shortcuts? A Case Study in Question Answering

Michal Štefánik, Timothee Mickus, Michal Spiegel et al.

2025 EMNLP

Can Prompts Rewind Time for LLMs? Evaluating the Effectiveness of Prompted Knowledge Cutoffs

Xin Gao, Ruiyi Zhang, Daniel Du et al.

2025 EMNLP

Can Role Vectors Affect LLM Behaviour?

Daniele Potertì, Andrea Seveso, Fabio Mercorio

2025 EMNLP

Can Vision-Language Models Infer Speaker’s Ignorance? The Role of Visual and Linguistic Cues

Ye-eun Cho, Yunho Maeng

2025 EMNLP

Can Vision-Language Models Solve Visual Math Equations?

Monjoy Narayan Choudhury, Junling Wang, Yifan Hou et al.

2025 EMNLP

Can VLMs Recall Factual Associations From Visual References?

Dhananjay Ashok, Ashutosh Chaubey, Hirona Jacqueline Arai et al.

2025 EMNLP

Can We Edit LLMs for Long-Tail Biomedical Knowledge?

Xinhao Yi, Jake Lever, Kevin Bryson et al.

2025 EMNLP

Can We Steer Reasoning Direction by Thinking Intervention?

Xingsheng Zhang, Luxi Xing, Chen Zhang et al.

2025 EMNLP

Can you SPLICE it together? A Human Curated Benchmark for Probing Visual Reasoning in VLMs

Mohamad Ballout, Okajevo Wilfred, Seyedalireza Yaghoubi et al.

2025 EMNLP

Can You Trick the Grader? Adversarial Persuasion of LLM Judges

Yerin Hwang, Dongryeol Lee, Taegwan Kang et al.

2025 EMNLP

CAPE: Context-Aware Personality Evaluation Framework for Large Language Models

Jivnesh Sandhan, Fei Cheng, Tushar Sandhan et al.

2025 EMNLP

CAPSTONE: Composable Attribute‐Prompted Scene Translation for Zero‐Shot Vision–Language Reasoning

Md. Ismail Hossain, Shahriyar Zaman Ridoy, Moshiur Farazi et al.

2025 EMNLP

Captioning for Text-Video Retrieval via Dual-Group Direct Preference Optimization

Ji Soo Lee, Byungoh Ko, Jaewon Cho et al.

2025 EMNLP

Capturing Intra-Dialectal Variation in Qatari Arabic: A Corpus of Cultural and Gender Dimensions

Houda Bouamor, Sara Al-Emadi, Zeinab Ibrahim et al.

2025 EMNLP

Capturing Latent Modal Association For Multimodal Entity Alignment

Yongquan Ji, Jingwei Cheng, Fu Zhang et al.

2025 EMNLP
