Papers

A Structured Framework for Evaluating and Enhancing Interpretive Capabilities of Multimodal LLMs in Culturally Situated Tasks

Haorui Yu, Ramon Ruiz-Dolz, Qiufeng Yi

2025 EMNLP

Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text Modifications

Yiming Zeng, Wanhao Yu, Zexin Li et al.

2025 EMNLP

Dynamic Evaluation for Oversensitivity in LLMs

Sophia Xiao Pu, Sitao Cheng, Xin Eric Wang et al.

2025 EMNLP

Toward Inclusive Language Models: Sparsity-Driven Calibration for Systematic and Interpretable Mitigation of Social Biases in LLMs

Prommy Sultana Hossain, Chahat Raj, Ziwei Zhu et al.

2025 EMNLP

Advancing Reasoning with Off-the-Shelf LLMs: A Semantic Structure Perspective

Pengfei He, Zitao Li, Yue Xing et al.

2025 EMNLP

PromptKeeper: Safeguarding System Prompts for LLMs

Zhifeng Jiang, Zhihua Jin, Guoliang He

2025 EMNLP

Automating eHMI Action Design with LLMs for Automated Vehicle Communication

Ding Xia, Xinyue Gui, Fan Gao et al.

2025 EMNLP

VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation

Yuansheng Ni, Ping Nie, Kai Zou et al.

2025 EMNLP

Distill Visual Chart Reasoning Ability from LLMs to MLLMs

Wei He, Zhiheng Xi, Wanxu Zhao et al.

2025 EMNLP

SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs

Zhiqiang Liu, Enpei Niu, Yin Hua et al.

2025 EMNLP

From Implicit Exploration to Structured Reasoning: Guideline and Refinement for LLMs

Jiaxiang Chen, Zhuo Wang, Mingxi Zou et al.

2025 EMNLP

Recipe2Plan: Evaluating Planning Abilities of LLMs for Efficient and Feasible Multitasking with Time Constraints Between Actions

Zirui Wu, Xiao Liu, Jiayi Li et al.

2025 EMNLP

Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization

Zhengzhao Lai, Youbin Zheng, Zhenyang Cai et al.

2025 EMNLP

Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab World

Saeed Almheiri, Rania Elbadry, Mena Attia et al.

2025 EMNLP

LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation

Ming Zhang, Yujiong Shen, Zelin Li et al.

2025 EMNLP

GenPoE: Generative Passage-level Mixture of Experts for Knowledge Enhancement of LLMs

Xuebing Liu, Shanbao Qiao, Seung-Hoon Na

2025 EMNLP

X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Jailbreak Attacks without Compromising Usability

Xiaoya Lu, Dongrui Liu, Yi Yu et al.

2025 EMNLP

The “r” in “woman” stands for rights. Auditing LLMs in Uncovering Social Dynamics in Implicit Misogyny

Arianna Muti, Chris Emmery, Debora Nozza et al.

2025 EMNLP

LLMs are Privacy Erasable

Zipeng Ye, Wenjian Luo

2025 EMNLP

CANDY: Benchmarking LLMs’ Limitations and Assistive Potential in Chinese Misinformation Fact-Checking

Ruiling Guo, Xinwei Yang, Chen Huang et al.

2025 EMNLP

Do LLMs Know and Understand Domain Conceptual Knowledge?

Sijia Shen, Feiyan Jiang, Peiyan Wang et al.

2025 EMNLP

Can LLMs Find a Needle in a Haystack? A Look at Anomaly Detection Language Modeling

Leslie Barrett, Vikram Sunil Bajaj, Robert John Kingan

2025 EMNLP

Self-Correction Makes LLMs Better Parsers

Ziyan Zhang, Yang Hou, Chen Gong et al.

2025 EMNLP

Mitigating Gender Bias via Fostering Exploratory Thinking in LLMs

Kangda Wei, Hasnat Md Abdullah, Ruihong Huang

2025 EMNLP

PersonaGym: Evaluating Persona Agents and LLMs

Vinay Samuel, Henry Peng Zou, Yue Zhou et al.

2025 EMNLP