Papers - Conftrace

Up to Par? MT Systems Take a Shot at Sports Terminology

Einar Sigurdsson, Magnús Magnússon, Atli Jasonarson et al.

2025 EMNLP

UrduFactCheck: An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and Benchmarking

Sarfraz Ahmad, Hasan Iqbal, Momina Ahsan et al.

2025 EMNLP

URO-Bench: Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models

Ruiqi Yan, Xiquan Li, Wenxi Chen et al.

2025 EMNLP

Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation

Jan Cegin, Branislav Pecher, Jakub Simko et al.

2025 EMNLP

User-Centric Design Paradigms for Trust and Control in Human-LLM-Interactions: A Survey

Milena Belosevic

2025 EMNLP

User Feedback in Human-LLM Dialogues: A Lens to Understand Users But Noisy as a Learning Signal

Yuhan Liu, Michael JQ Zhang, Eunsol Choi

2025 EMNLP

Using Encipherment to Isolate Conditions for the Successful Fine-tuning of Massively Multilingual Translation Models

Carter Louchheim, Denis Sotnichenko, Yukina Yamaguchi et al.

2025 EMNLP

Using tournaments to calculate AUROC for zero-shot classification with LLMs

WonJin Yoon, Ian Bulovic, Timothy A. Miller

2025 EMNLP

Utility-Focused LLM Annotation for Retrieval and Retrieval-Augmented Generation

Hengran Zhang, Minghao Tang, Keping Bi et al.

2025 EMNLP

UTMath: A Benchmark for Math Evaluation with Unit Test

Bo Yang, Qingping Yang, Yingwei Ma et al.

2025 EMNLP

UvA-MT at WMT25 Evaluation Task: LLM Uncertainty as a Proxy for Translation Quality

Di Wu, Christof Monz

2025 EMNLP

UvA-MT’s Participation in the WMT25 General Translation Shared Task

Di Wu, Yan Meng, Maya Konstantinovna Nachesa et al.

2025 EMNLP

Validate Your Authority: Benchmarking LLMs on Multi-Label Precedent Treatment Classification

M. Mikail Demir, M. Abdullah Canbaz

2025 EMNLP

ValueCompass: A Framework for Measuring Contextual Value Alignment Between Human and LLMs

Hua Shen, Tiffany Knearem, Reshmi Ghosh et al.

2025 EMNLP

Value Profiles for Encoding Human Variation

Taylor Sorensen, Pushkar Mishra, Roma Patel et al.

2025 EMNLP

Variance Sensitivity Induces Attention Entropy Collapse and Instability in Transformers

Jonghyun Hong, Sungyoon Lee

2025 EMNLP

VC4VG: Optimizing Video Captions for Text-to-Video Generation

Yang Du, Zhuoran Lin, Kaiqiang Song et al.

2025 EMNLP

VCSearch: Bridging the Gap Between Well-Defined and Ill-Defined Problems in Mathematical Reasoning

Shi-Yu Tian, Zhi Zhou, Kun-Yang Yu et al.

2025 EMNLP

VehicleWorld: A Highly Integrated Multi-Device Environment for Intelligent Vehicle Interaction

Jie Yang, Jiajun Chen, Zhangyue Yin et al.

2025 EMNLP

VEHME: A Vision-Language Model For Evaluating Handwritten Mathematics Expressions

Thu Phuong Nguyen, Duc M. Nguyen, Hyotaek Jeon et al.

2025 EMNLP

VELA: An LLM-Hybrid-as-a-Judge Approach for Evaluating Long Image Captions

Kazuki Matsuda, Yuiga Wada, Shinnosuke Hirano et al.

2025 EMNLP

VENUS: A VLLM-driven Video Content Discovery System for Real Application Scenarios

Minyi Zhao, Yi Liu, Jianfeng Wen et al.

2025 EMNLP

VeriFact: Enhancing Long-Form Factuality Evaluation with Refined Fact Extraction and Reference Facts

Xin Liu, Lechen Zhang, Sheza Munir et al.

2025 EMNLP

VeriFastScore: Speeding up long-form factuality evaluation

Rishanth Rajendhran, Amir Zadeh, Matthew Sarte et al.

2025 EMNLP

VerifiAgent: a Unified Verification Agent in Language Model Reasoning

Jiuzhou Han, Wray Buntine, Ehsan Shareghi

2025 EMNLP