Yuxia Wang

47 papers · 2020–2026 · 8 top CS/AI conferences

Achievements

🏃 Academic Marathon (5) 🌍 Conference Polyglot (7) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird

🐝 Cross-Pollinator (11) 🤝 Dynamic Duo (24) 👥 Mega-Team (35) 🔬 Deep Specialist (15) 🧬 Topic Evolution ❓ The Questioner (2) 🗃️ Keyword Collector (175) 💎 Century Club (40) ⚡ Prolific Year (5)

Conferences

ACL (20) EMNLP (10) EACL (5) NAACL (5) COLING (4) AACL (1) IJCNLP (1) SEMEVAL (1)

Top co-authors

Preslav Nakov (31) Minghan Wang (14) Iryna Gurevych (12) Jiahui Geng (10) Timothy Baldwin (10) Jonibek Mansurov (9) Artem Shelmanov (8) Zhuohan Xie (8) Tarek Mahmoud (7) Osama Mohammed Afzal (7)

Keywords

large language model (23) binary classification (7) machine-generated text detection (6) text classification (6) claim verification (5) low-resource language (5) benchmark evaluation (4) instruction tuning (4) multilingual detection (3) model safety (3) harmful content detection (3) chain-of-thought reasoning (3) evidence retrieval (3) automatic speech recognition (3) machine translation (3) multilingual nlp (3) data augmentation (3) safety evaluation (3) semantic textual similarity (3) multimodal learning (2)

Papers

AICD Bench: A Challenging Benchmark for AI-Generated Code Detection · EACL 2026
FAID: Fine-grained AI-generated Text Detection using Multi-task Auxiliary and Multi-level Contrastive Learning · EACL 2026
Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI · ACL 2026
Stereotype Bias in a Bilingual Setting: A Culturally Grounded Evaluation in Kazakhstan · ACL 2026
SAHM: A Benchmark for Arabic Financial and Shari’ah-Compliant Reasoning · ACL 2026
FinChain: A Symbolic Benchmark for Verifiable Chain-of-Thought Financial Reasoning · ACL 2026
HD-NDEs: Neural Differential Equations for Hallucination Detection in LLMs · ACL 2025
Explicit and Implicit Data Augmentation for Social Event Detection · ACL 2025
KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan · ACL 2025
UrduFactCheck: An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and Benchmarking · EMNLP 2025
UnsafeChain: Enhancing Reasoning Model Safety via Hard Cases · IJCNLP 2025
Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh · ACL 2025
VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration · ACL 2025
Arabic Dataset for LLM Safeguard Evaluation · NAACL 2025
Qorǵau: Evaluating Safety in Kazakh-Russian Bilingual Contexts · ACL 2025
Conversational SimulMT: Efficient Simultaneous Translation with Large Language Models · ACL 2025
UnsafeChain: Enhancing Reasoning Model Safety via Hard Cases · AACL 2025
OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs · COLING 2025
Loki: An Open-Source Tool for Fact Verification · COLING 2025
FIRE: Fact-checking with Iterative Retrieval and Verification · NAACL 2025
GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human · COLING 2025
Detection of Human and Machine-Authored Fake News in Urdu · ACL 2025
Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability · NAACL 2025
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection · EMNLP 2024
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection · ACL 2024
Demystifying Instruction Mixing for Fine-tuning Large Language Models · ACL 2024
A Chinese Dataset for Evaluating the Safeguards in Large Language Models · ACL 2024
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection · EACL 2024
Do-Not-Answer: Evaluating Safeguards in LLMs · EACL 2024
Rethinking STS and NLI in Large Language Models · EACL 2024
Factuality of Large Language Models: A Survey · EMNLP 2024
OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs · EMNLP 2024
Exploring the Potential of Multimodal LLM with Knowledge-Intensive Multimodal ASR · EMNLP 2024
Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers · EMNLP 2024
Can Machines Resonate with Humans? Evaluating the Emotional and Empathic Comprehension of LMs · EMNLP 2024
A Survey of Confidence Estimation and Calibration in Large Language Models · NAACL 2024
SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection · NAACL 2024
SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection · SEMEVAL 2024
The HW-TSC’s Speech to Speech Translation System for IWSLT 2022 Evaluation · ACL 2022
The HW-TSC’s Simultaneous Speech Translation System for IWSLT 2022 Evaluation · ACL 2022
Capture Human Disagreement Distributions by Calibrated Networks for Natural Language Inference · ACL 2022
The HW-TSC’s Offline Speech Translation System for IWSLT 2022 Evaluation · ACL 2022
Noisy Label Regularisation for Textual Regression · COLING 2022
HW-TSC’s Participation at WMT 2021 Quality Estimation Shared Task · EMNLP 2021
How Length Prediction Influence the Performance of Non-Autoregressive Translation? · EMNLP 2021
Learning from Unlabelled Data for Clinical Semantic Textual Similarity · EMNLP 2020
Evaluating the Utility of Model Configurations and Data Augmentation on Clinical Semantic Textual Similarity · ACL 2020