conftrace_

Cunxiang Wang

30 papers · 2019–2026 · 9 conferences · across top CS/AI conferences

Achievements

🏃 Academic Marathon (6) 🌍 Conference Polyglot (8) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird
🐝 Cross-Pollinator (11) 🤝 Dynamic Duo (16) 🔬 Deep Specialist (10) 🧬 Topic Evolution 🏆 Keyword Champion (2) 🗃️ Keyword Collector (102) 💎 Century Club (25) ⚡ Prolific Year (6) ❓ The Questioner (6)

Conferences

ACL (13) EMNLP (5) ICLR (3) COLING (2) NAACL (2) NIPS (2) AAAI (1) IJCNLP (1) SEMEVAL (1)

Research topics

Papers

IF-RewardBench: Benchmarking Judge Models for Instruction-Following Evaluation ACL 2026
UDA: Unsupervised Debiasing Alignment for Pair-wise LLM-as-a-Judge AAAI 2026
HoWToBench: Holistic Evaluation for LLM’s Capability in Human-level Writing using Tree of Writing ACL 2026
Beyond Literal Mapping: Benchmarking and Improving Non-Literal Translation Evaluation ACL 2026
IF-CRITIC: Towards a Fine-Grained LLM Critic for Instruction-Following Evaluation ACL 2026
HPSS: Heuristic Prompting Strategy Search for LLM Evaluators ACL 2025
Training Language Model to Critique for Better Refinement ACL 2025
R3: “This is My SQL, Are You With Me?” A Consensus-Based Multi-Agent System for Text-to-SQL Tasks ACL 2025
How Likely Do LLMs with CoT Mimic Human Reasoning? COLING 2025
CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search NAACL 2025
Unlocking Recursive Thinking of LLMs: Alignment via Refinement ACL 2025
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens ICLR 2025
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models ICLR 2025
LongSafety: Evaluating Long-Context Safety of Large Language Models ACL 2025
Self-DC: When to Reason and When to Act? Self Divide-and-Conquer for Compositional Unknown Questions NAACL 2025
Knowledge Conflicts for LLMs: A Survey EMNLP 2024
RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation NIPS 2024
SHIELD: Evaluation and Defense Strategies for Copyright Compliance in LLM Text Generation EMNLP 2024
Nash CoT: Multi-Path Inference with Preference Equilibrium EMNLP 2024
LONG2RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall EMNLP 2024
PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization ICLR 2024
Exploiting Abstract Meaning Representation for Open-Domain Question Answering ACL 2023
TRAMS: Training-free Memory Selection for Long-range Language Modeling EMNLP 2023
Evaluating Open-QA Evaluation NIPS 2023
RFiD: Towards Rational Fusion-in-Decoder for Open-Domain Question Answering ACL 2023
Can Generative Pre-trained Language Models Serve As Knowledge Bases for Closed-book QA? ACL 2021
Can Generative Pre-trained Language Models Serve As Knowledge Bases for Closed-book QA? IJCNLP 2021
SemEval-2020 Task 4: Commonsense Validation and Explanation COLING 2020
SemEval-2020 Task 4: Commonsense Validation and Explanation SEMEVAL 2020
Does it Make Sense? And Why? A Pilot Study for Sense Making and Explanation ACL 2019