Cunxiang Wang

30 papers · 2019–2026 · 9 top CS/AI conferences

Achievements

🏃 Academic Marathon (6) 🌍 Conference Polyglot (8) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird

🐝 Cross-Pollinator (11) 🤝 Dynamic Duo (16) 🔬 Deep Specialist (10) 🧬 Topic Evolution 🏆 Keyword Champion (2) 🗃️ Keyword Collector (102) 💎 Century Club (25) ⚡ Prolific Year (6) ❓ The Questioner (6)

Conferences

ACL (13) EMNLP (5) ICLR (3) COLING (2) NAACL (2) NIPS (2) AAAI (1) IJCNLP (1) SEMEVAL (1)

Top co-authors

Yue Zhang (16) Minlie Huang (7) Jie Tang (6) Hongning Wang (6) Zheng Zhang (4) Xiangkun Hu (4) Pei Ke (4) Xiaotao Gu (4) Bosi Wen (4) Qipeng Guo (3)

Research topics

Privacy (1)

Keywords

large language model (10) explanation generation (3) preference optimization (3) language model (3) chain-of-thought reasoning (2) pre-trained language model (2) commonsense validation (2) in-context learning (2) closed-book question answering (2) fact memorization (2) commonsense reasoning (2) critique generation (2) text generation (2) language modeling (2) knowledge base (2) open-domain question answering (2) information retrieval (1) attention mechanism (1) token selection (1) model robustness (1)

Papers

IF-RewardBench: Benchmarking Judge Models for Instruction-Following Evaluation ACL 2026
UDA: Unsupervised Debiasing Alignment for Pair-wise LLM-as-a-Judge AAAI 2026
HoWToBench: Holistic Evaluation for LLM’s Capability in Human-level Writing using Tree of Writing ACL 2026
Beyond Literal Mapping: Benchmarking and Improving Non-Literal Translation Evaluation ACL 2026
IF-CRITIC: Towards a Fine-Grained LLM Critic for Instruction-Following Evaluation ACL 2026
HPSS: Heuristic Prompting Strategy Search for LLM Evaluators ACL 2025
Training Language Model to Critique for Better Refinement ACL 2025
R3: “This is My SQL, Are You With Me?” A Consensus-Based Multi-Agent System for Text-to-SQL Tasks ACL 2025
How Likely Do LLMs with CoT Mimic Human Reasoning? COLING 2025
CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search NAACL 2025
Unlocking Recursive Thinking of LLMs: Alignment via Refinement ACL 2025
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens ICLR 2025
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models ICLR 2025
LongSafety: Evaluating Long-Context Safety of Large Language Models ACL 2025
Self-DC: When to Reason and When to Act? Self Divide-and-Conquer for Compositional Unknown Questions NAACL 2025
Knowledge Conflicts for LLMs: A Survey EMNLP 2024
RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation NIPS 2024
SHIELD: Evaluation and Defense Strategies for Copyright Compliance in LLM Text Generation EMNLP 2024
Nash CoT: Multi-Path Inference with Preference Equilibrium EMNLP 2024
LONG2RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall EMNLP 2024
PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization ICLR 2024
Exploiting Abstract Meaning Representation for Open-Domain Question Answering ACL 2023
TRAMS: Training-free Memory Selection for Long-range Language Modeling EMNLP 2023
Evaluating Open-QA Evaluation NIPS 2023
RFiD: Towards Rational Fusion-in-Decoder for Open-Domain Question Answering ACL 2023
Can Generative Pre-trained Language Models Serve As Knowledge Bases for Closed-book QA? ACL 2021
Can Generative Pre-trained Language Models Serve As Knowledge Bases for Closed-book QA? IJCNLP 2021
SemEval-2020 Task 4: Commonsense Validation and Explanation COLING 2020
SemEval-2020 Task 4: Commonsense Validation and Explanation SEMEVAL 2020
Does it Make Sense? And Why? A Pilot Study for Sense Making and Explanation ACL 2019