Xingyuan Bu

13 papers · 2020–2026 · 7 top CS/AI conferences

Achievements

🌍 Conference Polyglot (6) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (12) 🧭 Keyword Pioneer 🏃 Academic Marathon (5) 👥 Mega-Team (32) 🗃️ Keyword Collector (80) ⚡ Prolific Year (6) 💎 Century Club (12) 📈 Trend Setter ❓ The Questioner

Conferences

ACL (5) CVPR (2) NAACL (2) EACL (1) EMNLP (1) IJCAI (1) NIPS (1)

Top co-authors

Jiaheng Liu (9) Wenbo Su (6) Bo Zheng (6) Yancheng He (6) Shilong Li (5) Zhaoxiang Zhang (4) Hui Huang (4) Hangyu Guo (4) Jie Liu (4) Xingwei Qu (3)

Keywords

large language model (8) benchmark evaluation (2) object detection (2) question answering (2) direct preference optimization (2) fine-grained evaluation (2) domain adaptation (1) chain-of-thought reasoning (1) transfer learning (1) preference learning (1) model selection (1) mathematical reasoning (1) dialogue generation (1) factuality evaluation (1) multi-label classification (1) data annotation (1) natural language understanding (1) llm evaluation (1) dialogue state tracking (1) reward modeling (1)

Papers

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values · EACL 2026
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? · ACL 2025
Seeing the Unseen: Composing Outliers for Compositional Zero-Shot Learning · IJCAI 2025
2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision · NAACL 2025
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models · ACL 2025
An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Model is not a General Substitute for GPT-4 · ACL 2025
DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models · NAACL 2025
RoleAgent: Building, Interacting, and Benchmarking High-quality Role-Playing Agents from Scripts · NIPS 2024
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues · ACL 2024
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models · ACL 2024
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models · EMNLP 2024
GAIA: A Transfer Learning System of Object Detection That Fits Your Needs · CVPR 2021
Large-Scale Object Detection in the Wild From Imbalanced Multi-Labels · CVPR 2020