conftrace_

Wenhu Chen

90 papers · 2018–2026 · 15 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (13) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🐣 Hot Topic Early Bird 🐝 Cross-Pollinator (14) 🏠 Conference Loyalist (23) 🤝 Dynamic Duo (29) 👑 Triple Crown 🏆 Keyword Champion (2) 🏆 Grand Slam 👥 Mega-Team (32) 🔬 Deep Specialist (21) 🧬 Topic Evolution 📈 Trend Setter ❓ The Questioner 🗃️ Keyword Collector (368) 💎 Century Club (89) ⚡ Prolific Year (11) 🔥 Unstoppable (8)

Conferences

ACL (24) EMNLP (17) ICLR (12) NIPS (11) NAACL (6) CVPR (5) IJCNLP (3) AAAI (2) EACL (2) ICML (2) WACV (2) AACL (1) ECCV (1) ICCV (1) INTERSPEECH (1)

Top co-authors

William Yang Wang (29) Ge Zhang (16) Yu Su (13) Xiang Yue (12) Xifeng Yan (9) Liangming Pan (8) Yuansheng Ni (7) Zhiyu Chen (7) Jie Fu (7) Xueguang Ma (7)

Research topics

Reasoning (1) Education (1)

Keywords

large language model (16) multimodal learning (11) question answering (11) vision-language model (9) few-shot learning (8) benchmark evaluation (6) in-context learning (5) video generation (5) visual reasoning (4) video understanding (4) instruction tuning (4) multimodal reasoning (4) question-answer pair (4) pre-trained language model (4) zero-shot learning (4) retrieval-augmented generation (4) text generation (3) reinforcement learning (3) knowledge base (3) data augmentation (3)

Papers

BrowseComp-Plus: A Fair and Disentangled Evaluation Benchmark for Deep Search Agents ACL 2026
VISA: Retrieval Augmented Generation with Visual Source Attribution ACL 2025
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark ACL 2025
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale ACL 2025
ACECODER: Acing Coder RL via Automated Test-Case Synthesis ACL 2025
TheoremExplainAgent: Towards Video-based Multimodal Explanations for LLM Theorem Understanding ACL 2025
UniRAG: Universal Retrieval Augmentation for Large Vision Language Models NAACL 2025
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks ICLR 2025
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision ICLR 2025
T2V-Turbo-v2: Enhancing Video Model Post-Training through Data, Reward, and Conditional Guidance Design ICLR 2025
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks ICLR 2025
Harnessing Webpage UIs for Text-Rich Visual Understanding ICLR 2025
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers ICCV 2025
VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation EMNLP 2025
Unleashing the Reasoning Potential of LLMs by Critique Fine-Tuning on One Problem EMNLP 2025
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search EMNLP 2025
TC-Bench: Benchmarking Temporal Compositionality in Conditional Video Generation ACL 2025
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation CVPR 2025
MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training ICLR 2024
WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences NIPS 2024
T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback NIPS 2024
GenAI Arena: An Open Evaluation Platform for Generative Models NIPS 2024
MAmmoTH2: Scaling Instructions from the Web NIPS 2024
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark NIPS 2024
VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation ACL 2024
E2-LLM: Efficient and Extreme Length Extension of Large Language Models ACL 2024
ChatMusician: Understanding and Generating Music Intrinsically with LLM ACL 2024
Knowledge of Knowledge: Exploring Known-Unknowns Uncertainty with Large Language Models ACL 2024
SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval ACL 2024
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement ACL 2024
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI CVPR 2024
Instruct-Imagen: Image Generation with Multi-modal Instruction CVPR 2024
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers ECCV 2024
VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation EMNLP 2024
Unifying Multimodal Retrieval via Document Screenshot Embedding EMNLP 2024
Augmenting Black-box LLMs with Medical Textbooks for Biomedical Question Answering EMNLP 2024
Kosmos-G: Generating Images in Context with Multimodal Large Language Models ICLR 2024
ImagenHub: Standardizing the evaluation of conditional image generation models ICLR 2024
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning ICLR 2024
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation ICML 2024
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions ICML 2024
MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response NAACL 2024
Synthesizing Coherent Story With Auto-Regressive Latent Diffusion Models WACV 2024
QA Is the New KR: Question-Answer Pairs as Knowledge Bases AAAI 2023
Re-Imagen: Retrieval-Augmented Text-to-Image Generator ICLR 2023
Attacking Open-domain Question Answering by Injecting Misinformation AACL 2023
Large Language Models are few(1)-shot Table Reasoners EACL 2023
Few-shot In-context Learning on Knowledge Base Question Answering ACL 2023
Subject-driven Text-to-Image Generation via Apprenticeship Learning NIPS 2023
MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing NIPS 2023
MARBLE: Music Audio Representation Benchmark for Universal Evaluation NIPS 2023
Attacking Open-domain Question Answering by Injecting Misinformation IJCNLP 2023
Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering EACL 2023
EDIS: Entity-Driven Image Search over Multimodal Web Content EMNLP 2023
TheoremQA: A Theorem-driven Question Answering Dataset EMNLP 2023
On the Risk of Misinformation Pollution with Large Language Models EMNLP 2023
DePlot: One-shot visual language reasoning by plot-to-table translation ACL 2023
HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data ACL 2022
MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text EMNLP 2022
Controllable Dialogue Simulation with In-context Learning EMNLP 2022
Counterfactual Maximum Likelihood Estimation for Training Deep Networks NIPS 2021
Zero-shot Fact Verification by Claim Generation IJCNLP 2021
Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding EMNLP 2021
FinQA: A Dataset of Numerical Reasoning over Financial Data EMNLP 2021
Open Question Answering over Tables and Text ICLR 2021
A Systematic Investigation of KB-Text Embedding Alignment at Scale ACL 2021
Zero-shot Fact Verification by Claim Generation ACL 2021
Meta Module Network for Compositional Visual Reasoning WACV 2021
Unsupervised Multi-hop Question Answering by Question Generation NAACL 2021
Local Explanation of Dialogue Response Generation NIPS 2021
A Systematic Investigation of KB-Text Embedding Alignment at Scale IJCNLP 2021
Few-Shot NLG with Pre-Trained Language Model ACL 2020
Violin: A Large-Scale Dataset for Video-and-Language Inference CVPR 2020
TabFact: A Large-scale Dataset for Table-based Fact Verification ICLR 2020
Generative Adversarial Zero-Shot Relational Learning for Knowledge Graphs AAAI 2020
Logic2Text: High-Fidelity Natural Language Generation from Logical Forms EMNLP 2020
HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data EMNLP 2020
KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation EMNLP 2020
Logical Natural Language Generation from Open-Domain Tables ACL 2020
Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention ACL 2019
How Large a Vocabulary Does Text Classification Need? A Variational Approach to Vocabulary Selection NAACL 2019
Interpreting and Improving Deep Neural SLU Models via Vocabulary Importance INTERSPEECH 2019
Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting NIPS 2019
Global Textual Relation Embedding for Relational Understanding ACL 2019
XL-NBT: A Cross-lingual Neural Belief Tracking Framework EMNLP 2018
Triangular Architecture for Rare Language Translation ACL 2018
Variational Knowledge Graph Reasoning NAACL 2018
Generative Bridging Network for Neural Sequence Prediction NAACL 2018
Video Captioning via Hierarchical Reinforcement Learning CVPR 2018
No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling ACL 2018