Xiaodong He

122 papers · 2008–2025 · 14 top CS/AI conferences

Achievements

🏃 Academic Marathon (17) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (14) 🐝 Cross-Pollinator (13) 🐣 Hot Topic Early Bird 🌟 Keyword Trendsetter Combo (6) 🏠 Conference Loyalist (23) 🏆 Grand Slam 🏆 Keyword Champion 🔬 Deep Specialist (16) 🧬 Topic Evolution 🤝 Dynamic Duo (32) 📈 Trend Setter ⚡ Prolific Year (15) 🔥 Unstoppable (15) 💎 Century Club (122) 🚀 Conference Pioneer 🗃️ Keyword Collector (434)

Conferences

EMNLP (23) ACL (19) NAACL (19) CVPR (15) INTERSPEECH (11) AAAI (10) COLING (6) IJCAI (6) IJCNLP (4) NIPS (4) CONLL (2) ECCV (1) ICLR (1) ICML (1)

Top co-authors

Youzheng Wu (32) Bowen Zhou (22) Jianfeng Gao (21) Junwei Bao (19) Li Deng (17) Meng Chen (11) Jing Huang (10) Haoran Li (10) Wen-tau Yih (7) Peng Yuan (6)

Keywords

attention mechanism (14) text generation (10) multimodal learning (8) image captioning (7) neural network (6) contrastive learning (6) representation learning (6) multi-task learning (5) semantic parsing (5) abstractive summarization (5) pre-trained language model (5) visual question answering (4) transfer learning (4) dialogue system (4) reinforcement learning (4) few-shot learning (4) model compression (4) self-supervised learning (3) text-to-image generation (3) machine reading comprehension (3)

Papers

Scaling Down Text Encoders of Text-to-Image Diffusion Models CVPR 2025
Comet: Dialog Context Fusion Mechanism for End-to-End Task-Oriented Dialog with Multi-task Learning COLING 2025
HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation CVPR 2025
Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld CVPR 2024
POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning CVPR 2024
MuEP: A Multimodal Benchmark for Embodied Planning with Foundation Models IJCAI 2024
MaskedSpeech: Context-aware Speech Synthesis with Masking Strategy INTERSPEECH 2023
Mars: Modeling Context & State Representations with Contrastive Learning for End-to-End Task-Oriented Dialog ACL 2023
AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets ACL 2023
MoNET: Tackle State Momentum via Noise-Enhanced Training for Dialogue State Tracking ACL 2023
OTF: Optimal Transport based Fusion of Supervised and Self-Supervised Learning Models for Automatic Speech Recognition INTERSPEECH 2023
DiffusEmp: A Diffusion Model-Based Framework with Multi-Grained Control for Empathetic Response Generation ACL 2023
Dialog-Post: Multi-Level Self-Supervised Objectives and Hierarchical Model for Dialogue Post-Training ACL 2023
MNER-QG: An End-to-End MRC Framework for Multimodal Named Entity Recognition with Query Grounding AAAI 2023
SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation ICML 2023
Leveraging Label Information for Multimodal Emotion Recognition INTERSPEECH 2023
Composable Text Controls in Latent Space with ODEs EMNLP 2023
Correctable-DST: Mitigating Historical Context Mismatch between Training and Inference for Improved Dialogue State Tracking EMNLP 2022
UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation EMNLP 2022
JDDC 2.1: A Multimodal Chinese Dialogue Dataset with Joint Tasks of Query Rewriting, Response Generation, Discourse Parsing, and Summarization EMNLP 2022
P3LM: Probabilistically Permuted Prophet Language Modeling for Generative Pre-Training EMNLP 2022
MuGER2: Multi-Granularity Evidence Retrieval and Reasoning for Hybrid Question Answering EMNLP 2022
SimCTC: A Simple Contrast Learning Method of Text Clustering (Student Abstract) AAAI 2022
A Multi-Factor Classification Framework for Completing Users’ Fuzzy Queries (Student Abstract) AAAI 2022
LUNA: Learning Slot-Turn Alignment for Dialogue State Tracking NAACL 2022
Don’t Take It Literally: An Edit-Invariant Sequence Loss for Text Generation NAACL 2022
OPERA: Operation-Pivoted Discrete Reasoning over Text NAACL 2022
Label Anchored Contrastive Learning for Language Understanding NAACL 2022
BORT: Back and Denoising Reconstruction for End-to-End Task-Oriented Dialog NAACL 2022
Fine- and Coarse-Granularity Hybrid Self-Attention for Efficient BERT ACL 2022
Tracking Satisfaction States for Customer Satisfaction Prediction in E-commerce Service Chatbots COLING 2022
Few-Shot Table Understanding: A Benchmark Dataset and Pre-Training Baseline COLING 2022
Cross-modal Transfer Learning via Multi-grained Alignment for End-to-End Spoken Language Understanding INTERSPEECH 2022
SCaLa: Supervised Contrastive Learning for End-to-End Speech Recognition INTERSPEECH 2022
Learning to Generate Poetic Chinese Landscape Painting with Calligraphy IJCAI 2022
PRINCE: Prefix-Masked Decoding for Knowledge Enhanced Sequence-to-Sequence Pre-Training EMNLP 2022
RevCore: Review-Augmented Conversational Recommendation ACL 2021
Graph Ensemble Learning over Multiple Dependency Trees for Aspect-level Sentiment Classification NAACL 2021
RoR: Read-over-Read for Long Document Machine Reading Comprehension EMNLP 2021
RevCore: Review-Augmented Conversational Recommendation IJCNLP 2021
SGG: Learning to Select, Guide, and Generate for Keyphrase Generation NAACL 2021
Selective Attention Based Graph Convolutional Networks for Aspect-Level Sentiment Classification NAACL 2021
Learn to Copy from the Copying History: Correlational Copy Network for Abstractive Summarization EMNLP 2021
K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce EMNLP 2021
Learning to Decouple Relations: Few-Shot Relation Classification with Entity-Guided Attention and Confusion-Aware Training COLING 2020
Enhancing Automated Essay Scoring Performance via Fine-tuning Pre-trained Language Models with Combination of Regression and Ranking EMNLP 2020
Sound Event Localization and Detection Based on Multiple DOA Beamforming and Multi-Task Learning INTERSPEECH 2020
The JD AI Speaker Verification System for the FFSVC 2020 Challenge INTERSPEECH 2020
Efficient WaveGlow: An Improved WaveGlow Vocoder with Enhanced Speed INTERSPEECH 2020
Group Contextual Encoding for 3D Point Clouds NIPS 2020
Select, Answer and Explain: Interpretable Multi-Hop Reading Comprehension over Multiple Documents AAAI 2020
Keywords-Guided Abstractive Sentence Summarization AAAI 2020
Aspect-Aware Multimodal Summarization for Chinese E-Commerce Products AAAI 2020
Zero-Shot Text-to-SQL Learning with Auxiliary Task AAAI 2020
Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding ACL 2020
Self-Attention Guided Copy Mechanism for Abstractive Summarization ACL 2020
Multimodal Sentence Summarization via Multimodal Selective Encoding COLING 2020
On the Faithfulness for E-commerce Product Summarization COLING 2020
Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product EMNLP 2020
Speaker Diarization with Lexical Information INTERSPEECH 2019
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations NIPS 2019
Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs ACL 2019
Relation Module for Non-Answerable Predictions on Reading Comprehension CONLL 2019
Mappa Mundi: An Interactive Artistic Mind Map Generator with Artificial Imagination IJCAI 2019
Knowledgeable Storyteller: A Commonsense-Driven Generative Model for Visual Storytelling IJCAI 2019
Collaborative Learning of Semi-Supervised Segmentation and Classification for Medical Images CVPR 2019
Object-Driven Text-To-Image Synthesis via Adversarial Training CVPR 2019
Attentive Tensor Product Learning AAAI 2019
Discrete Trust-aware Matrix Factorization for Fast Recommendation IJCAI 2019
Dynamic Item Block and Prediction Enhancing Block for Sequential Recommendation IJCAI 2019
Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation AAAI 2019
End-to-End Structure-Aware Convolutional Networks for Knowledge Base Completion AAAI 2019
Multi-Stride Self-Attention for Speech Recognition INTERSPEECH 2019
Direct-Path Signal Cross-Correlation Estimation for Sound Source Localization in Reverberation INTERSPEECH 2019
AttnGAN: Fine-Grained Text to Image Generation With Attentional Generative Adversarial Networks CVPR 2018
Deep Reinforcement Learning for NLP ACL 2018
Tips and Tricks for Visual Question Answering: Learnings From the 2017 Challenge CVPR 2018
CleanNet: Transfer Learning for Scalable Image Classifier Training With Label Noise CVPR 2018
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering CVPR 2018
Stacked Cross Attention for Image-Text Matching ECCV 2018
Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations EMNLP 2018
On the Discrimination-Generalization Tradeoff in GANs ICLR 2018
Discourse-Aware Neural Rewards for Coherent Text Generation NAACL 2018
Tensor Product Generation Networks for Deep NLP Modeling NAACL 2018
Deep Communicating Agents for Abstractive Summarization NAACL 2018
Natural Language to Structured Query Generation via Meta-Learning NAACL 2018
Deep Learning With Low Precision by Half-Wave Gaussian Quantization CVPR 2017
Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension EMNLP 2017
Learning Generic Sentence Representations Using Convolutional Neural Networks EMNLP 2017
Adversarial Ranking for Language Generation NIPS 2017
StyleNet: Generating Attractive Visual Captions With Styles CVPR 2017
Semantic Compositional Networks for Visual Captioning CVPR 2017
Character-Level Question Answering with Attention EMNLP 2016
Stacked Attention Networks for Image Question Answering CVPR 2016
Generating Natural Questions About an Image ACL 2016
A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories NAACL 2016
Visual Storytelling NAACL 2016
Hierarchical Attention Networks for Document Classification NAACL 2016
Deep Reinforcement Learning with a Natural Language Action Space ACL 2016
Bi-directional Attention with Agreement for Dependency Parsing EMNLP 2016
Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads EMNLP 2016
Language Models for Image Captioning: The Quirks and What Works IJCNLP 2015
End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture NIPS 2015
From Captions to Visual Concepts and Back CVPR 2015
Deep Learning and Continuous Representations for Natural Language Processing NAACL 2015
Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base ACL 2015
Language Models for Image Captioning: The Quirks and What Works ACL 2015
Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base IJCNLP 2015
Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval NAACL 2015
Semantic Parsing for Single-Relation Question Answering ACL 2014
Modeling Interestingness with Deep Neural Networks EMNLP 2014
Learning Continuous Phrase Representations for Translation Modeling ACL 2014
Training MRF-Based Phrase Translation Models using Gradient Ascent NAACL 2013
Maximum Expected BLEU Training of Phrase and Lexicon Translation Models ACL 2012
Learning Lexicon Models from Search Logs for Query Expansion CONLL 2012
Learning Lexicon Models from Search Logs for Query Expansion EMNLP 2012
Domain Adaptation via Pseudo In-Domain Data Selection EMNLP 2011
Joint Optimization for Machine Translation System Combination EMNLP 2009
Using N-gram based Features for Machine Translation System Combination NAACL 2009
Incremental HMM Alignment for MT System Combination ACL 2009
Incremental HMM Alignment for MT System Combination IJCNLP 2009
Indirect-HMM-based Hypothesis Alignment for Combining Outputs from Machine Translation Systems EMNLP 2008