Xiaodong He
122 papers · 2008–2025 · 14 conferences · across top CS/AI conferences
Achievements
Academic Marathon (17)
Keyword Pioneer
Interdisciplinary Bridge
Conference Polyglot (14)
Cross-Pollinator (13)
Hot Topic Early Bird
Keyword Trendsetter Combo (6)
Conference Loyalist (23)
Grand Slam
Keyword Champion
Deep Specialist (16)
Topic Evolution
Dynamic Duo (32)
Trend Setter
Prolific Year (15)
Unstoppable (15)
Century Club (122)
Conference Pioneer
Keyword Collector (434)
Conferences
EMNLP (23)
ACL (19)
NAACL (19)
CVPR (15)
INTERSPEECH (11)
AAAI (10)
COLING (6)
IJCAI (6)
IJCNLP (4)
NIPS (4)
CONLL (2)
ECCV (1)
ICLR (1)
ICML (1)
Keywords
attention mechanism (14)
text generation (10)
multimodal learning (8)
image captioning (7)
neural network (6)
contrastive learning (6)
representation learning (6)
multi-task learning (5)
semantic parsing (5)
abstractive summarization (5)
pre-trained language model (5)
visual question answering (4)
transfer learning (4)
dialogue system (4)
reinforcement learning (4)
few-shot learning (4)
model compression (4)
self-supervised learning (3)
text-to-image generation (3)
machine reading comprehension (3)
Papers
Scaling Down Text Encoders of Text-to-Image Diffusion Models
CVPR 2025
Comet: Dialog Context Fusion Mechanism for End-to-End Task-Oriented Dialog with Multi-task Learning
COLING 2025
HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation
CVPR 2025
Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld
CVPR 2024
POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning
CVPR 2024
MuEP: A Multimodal Benchmark for Embodied Planning with Foundation Models
IJCAI 2024
MaskedSpeech: Context-aware Speech Synthesis with Masking Strategy
INTERSPEECH 2023
Mars: Modeling Context & State Representations with Contrastive Learning for End-to-End Task-Oriented Dialog
ACL 2023
AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets
ACL 2023
MoNET: Tackle State Momentum via Noise-Enhanced Training for Dialogue State Tracking
ACL 2023
OTF: Optimal Transport based Fusion of Supervised and Self-Supervised Learning Models for Automatic Speech Recognition
INTERSPEECH 2023
DiffusEmp: A Diffusion Model-Based Framework with Multi-Grained Control for Empathetic Response Generation
ACL 2023
Dialog-Post: Multi-Level Self-Supervised Objectives and Hierarchical Model for Dialogue Post-Training
ACL 2023
MNER-QG: An End-to-End MRC Framework for Multimodal Named Entity Recognition with Query Grounding
AAAI 2023
SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation
ICML 2023
Leveraging Label Information for Multimodal Emotion Recognition
INTERSPEECH 2023
Composable Text Controls in Latent Space with ODEs
EMNLP 2023
Correctable-DST: Mitigating Historical Context Mismatch between Training and Inference for Improved Dialogue State Tracking
EMNLP 2022
UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation
EMNLP 2022
JDDC 2.1: A Multimodal Chinese Dialogue Dataset with Joint Tasks of Query Rewriting, Response Generation, Discourse Parsing, and Summarization
EMNLP 2022
P3LM: Probabilistically Permuted Prophet Language Modeling for Generative Pre-Training
EMNLP 2022
MuGER2: Multi-Granularity Evidence Retrieval and Reasoning for Hybrid Question Answering
EMNLP 2022
SimCTC: A Simple Contrast Learning Method of Text Clustering (Student Abstract)
AAAI 2022
A Multi-Factor Classification Framework for Completing Users' Fuzzy Queries (Student Abstract)
AAAI 2022
LUNA: Learning Slot-Turn Alignment for Dialogue State Tracking
NAACL 2022
Don't Take It Literally: An Edit-Invariant Sequence Loss for Text Generation
NAACL 2022
OPERA: Operation-Pivoted Discrete Reasoning over Text
NAACL 2022
Label Anchored Contrastive Learning for Language Understanding
NAACL 2022
BORT: Back and Denoising Reconstruction for End-to-End Task-Oriented Dialog
NAACL 2022
Fine- and Coarse-Granularity Hybrid Self-Attention for Efficient BERT
ACL 2022
Tracking Satisfaction States for Customer Satisfaction Prediction in E-commerce Service Chatbots
COLING 2022
Few-Shot Table Understanding: A Benchmark Dataset and Pre-Training Baseline
COLING 2022
Cross-modal Transfer Learning via Multi-grained Alignment for End-to-End Spoken Language Understanding
INTERSPEECH 2022
SCaLa: Supervised Contrastive Learning for End-to-End Speech Recognition
INTERSPEECH 2022
Learning to Generate Poetic Chinese Landscape Painting with Calligraphy
IJCAI 2022
PRINCE: Prefix-Masked Decoding for Knowledge Enhanced Sequence-to-Sequence Pre-Training
EMNLP 2022
RevCore: Review-Augmented Conversational Recommendation
ACL 2021
Graph Ensemble Learning over Multiple Dependency Trees for Aspect-level Sentiment Classification
NAACL 2021
RoR: Read-over-Read for Long Document Machine Reading Comprehension
EMNLP 2021
RevCore: Review-Augmented Conversational Recommendation
IJCNLP 2021
SGG: Learning to Select, Guide, and Generate for Keyphrase Generation
NAACL 2021
Selective Attention Based Graph Convolutional Networks for Aspect-Level Sentiment Classification
NAACL 2021
Learn to Copy from the Copying History: Correlational Copy Network for Abstractive Summarization
EMNLP 2021
K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce
EMNLP 2021
Learning to Decouple Relations: Few-Shot Relation Classification with Entity-Guided Attention and Confusion-Aware Training
COLING 2020
Enhancing Automated Essay Scoring Performance via Fine-tuning Pre-trained Language Models with Combination of Regression and Ranking
EMNLP 2020
Sound Event Localization and Detection Based on Multiple DOA Beamforming and Multi-Task Learning
INTERSPEECH 2020
The JD AI Speaker Verification System for the FFSVC 2020 Challenge
INTERSPEECH 2020
Efficient WaveGlow: An Improved WaveGlow Vocoder with Enhanced Speed
INTERSPEECH 2020
Group Contextual Encoding for 3D Point Clouds
NIPS 2020
Select, Answer and Explain: Interpretable Multi-Hop Reading Comprehension over Multiple Documents
AAAI 2020
Keywords-Guided Abstractive Sentence Summarization
AAAI 2020
Aspect-Aware Multimodal Summarization for Chinese E-Commerce Products
AAAI 2020
Zero-Shot Text-to-SQL Learning with Auxiliary Task
AAAI 2020
Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding
ACL 2020
Self-Attention Guided Copy Mechanism for Abstractive Summarization
ACL 2020
Multimodal Sentence Summarization via Multimodal Selective Encoding
COLING 2020
On the Faithfulness for E-commerce Product Summarization
COLING 2020
Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product
EMNLP 2020
Speaker Diarization with Lexical Information
INTERSPEECH 2019
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations
NIPS 2019
Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs
ACL 2019
Relation Module for Non-Answerable Predictions on Reading Comprehension
CONLL 2019
Mappa Mundi: An Interactive Artistic Mind Map Generator with Artificial Imagination
IJCAI 2019
Knowledgeable Storyteller: A Commonsense-Driven Generative Model for Visual Storytelling
IJCAI 2019
Collaborative Learning of Semi-Supervised Segmentation and Classification for Medical Images
CVPR 2019
Object-Driven Text-To-Image Synthesis via Adversarial Training
CVPR 2019
Attentive Tensor Product Learning
AAAI 2019
Discrete Trust-aware Matrix Factorization for Fast Recommendation
IJCAI 2019
Dynamic Item Block and Prediction Enhancing Block for Sequential Recommendation
IJCAI 2019
Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation
AAAI 2019
End-to-End Structure-Aware Convolutional Networks for Knowledge Base Completion
AAAI 2019
Multi-Stride Self-Attention for Speech Recognition
INTERSPEECH 2019
Direct-Path Signal Cross-Correlation Estimation for Sound Source Localization in Reverberation
INTERSPEECH 2019
AttnGAN: Fine-Grained Text to Image Generation With Attentional Generative Adversarial Networks
CVPR 2018
Deep Reinforcement Learning for NLP
ACL 2018
Tips and Tricks for Visual Question Answering: Learnings From the 2017 Challenge
CVPR 2018
CleanNet: Transfer Learning for Scalable Image Classifier Training With Label Noise
CVPR 2018
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
CVPR 2018
Stacked Cross Attention for Image-Text Matching
ECCV 2018
Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations
EMNLP 2018
On the Discrimination-Generalization Tradeoff in GANs
ICLR 2018
Discourse-Aware Neural Rewards for Coherent Text Generation
NAACL 2018
Tensor Product Generation Networks for Deep NLP Modeling
NAACL 2018
Deep Communicating Agents for Abstractive Summarization
NAACL 2018
Natural Language to Structured Query Generation via Meta-Learning
NAACL 2018
Deep Learning With Low Precision by Half-Wave Gaussian Quantization
CVPR 2017
Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension
EMNLP 2017
Learning Generic Sentence Representations Using Convolutional Neural Networks
EMNLP 2017
Adversarial Ranking for Language Generation
NIPS 2017
StyleNet: Generating Attractive Visual Captions With Styles
CVPR 2017
Semantic Compositional Networks for Visual Captioning
CVPR 2017
Character-Level Question Answering with Attention
EMNLP 2016
Stacked Attention Networks for Image Question Answering
CVPR 2016
Generating Natural Questions About an Image
ACL 2016
A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories
NAACL 2016
Visual Storytelling
NAACL 2016
Hierarchical Attention Networks for Document Classification
NAACL 2016
Deep Reinforcement Learning with a Natural Language Action Space
ACL 2016
Bi-directional Attention with Agreement for Dependency Parsing
EMNLP 2016
Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads
EMNLP 2016
Language Models for Image Captioning: The Quirks and What Works
IJCNLP 2015
End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture
NIPS 2015
From Captions to Visual Concepts and Back
CVPR 2015
Deep Learning and Continuous Representations for Natural Language Processing
NAACL 2015
Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base
ACL 2015
Language Models for Image Captioning: The Quirks and What Works
ACL 2015
Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base
IJCNLP 2015
Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval
NAACL 2015
Semantic Parsing for Single-Relation Question Answering
ACL 2014
Modeling Interestingness with Deep Neural Networks
EMNLP 2014
Learning Continuous Phrase Representations for Translation Modeling
ACL 2014
Training MRF-Based Phrase Translation Models using Gradient Ascent
NAACL 2013
Maximum Expected BLEU Training of Phrase and Lexicon Translation Models
ACL 2012
Learning Lexicon Models from Search Logs for Query Expansion
CONLL 2012
Learning Lexicon Models from Search Logs for Query Expansion
EMNLP 2012
Domain Adaptation via Pseudo In-Domain Data Selection
EMNLP 2011
Joint Optimization for Machine Translation System Combination
EMNLP 2009
Using N-gram based Features for Machine Translation System Combination
NAACL 2009
Incremental HMM Alignment for MT System Combination
ACL 2009
Incremental HMM Alignment for MT System Combination
IJCNLP 2009
Indirect-HMM-based Hypothesis Alignment for Combining Outputs from Machine Translation Systems
EMNLP 2008