Li Dong
99 papers · 2014–2026 · 20 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (19)
Keyword Pioneer
Renaissance Researcher (5)
Interdisciplinary Bridge
Hot Topic Early Bird
Keyword Trendsetter Combo (5)
Conference Loyalist (26)
Topic Pioneer
Dynamic Duo (79)
Grand Slam
Triple Crown
Deep Specialist (17)
Topic Evolution
Trend Setter
Unstoppable (12)
Conference Pioneer
Century Club (96)
The Questioner (2)
Keyword Collector (269)
Prolific Year (9)
Conferences
ACL (27)
EMNLP (12)
NIPS (11)
ICLR (11)
IJCNLP (9)
AAAI (5)
CVPR (5)
ICML (4)
AACL (2)
EACL (2)
IJCAI (2)
SEMEVAL (1)
NAACL (1)
JMLR (1)
INTERSPEECH (1)
ICCV (1)
ECCV (1)
CONLL (1)
COLING (1)
AISTATS (1)
Keywords
zero-shot learning (11)
cross-lingual transfer (9)
language model (9)
transfer learning (8)
cross-lingual language model (6)
large language model (6)
multimodal learning (6)
knowledge distillation (6)
representation learning (6)
transformer architecture (5)
question answering (5)
few-shot learning (5)
multilingual model (5)
text generation (4)
data augmentation (4)
in-context learning (4)
text classification (4)
named entity recognition (4)
language modeling (3)
adversarial attack (3)
Papers
Towards Stable and Effective Reinforcement Learning for Mixture-of-Experts
ACL 2026
Induce, Align, Predict: Zero-Shot Stance Detection via Cognitive Inductive Reasoning
AAAI 2026
Deferred Poisoning: Making the Model More Vulnerable via Hessian Singularization
AAAI 2026
Data Selection via Optimal Control for Language Models
ICLR 2025
Differential Transformer
ICLR 2025
BitNet: 1-bit Pre-training for Large Language Models
JMLR 2025
New User Event Prediction Through the Lens of Causal Inference
AISTATS 2025
Learning Robust Image Watermarking with Lossless Cover Recovery
ICCV 2025
Semi-Parametric Retrieval via Binary Bag-of-Tokens Index
ICLR 2025
Imagine While Reasoning in Space: Multimodal Visualization-of-Thought
ICML 2025
Self-Boosting Large Language Models with Synthetic Preference Data
ICLR 2025
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
ICLR 2024
Language Models as Inductive Reasoners
EACL 2024
BioCLIP: A Vision Foundation Model for the Tree of Life
CVPR 2024
EDDA: An Encoder-Decoder Data Augmentation Framework for Zero-Shot Stance Detection
COLING 2024
You Only Cache Once: Decoder-Decoder Architectures for Language Models
NIPS 2024
Mind's Eye of LLMs: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models
NIPS 2024
Multi-Head Mixture-of-Experts
NIPS 2024
Grounding Multimodal Large Language Models to the World
ICLR 2024
MiniLLM: Knowledge Distillation of Large Language Models
ICLR 2024
Pre-Training to Learn in Context
ACL 2023
Extensible Prompts for Language Models on Zero-shot Language Style Customization
NIPS 2023
Optimizing Prompts for Text-to-Image Generation
NIPS 2023
Language Is Not All You Need: Aligning Perception with Language Models
NIPS 2023
Augmenting Language Models with Long-Term Memory
NIPS 2023
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator
ACL 2023
A Length-Extrapolatable Transformer
ACL 2023
Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning
ACL 2023
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
ACL 2023
Image as a Foreign Language: BEiT Pretraining for Vision and Vision-Language Tasks
CVPR 2023
Generic-to-Specific Distillation of Masked Autoencoders
CVPR 2023
Non-Contrastive Learning Meets Language-Image Pre-Training
CVPR 2023
Visually-Augmented Language Modeling
ICLR 2023
Prototypical Calibration for Few-shot Learning of Language Models
ICLR 2023
Corrupted Image Modeling for Self-Supervised Visual Pre-Training
ICLR 2023
Semi-Offline Reinforcement Learning for Optimized Text Generation
ICML 2023
Magneto: A Foundation Transformer
ICML 2023
Fake the Real: Backdoor Attack on Deep Speech Classification via Voice Conversion
INTERSPEECH 2023
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
ACL 2022
CLIP Models are Few-Shot Learners: Empirical Studies on VQA and Visual Entailment
ACL 2022
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
NIPS 2022
On the Representation Collapse of Sparse Mixture of Experts
NIPS 2022
BEiT: BERT Pre-Training of Image Transformers
ICLR 2022
AdaPrompt: Adaptive Model Training for Prompt-based NLP
EMNLP 2022
CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation
EMNLP 2022
Swin Transformer V2: Scaling Up Capacity and Resolution
CVPR 2022
THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption
ACL 2022
Controllable Natural Language Generation with Contrastive Prefixes
ACL 2022
Knowledge Neurons in Pretrained Transformers
ACL 2022
StableMoE: Stable Routing Strategy for Mixture of Experts
ACL 2022
Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment
IJCNLP 2021
Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains
IJCNLP 2021
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
IJCNLP 2021
Memory-Efficient Differentiable Transformer Architecture Search
IJCNLP 2021
Consistency Regularization for Cross-Lingual Fine-Tuning
ACL 2021
Learning to Sample Replacements for ELECTRA Pre-Training
IJCNLP 2021
A Semi-supervised Multi-task Learning Approach to Classify Customer Contact Intents
IJCNLP 2021
Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders
EMNLP 2021
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs
EMNLP 2021
Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training
EMNLP 2021
Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task
EMNLP 2021
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training
NAACL 2021
Self-Attention Attribution: Interpreting Information Interactions Inside Transformer
AAAI 2021
Consistency Regularization for Cross-Lingual Fine-Tuning
IJCNLP 2021
Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment
ACL 2021
Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains
ACL 2021
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
ACL 2021
Memory-Efficient Differentiable Transformer Architecture Search
ACL 2021
Learning to Sample Replacements for ELECTRA Pre-Training
ACL 2021
A Semi-supervised Multi-task Learning Approach to Classify Customer Contact Intents
ACL 2021
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
NIPS 2020
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
ECCV 2020
Cross-Lingual Natural Language Generation via Pre-Training
AAAI 2020
Investigating Learning Dynamics of BERT Fine-Tuning
AACL 2020
Can Monolingual Pretrained Models Help Cross-Lingual Classification?
AACL 2020
Harvesting and Refining Question-Answer Pairs for Unsupervised QA
ACL 2020
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
ICML 2020
Data-to-text Generation with Entity Modeling
ACL 2019
Visualizing and Understanding the Effectiveness of BERT
IJCNLP 2019
Data-to-Text Generation with Content Selection and Planning
AAAI 2019
Learning a Unified Named Entity Tagger from Multiple Partially Annotated Corpora for Efficient Adaptation
CONLL 2019
Unified Language Model Pre-training for Natural Language Understanding and Generation
NIPS 2019
Learning to Ask Unanswerable Questions for Machine Reading Comprehension
ACL 2019
Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension
EMNLP 2019
Visualizing and Understanding the Effectiveness of BERT
EMNLP 2019
Coarse-to-Fine Decoding for Neural Semantic Parsing
ACL 2018
Confidence Modeling for Neural Semantic Parsing
ACL 2018
Learning to Paraphrase for Question Answering
EMNLP 2017
Learning to Generate Product Reviews from Attributes
EACL 2017
Unsupervised Word and Dependency Path Embeddings for Aspect Term Extraction
IJCAI 2016
Solving and Generating Chinese Character Riddles
EMNLP 2016
Long Short-Term Memory-Networks for Machine Reading
EMNLP 2016
Language to Logical Form with Neural Attention
ACL 2016
Question Answering over Freebase with Multi-Column Convolutional Neural Networks
ACL 2015
Splusplus: A Feature-Rich Two-stage Classifier for Sentiment Analysis of Tweets
SEMEVAL 2015
Question Answering over Freebase with Multi-Column Convolutional Neural Networks
IJCNLP 2015
A Hybrid Neural Model for Type Classification of Entity Mentions
IJCAI 2015
A Joint Segmentation and Classification Framework for Sentiment Analysis
EMNLP 2014
Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification
ACL 2014