Chunyuan Li
85 papers · 2014–2025 · 13 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Renaissance Researcher (6)
Hot Topic Early Bird
Interdisciplinary Bridge
Academic Marathon (11)
Taxonomy Completionist (21)
Keyword Trendsetter Combo (5)
Grand Slam
Triple Crown
Dynamic Duo (33)
Mega-Team (71)
Topic Pioneer
Deep Specialist (18)
Topic Evolution
Keyword Champion (2)
Keyword Collector (314)
Century Club (85)
Trend Setter
Unstoppable (12)
Prolific Year (9)
Conference Pioneer
Conferences
NIPS (16)
CVPR (14)
EMNLP (11)
ICLR (8)
ACL (6)
ECCV (6)
AAAI (5)
AISTATS (5)
ICML (5)
IJCNLP (3)
NAACL (3)
ICCV (2)
IJCAI (1)
Keywords
zero-shot learning (10)
few-shot learning (9)
variational autoencoder (8)
transfer learning (8)
multimodal learning (8)
vision-language model (7)
object detection (7)
generative adversarial network (5)
pre-trained language model (5)
semantic segmentation (5)
adversarial learning (4)
text generation (4)
instruction following (4)
vision transformer (4)
semi-supervised learning (4)
image classification (4)
language modeling (4)
image generation (4)
convolutional neural network (4)
large multimodal model (4)
Papers
LLaVA-Critic: Learning to Evaluate Multimodal Models
CVPR 2025
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward
NAACL 2025
MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines
ICLR 2025
LLaVA-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
ICLR 2025
Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning
ICLR 2025
Graphic Design with Large Multimodal Model
AAAI 2025
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding
ICLR 2025
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models
NAACL 2025
Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
NIPS 2024
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
ECCV 2024
Segment and Recognize Anything at Any Granularity
ECCV 2024
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
ICLR 2024
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
ECCV 2024
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
ECCV 2024
Improved Baselines with Visual Instruction Tuning
CVPR 2024
Visual In-Context Prompting
CVPR 2024
Aligning Large Multimodal Models with Factually Augmented RLHF
ACL 2024
Position: TrustLLM: Trustworthiness in Large Language Models
ICML 2024
Visual Instruction Tuning
NIPS 2023
GLIGEN: Open-Set Grounded Text-to-Image Generation
CVPR 2023
Learning Customized Visual Models With Retrieval-Augmented Knowledge
CVPR 2023
A Simple Framework for Open-Vocabulary Segmentation and Detection
ICCV 2023
LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day
NIPS 2023
Generalized Decoding for Pixel, Image, and Language
CVPR 2023
Large Language Models are Visual Reasoning Coordinators
NIPS 2023
Scaling Vision-Language Models with Sparse Mixture of Experts
EMNLP 2023
Parameter-Efficient Model Adaptation for Vision Transformers
AAAI 2023
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models
NIPS 2022
Focal Modulation Networks
NIPS 2022
K-LITE: Learning Transferable Visual Models with External Knowledge
NIPS 2022
Grounded Language-Image Pre-Training
CVPR 2022
RegionCLIP: Region-Based Language-Image Pretraining
CVPR 2022
Towards Language-Free Training for Text-to-Image Generation
CVPR 2022
Unified Contrastive Learning in Image-Text-Label Space
CVPR 2022
Efficient Self-supervised Vision Transformers for Representation Learning
ICLR 2022
RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems
ACL 2021
Focal Attention for Long-Range Interactions in Vision Transformers
NIPS 2021
RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems
IJCNLP 2021
Exploring Robustness of Unsupervised Domain Adaptation in Semantic Segmentation
ICCV 2021
Rethinking Sentiment Style Transfer
EMNLP 2021
Few-Shot Named Entity Recognition: An Empirical Baseline Study
EMNLP 2021
Hierarchical Graph Capsule Network
AAAI 2021
Partition-Guided GANs
CVPR 2021
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
ECCV 2020
Structure-Aware Human-Action Generation
ECCV 2020
Complementary Auxiliary Classifiers for Label-Conditional Text Generation
AAAI 2020
Repulsive Attention: Rethinking Multi-head Attention as Bayesian Inference
EMNLP 2020
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
EMNLP 2020
POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training
EMNLP 2020
Improving Text Generation with Student-Forcing Optimal Transport
EMNLP 2020
Few-shot Natural Language Generation for Task-Oriented Dialog
EMNLP 2020
Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
ICLR 2020
Feature Quantization Improves GAN Training
ICML 2020
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-Training
CVPR 2020
Robust Navigation with Language Pretraining and Stochastic Sampling
EMNLP 2019
Robust Navigation with Language Pretraining and Stochastic Sampling
IJCNLP 2019
Adversarial Learning of a Sampler Based on an Unnormalized Distribution
AISTATS 2019
Communication-Efficient Stochastic Gradient MCMC for Neural Networks
AAAI 2019
Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing
NAACL 2019
Implicit Deep Latent Variable Models for Text Generation
IJCNLP 2019
Twin Auxiliary Classifiers GAN
NIPS 2019
DoubleTransfer at MEDIQA 2019: Multi-Source Transfer Learning for Natural Language Understanding in the Medical Domain
ACL 2019
Implicit Deep Latent Variable Models for Text Generation
EMNLP 2019
Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms
ACL 2018
Measuring the Intrinsic Dimension of Objective Landscapes
ICLR 2018
Adversarial Time-to-Event Modeling
ICML 2018
Continuous-Time Flows for Efficient Inference and Density Estimation
ICML 2018
Joint Embedding of Words and Labels for Text Classification
ACL 2018
Policy Optimization as Wasserstein Gradient Flows
ICML 2018
Learning Structural Weight Uncertainty for Sequential Decision-Making
AISTATS 2018
Symmetric Variational Autoencoder and Connections to Adversarial Learning
AISTATS 2018
ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching
NIPS 2017
Adversarial Symmetric Variational Autoencoder
NIPS 2017
Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling
ACL 2017
Learning Generic Sentence Representations Using Convolutional Neural Networks
EMNLP 2017
VAE Learning via Stein Variational Gradient Descent
NIPS 2017
Triangle Generative Adversarial Networks
NIPS 2017
A Deep Generative Deconvolutional Image Model
AISTATS 2016
Bridging the Gap between Stochastic Gradient MCMC and Stochastic Optimization
AISTATS 2016
Learning Weight Uncertainty With Stochastic Gradient MCMC for Shape Classification
CVPR 2016
Bayesian Dictionary Learning with Gaussian Processes and Sigmoid Belief Networks
IJCAI 2016
Stochastic Gradient MCMC with Stale Gradients
NIPS 2016
Variational Autoencoder for Deep Learning of Images, Labels and Captions
NIPS 2016
Deep Temporal Sigmoid Belief Networks for Sequence Modeling
NIPS 2015
Persistence-based Structural Recognition
CVPR 2014