Yu Cheng
176 papers · 2013–2026 · 18 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Conference Polyglot (18)
Taxonomy Completionist (20)
Interdisciplinary Bridge
Academic Marathon (13)
Keyword Trendsetter Combo (3)
Conference Loyalist (20)
Grand Slam
Triple Crown
Deep Specialist (23)
Topic Evolution
Keyword Champion (2)
Dynamic Duo (34)
The Questioner (5)
Keyword Collector (633)
Century Club (168)
Conference Pioneer
Unstoppable (14)
Prolific Year (28)
Trend Setter
Conferences
ACL (24)
NIPS (20)
CVPR (20)
EMNLP (20)
AAAI (19)
ICML (17)
ECCV (10)
ICLR (9)
NAACL (8)
ICCV (8)
IJCAI (5)
IJCNLP (5)
COLT (3)
WACV (3)
COLING (2)
EACL (1)
AISTATS (1)
OSDI (1)
Research topics
Keywords
large language model (18)
model compression (15)
knowledge distillation (9)
multimodal learning (8)
representation learning (7)
video understanding (7)
mixture of expert (7)
vision transformer (6)
few-shot learning (5)
reinforcement learning (5)
generative adversarial network (5)
language model (5)
text generation (5)
vision-language model (5)
network pruning (4)
human pose estimation (4)
contrastive learning (4)
adversarial learning (4)
question answering (4)
non-convex optimization (4)
Papers
Less Is More: Vision Representation Compression for Efficient Video Generation with Large Language Models
AAAI 2026
One Refiner to Unlock Them All: Inference-Time Reasoning Elicitation via Reinforcement Query Refinement
ACL 2026
Enabling Agents to Communicate Entirely in Latent Space
ACL 2026
RFNNS: Robust Fixed Neural Network Steganography with Universal Text-to-Image Models
AAAI 2026
BREEN: Bridge Data-Efficient Encoder-Free Multimodal Learning with Learnable Queries
WACV 2026
Nirvana: A Specialized Generalist Model With Task-Aware Memory Mechanism
ACL 2026
Native Hybrid Attention for Efficient Sequence Modeling
ACL 2026
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models
ACL 2026
TransMamba: A Sequence-Level Hybrid Transformer-Mamba Language Model
AAAI 2026
Unveiling Attractor Cycles in Large Language Models: A Dynamical Systems View of Successive Paraphrasing
ACL 2025
Cooperative or Competitive? Understanding the Interaction between Attention Heads From A Game Theory Perspective
ACL 2025
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models
ACL 2025
SEE: Continual Fine-tuning with Sequential Ensemble of Experts
ACL 2025
LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models
ICCV 2025
Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation
ICML 2025
Occult: Optimizing Collaborative Communications across Experts for Accelerated Parallel MoE Training and Inference
ICML 2025
ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning
ICCV 2025
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
ICML 2025
Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark
ICML 2025
Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning
ICML 2025
Scaling Laws for Floating-Point Quantization Training
ICML 2025
OpenIAI-SNIO: A Systematic AR-Based Assembly Guidance System for Small-Scale, High-Density Industrial Components
IJCAI 2025
Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs
EMNLP 2025
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
EMNLP 2025
UltraIF: Advancing Instruction Following from the Wild
EMNLP 2025
Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework
EMNLP 2025
CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling
EMNLP 2025
Training LLMs to be Better Text Embedders through Bidirectional Reconstruction
EMNLP 2025
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
CVPR 2025
From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Calibration
CVPR 2025
LangBridge: Interpreting Image as a Combination of Language Embeddings
ICCV 2025
Modality-Specialized Synergizers for Interleaved Vision-Language Generalists
ICLR 2025
Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts
NAACL 2025
StickMotion: Generating 3D Human Motions by Drawing a Stickman
CVPR 2025
Continuous Speech Tokenizer in Text To Speech
NAACL 2025
PipeThreader: Software-Defined Pipelining for Efficient DNN Execution
OSDI 2025
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
COLING 2025
Weak to Strong Generalization for Large Language Models with Multi-capabilities
ICLR 2025
Diving into Self-Evolving Training for Multimodal Reasoning
ICML 2025
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback
ICML 2025
Liger: Linearizing Large Language Models to Gated Recurrent Structures
ICML 2025
Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints
ICCV 2025
$\texttt{MoE-RBench}$: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
ICML 2024
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
NIPS 2024
Sparse MoE with Language Guided Routing for Multilingual Machine Translation
ICLR 2024
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
ICLR 2024
$\texttt{ConflictBank}$: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLMs
NIPS 2024
LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models
ICML 2024
Unsupervised Domain Adaptative Temporal Sentence Localization with Mutual Information Maximization
AAAI 2024
Enhancing Low-Resource Relation Representations through Multi-View Decoupling
AAAI 2024
Multimodal Instruction Tuning with Conditional Mixture of LoRA
ACL 2024
Confidence is not Timeless: Modeling Temporal Validity for Rule-based Temporal Knowledge Graph Forecasting
ACL 2024
Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?
ACL 2024
Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models
ACL 2024
Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning
ACL 2024
Towards Robust Temporal Activity Localization Learning with Noisy Labels
COLING 2024
ProS: Facial Omni-Representation Learning via Prototype-Based Self-Distillation
WACV 2024
Reinforcement Learning with Token-level Feedback for Controllable Text Generation
NAACL 2024
SynSP: Synergy of Smoothness and Precision in Pose Sequences Refinement
CVPR 2024
Rethinking Weakly-supervised Video Temporal Grounding From a Game Perspective
ECCV 2024
Domain-Adaptive 2D Human Pose Estimation via Dual Teachers in Extremely Low-Light Conditions
ECCV 2024
Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning
ECCV 2024
Aggregating Quantitative Relative Judgments: From Social Choice to Ranking Prediction
NIPS 2024
On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion
NIPS 2024
MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution
NIPS 2024
SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information
EMNLP 2024
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-Training
EMNLP 2024
On the Universal Truthfulness Hyperplane Inside LLMs
EMNLP 2024
MoE-I2: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
EMNLP 2024
Unified Single-Stage Transformer Network for Efficient RGB-T Tracking
IJCAI 2024
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
NIPS 2023
DSFNet: Dual Space Fusion Network for Occlusion-Robust 3D Dense Face Alignment
CVPR 2023
You Are Catching My Attention: Are Vision Transformers Bad Learners Under Backdoor Attacks?
CVPR 2023
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models
ACL 2023
Hiding Data Helps: On the Benefits of Masking for Sparse Coding
ICML 2023
Hypotheses Tree Building for One-Shot Temporal Sentence Localization
AAAI 2023
Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis
AAAI 2023
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling
ICML 2023
Annotations Are Not All You Need: A Cross-modal Knowledge Transfer Network for Unsupervised Temporal Sentence Grounding
EMNLP 2023
Robust Matrix Sensing in the Semi-Random Model
NIPS 2023
Robust Second-Order Nonconvex Optimization and Its Application to Low Rank Matrix Sensing
NIPS 2023
Robustness Challenges in Model Distillation and Pruning for Natural Language Understanding
EACL 2023
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning
ICLR 2023
Local Byte Fusion for Neural Machine Translation
ACL 2023
Planning with Participation Constraints
AAAI 2022
The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy
CVPR 2022
Unsupervised Temporal Video Grounding with Deep Semantic Clustering
AAAI 2022
Outlier-Robust Sparse Estimation via Non-Convex Optimization
NIPS 2022
M³ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design
NIPS 2022
Point Cloud Domain Adaptation via Masked Local 3D Structure Prediction
ECCV 2022
DNA: Improving Few-Shot Transfer Learning with Low-Rank Decomposition and Alignment
ECCV 2022
Scalable Learning to Optimize: A Learned Optimizer Can Train Big Models
ECCV 2022
Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training
ECCV 2022
Memory-Guided Semantic Learning Network for Temporal Sentence Grounding
AAAI 2022
Playing Lottery Tickets with Vision and Language
AAAI 2022
RASAT: Integrating Relational Structures into Pretrained Seq2Seq Model for Text-to-SQL
EMNLP 2022
Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding
EMNLP 2022
Efficient Robust Training via Backward Smoothing
AAAI 2022
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models
ACL 2022
SemAttack: Natural Textual Attacks via Different Semantic Spaces
NAACL 2022
APo-VAE: Text Generation in Hyperbolic Space
NAACL 2021
Cluster-Former: Clustering-based Sparse Transformer for Question Answering
IJCNLP 2021
Classification with Few Tests through Self-Selection
AAAI 2021
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
ACL 2021
Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks
CVPR 2021
Fair for All: Best-effort Fairness Guarantees for Classification
AISTATS 2021
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective
ICLR 2021
Few-Shot Object Detection via Classification Refinement and Distractor Retreatment
CVPR 2021
Chasing Sparsity in Vision Transformers: An End-to-End Exploration
NIPS 2021
Meta Module Network for Compositional Visual Reasoning
WACV 2021
Context-Aware Biaffine Localizing Network for Temporal Sentence Grounding
CVPR 2021
Data-Efficient GAN Training Beyond (Just) Augmentations: A Lottery Ticket Perspective
NIPS 2021
Cluster-Former: Clustering-based Sparse Transformer for Question Answering
ACL 2021
UC2: Universal Cross-Lingual Cross-Modal Vision-and-Language Pre-Training
CVPR 2021
The Elastic Lottery Ticket Hypothesis
NIPS 2021
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
IJCNLP 2021
Automated Mechanism Design for Classification with Partial Verification
AAAI 2021
Robust Learning of Fixed-Structure Bayesian Networks in Nearly-Linear Time
ICLR 2021
Graph and Temporal Convolutional Networks for 3D Multi-person Pose Estimation in Monocular Videos
AAAI 2021
Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning
NAACL 2021
Graph Optimal Transport for Cross-Domain Alignment
ICML 2020
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
NIPS 2020
What Makes A Good Story? Designing Composite Rewards for Visual Storytelling
AAAI 2020
3D Human Pose Estimation Using Spatio-Temporal Networks with Explicit Occlusion Training
AAAI 2020
INSET: Sentence Infilling with INter-SEntential Transformer
ACL 2020
Discourse-Aware Neural Extractive Text Summarization
ACL 2020
Distilling Knowledge Learned in BERT for Text Generation
ACL 2020
Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning
CVPR 2020
BachGAN: High-Resolution Image Synthesis From Salient Object Layout
CVPR 2020
Violin: A Large-Scale Dataset for Video-and-Language Inference
CVPR 2020
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
ECCV 2020
Object Tracking using Spatio-Temporal Networks for Future Prediction Location
ECCV 2020
UNITER: UNiversal Image-TExt Representation Learning
ECCV 2020
Cross-Thought for Sentence Encoder Pre-training
EMNLP 2020
Contrastive Distillation on Intermediate Representations for Language Model Compression
EMNLP 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
EMNLP 2020
Multi-Fact Correction in Abstractive Text Summarization
EMNLP 2020
Contextual Text Style Transfer
EMNLP 2020
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
ICLR 2020
High-dimensional Robust Mean Estimation via Gradient Descent
ICML 2020
Faster Algorithms for High-Dimensional Robust Covariance Estimation
COLT 2019
Look across Elapse: Disentangled Representation Learning and Photorealistic Cross-Age Face Synthesis for Age-Invariant Face Recognition
AAAI 2019
Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog
ACL 2019
Relation-Aware Graph Attention Network for Visual Question Answering
ICCV 2019
Occlusion-Aware Networks for 3D Human Pose Estimation in Video
ICCV 2019
Distinguishing Distributions When Samples Are Strategically Transformed
NIPS 2019
Domain Adaptive Text Style Transfer
IJCNLP 2019
Patient Knowledge Distillation for BERT Model Compression
IJCNLP 2019
Patient Knowledge Distillation for BERT Model Compression
EMNLP 2019
Domain Adaptive Text Style Transfer
EMNLP 2019
Adversarial Category Alignment Network for Cross-domain Sentiment Classification
NAACL 2019
StoryGAN: A Sequential Conditional GAN for Story Visualization
CVPR 2019
When Samples Are Strategically Selected
ICML 2019
A Better Algorithm for Societal Tradeoffs
AAAI 2019
Non-Convex Matrix Completion Against a Semi-Random Adversary
COLT 2018
Towards Pose Invariant Face Recognition in the Wild
CVPR 2018
3D-Aided Deep Pose-Invariant Face Recognition
IJCAI 2018
Dialog-based Interactive Image Retrieval
NIPS 2018
Diverse Few-Shot Text Classification with Multiple Metrics
NAACL 2018
Sobolev GAN
ICLR 2018
Robust Learning of Fixed-Structure Bayesian Networks
NIPS 2018
Fully-Adaptive Feature Sharing in Multi-Task Networks With Applications in Person Attribute Classification
CVPR 2017
MMD GAN: Towards Deeper Understanding of Moment Matching Network
NIPS 2017
Jointly Attentive Spatial-Temporal Pooling Networks for Video-Based Person Re-Identification
ICCV 2017
S3Pool: Pooling With Stochastic Spatial Sampling
CVPR 2017
Doubly Convolutional Neural Networks
NIPS 2016
Deep Structured Energy Based Models for Anomaly Detection
ICML 2016
Walk and Learn: Facial Attribute Representation Learning From Egocentric Video and Contextual Data
CVPR 2016
On the Recursive Teaching Dimension of VC Classes
NIPS 2016
An Exploration of Parameter Redundancy in Deep Networks With Circulant Projections
ICCV 2015
Reducing infrequent-token perplexity via variational corpora
IJCNLP 2015
Efficient Sampling for Gaussian Graphical Models via Spectral Sparsification
COLT 2015
Reducing infrequent-token perplexity via variational corpora
ACL 2015
Temporal Sequence Modeling for Video Event Detection
CVPR 2014
Detecting and Tracking Disease Outbreaks by Mining Social Media Data
IJCAI 2013
Forecast Oriented Classification of Spatio-Temporal Extreme Events
IJCAI 2013