Yi Zhu
66 papers · 2017–2025 · 18 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (18)
Academic Marathon (8)
Interdisciplinary Bridge
Keyword Pioneer
Cross-Pollinator (7)
Renaissance Researcher (10)
Hot Topic Early Bird
Dynamic Duo (14)
Grand Slam
Mega-Team (30)
Deep Specialist (11)
Topic Evolution
Keyword Collector (317)
Trend Setter
Prolific Year (13)
Conference Pioneer
Unstoppable (9)
Century Club (66)
Conferences
CVPR (12)
NIPS (10)
ICCV (9)
WACV (5)
ICLR (4)
EMNLP (4)
ACL (4)
AAAI (4)
NAACL (3)
COLING (2)
ICML (2)
ECCV (1)
EACL (1)
CONLL (1)
IJCNLP (1)
INTERSPEECH (1)
JMLR (1)
OSDI (1)
Keywords
semantic segmentation (9)
large language model (6)
contrastive learning (6)
vision-language navigation (5)
domain adaptation (4)
action recognition (3)
convolutional neural network (3)
reinforcement learning (3)
multi-modal learning (3)
cross-modal learning (3)
object localization (3)
speech synthesis (3)
transfer learning (3)
self-supervised learning (3)
text generation (3)
instance segmentation (3)
zero-shot learning (2)
few-shot learning (2)
representation learning (2)
text classification (2)
Papers
CAP-Net: A Unified Network for 6D Pose and Size Estimation of Categorical Articulated Parts from a Single RGB-D Image
CVPR 2025
DisCo: Discovering Common Affordance from Large Models for Actionable Part Perception
WACV 2025
Post-Hoc Watermarking for Robust Detection in Text Generated by Large Language Models
COLING 2025
AI4Reading: Chinese Audiobook Interpretation System Based on Multi-Agent Collaboration
ACL 2025
Collaborative Document Simplification Using Multi-Agent Systems
COLING 2025
Differential Transformer
ICLR 2025
Enabling Self-Improving Agents to Learn at Test Time With Human-In-The-Loop Guidance
EMNLP 2025
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
CVPR 2025
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
ICML 2025
UNIT: Unifying Image and Text Recognition in One Vision Encoder
NIPS 2024
nnScaler: Constraint-Guided Parallelization Plan Generation for Deep Learning Training
OSDI 2024
You Only Cache Once: Decoder-Decoder Architectures for Language Models
NIPS 2024
VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation
NIPS 2024
SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection
NIPS 2024
ParaLS: Lexical Substitution via Pretrained Paraphraser
ACL 2023
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition
NIPS 2023
PreDiff: Precipitation Nowcasting with Latent Diffusion Models
NIPS 2023
Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation
AAAI 2023
Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation
ACL 2023
Chinese Lexical Substitution: Dataset and Method
EMNLP 2023
MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation
ICCV 2023
Motion-Guided Masking for Spatiotemporal Representation Learning
ICCV 2023
Towards Geospatial Foundation Models via Continual Pretraining
ICCV 2023
ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency
ICLR 2023
AIM: Adapting Image Models for Efficient Video Action Recognition
ICLR 2023
Unsupervised Semantic Segmentation with Self-supervised Object-centric Representations
ICLR 2023
ImpDet: Exploring Implicit Fields for 3D Object Detection
WACV 2023
RelCLIP: Adapting Language-Image Pretraining for Visual Relationship Detection via Relational Contrastive Learning
EMNLP 2022
Earthformer: Exploring Space-Time Transformers for Earth System Forecasting
NIPS 2022
Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition
ICML 2022
NUTA: Non-Uniform Temporal Aggregation for Action Recognition
WACV 2022
CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation
NIPS 2022
Contrastive Instruction-Trajectory Learning for Vision-Language Navigation
AAAI 2022
ADAPT: Vision-Language Navigation With Modality-Aligned Action Prompts
CVPR 2022
Cross-modal Transfer Learning via Multi-grained Alignment for End-to-End Spoken Language Understanding
INTERSPEECH 2022
Learning Canonical F-Correlation Projection for Compact Multiview Representation
CVPR 2022
Domain Consensus Clustering for Universal Domain Adaptation
CVPR 2021
SOON: Scenario Oriented Object Navigation With Graph-Based Exploration
CVPR 2021
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters
ACL 2021
Combining Deep Generative Models and Multi-lingual Pretraining for Semi-supervised Document Classification
EACL 2021
An Unsupervised Method for Building Sentence Simplification Corpora in Multiple Languages
EMNLP 2021
CrossCLR: Cross-Modal Contrastive Learning for Multi-Modal Video Representations
ICCV 2021
VidTr: Video Transformer Without Convolutions
ICCV 2021
Self-Motivated Communication Agent for Real-World Vision-Dialog Navigation
ICCV 2021
CrossNorm and SelfNorm for Generalization Under Distribution Shifts
ICCV 2021
Scale Aware Adaptation for Land-Cover Classification in Remote Sensing Imagery
WACV 2021
Progressive Coordinate Transforms for Monocular 3D Object Detection
NIPS 2021
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters
IJCNLP 2021
Blending Anti-Aliasing into Vision Transformer
NIPS 2021
GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing
JMLR 2020
Lexical Simplification with Pretrained Encoders
AAAI 2020
Vision-Dialog Navigation by Exploring Cross-Modal Memory
CVPR 2020
Cross-Time and Orientation-Invariant Overhead Image Geolocalization Using Deep Local Features
WACV 2020
Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks
CVPR 2020
Motion-Excited Sampler: Video Adversarial Attack with Sparked Prior
ECCV 2020
Bayesian Learning for Neural Dependency Parsing
NAACL 2019
Tensor Decomposition for Multilayer Networks Clustering
AAAI 2019
Selective Sparse Sampling for Fine-Grained Image Recognition
ICCV 2019
A Systematic Study of Leveraging Subword Information for Learning Word Representations
NAACL 2019
Learning Instance Activation Maps for Weakly Supervised Instance Segmentation
CVPR 2019
On the Importance of Subword Information for Morphological Tasks in Truly Low-Resource Languages
CONLL 2019
Improving Semantic Segmentation via Video Propagation and Label Relaxation
CVPR 2019
Parsing Tweets into Universal Dependencies
NAACL 2018
Weakly Supervised Instance Segmentation Using Class Peak Response
CVPR 2018
Towards Universal Representation for Unseen Action Recognition
CVPR 2018
Soft Proposal Networks for Weakly Supervised Object Localization
ICCV 2017