Yezhou Yang
56 papers · 2011–2026 · 13 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (13)
Academic Marathon (15)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (14)
Renaissance Researcher (8)
Taxonomy Completionist (93)
Keyword Champion (3)
Deep Specialist (14)
Dynamic Duo (24)
Grand Slam
Conference Pioneer
Keyword Collector (194)
Century Club (56)
Prolific Year (11)
Unstoppable (9)
The Questioner (2)
Trend Setter
Conferences
CVPR (10)
EMNLP (9)
ACL (6)
ECCV (6)
WACV (6)
ICCV (5)
ICLR (3)
IJCNLP (3)
AAAI (2)
IJCAI (2)
NAACL (2)
ICML (1)
NIPS (1)
Keywords
diffusion model (8)
visual question answering (7)
object detection (7)
multimodal learning (6)
vision-language model (6)
action recognition (4)
knowledge distillation (4)
image captioning (4)
contrastive learning (3)
data augmentation (3)
text-to-image diffusion (3)
domain generalization (3)
text-to-image generation (3)
image generation (2)
depth estimation (2)
multi-modal learning (2)
adversarial robustness (2)
image editing (2)
weakly supervised learning (2)
adversarial learning (2)
Papers
VOCAL: Visual Odometry via ContrAstive Learning
WACV 2026
Event-based Graph Representation with Spatial and Motion Vectors for Asynchronous Object Detection
WACV 2026
VOILA: Evaluation of MLLMs For Perceptual Understanding and Analogical Reasoning
ICLR 2025
DeepShade: Enable Shade Simulation by Text-conditioned Image Generation
IJCAI 2025
AcT2I: Evaluating and Improving Action Depiction in Text-to-Image Models
EMNLP 2025
RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions
ICCV 2025
Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation
WACV 2025
FlowChef: Steering of Rectified Flow Models for Controlled Generations
ICCV 2025
WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models
CVPR 2024
ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
CVPR 2024
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation
CVPR 2024
R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model
ECCV 2024
Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model
EMNLP 2024
TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving Fine-Grained Zero-Shot Image Captioning
EMNLP 2024
Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding
WACV 2024
ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
AAAI 2024
Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts
NAACL 2024
TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives
NIPS 2024
eTraM: Event-based Traffic Monitoring Dataset
CVPR 2024
Getting it Right: Improving Spatial Consistency in Text-to-Image Models
ECCV 2024
REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models
ECCV 2024
Adversarial Bayesian Augmentation for Single-Source Domain Generalization
ICCV 2023
End-to-end Knowledge Retrieval with Multi-modal Queries
ACL 2023
Attributing Image Generative Models using Latent Fingerprints
ICML 2023
Improving Diversity With Adversarially Learned Transformations for Domain Generalization
WACV 2023
Injecting Semantic Concepts Into End-to-End Image Captioning
CVPR 2022
To Find Waldo You Need Contextual Cues: Debiasing Who's Waldo
ACL 2022
Semantically Distributed Robust Optimization for Vision-and-Language Inference
ACL 2022
CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering
EMNLP 2022
Learning Action-Effect Dynamics for Hypothetical Vision-Language Reasoning Task
EMNLP 2022
SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis
IJCNLP 2021
CLEVR_HYP: A Challenge Dataset and Baselines for Visual Question Answering with Hypothetical Actions over Images
NAACL 2021
Decentralized Attribution of Generative Models
ICLR 2021
SEED: Self-supervised Distillation For Visual Representation
ICLR 2021
Hierarchical and Partially Observable Goal-Driven Policy Learning With Goals Relational Graph
CVPR 2021
Attribute-Guided Adversarial Training for Robustness to Natural Perturbations
AAAI 2021
WeaQA: Weak Supervision via Captions for Visual Question Answering
ACL 2021
WeaQA: Weak Supervision via Captions for Visual Question Answering
IJCNLP 2021
Compressing Visual-Linguistic Model via Knowledge Distillation
ICCV 2021
Weakly Supervised Relative Spatial Reasoning for Visual Question Answering
ICCV 2021
SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis
ACL 2021
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
EMNLP 2020
Visuo-Linguistic Question Answering (VLQA) Challenge
EMNLP 2020
MUTANT: A Training Paradigm for Out-of-Distribution Generalization in Visual Question Answering
EMNLP 2020
TKD: Temporal Knowledge Distillation for Active Perception
WACV 2020
VQA-LOL: Visual Question Answering under the Lens of Logic
ECCV 2020
ViTAA: Visual-Textual Attributes Alignment in Person Search by Natural Language
ECCV 2020
Integrating Knowledge and Reasoning in Image Understanding
IJCAI 2019
Modularized Textual Grounding for Counterfactual Resilience
CVPR 2019
Transductive Unbiased Embedding for Zero-Shot Learning
CVPR 2018
Stroke Controllable Fast Style Transfer with Adaptive Receptive Fields
ECCV 2018
Grasp Type Revisited: A Modern Perspective on a Classical Feature for Vision
CVPR 2015
Learning the Semantics of Manipulation Action
ACL 2015
Learning the Semantics of Manipulation Action
IJCNLP 2015
Detection of Manipulation Action Consequences (MAC)
CVPR 2013
Corpus-Guided Sentence Generation of Natural Images
EMNLP 2011