Yu-Chiang Frank Wang
66 papers · 2013–2026 · 11 top CS/AI conferences
Achievements
Taxonomy Completionist (11)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (5)
Conference Polyglot (10)
Conference Loyalist (23)
Grand Slam
Triple Crown
Dynamic Duo (13)
Deep Specialist (13)
Keyword Champion (3)
Conference Pioneer
Keyword Collector (250)
Trend Setter
Prolific Year (18)
Century Club (64)
Unstoppable (12)
Conferences
CVPR (23)
ICCV (8)
ECCV (7)
AAAI (6)
WACV (6)
ICLR (5)
ACL (4)
NIPS (4)
ICML (1)
INTERSPEECH (1)
MIDL (1)
Keywords
semantic segmentation (7)
representation learning (6)
adversarial learning (5)
self-supervised learning (4)
vision-language model (4)
large language model (4)
diffusion model (4)
generative adversarial network (3)
federated learning (3)
neural radiance field (3)
scene understanding (3)
person re-identification (3)
3d vision (3)
mixture of expert (3)
weakly supervised learning (3)
cross-modal learning (3)
domain adaptation (3)
few-shot learning (3)
video understanding (3)
feature disentanglement (3)
Papers
Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment
WACV 2026
Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception
ACL 2026
TA-Prompting: Enhancing Video Large Language Models for Dense Video Captioning via Temporal Anchors
WACV 2026
3D Gaussian Inpainting with Depth-Guided Cross-View Consistency
CVPR 2025
Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation
WACV 2025
Data-Efficient 3D Visual Grounding via Order-Aware Referring
WACV 2025
Histopathology Image Report Generation by Vision Language Model with Multimodal In-Context Learning
MIDL 2025
Serial Lifelong Editing via Mixture of Knowledge Experts
ACL 2025
NeKo: Cross-Modality Post-Recognition Error Correction with Tasks-Guided Mixture-of-Experts Language Model
ACL 2025
LOTUS: A Leaderboard for Detailed Image Captioning from Quality to Societal Bias and User Preferences
ACL 2025
SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP
ICLR 2025
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
ICLR 2025
Continual Personalization for Diffusion Models
ICCV 2025
Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation
ICCV 2025
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models
CVPR 2025
Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation
CVPR 2025
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
CVPR 2025
UWAV: Uncertainty-weighted Weakly-supervised Audio-Visual Video Parsing
CVPR 2025
VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models
CVPR 2025
Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering
CVPR 2025
Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration
CVPR 2025
Segment Anything, Even Occluded
CVPR 2025
Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers
ECCV 2024
ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos
NIPS 2024
Diffusion-Reward Adversarial Imitation Learning
NIPS 2024
Language-Guided Transformer for Federated Multi-Label Classification
AAAI 2024
GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding
CVPR 2024
Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction
CVPR 2024
TPA3D: Triplane Attention for Fast Text-to-3D Generation
ECCV 2024
Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models
ECCV 2024
SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation
ECCV 2024
RAPPER: Reinforced Rationale-Prompted Paradigm for Natural Language Explanation in Visual Question Answering
ICLR 2024
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
ICLR 2024
DoRA: Weight-Decomposed Low-Rank Adaptation
ICML 2024
DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment
INTERSPEECH 2024
Target-Free Text-Guided Image Manipulation
AAAI 2023
Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis
AAAI 2023
Efficient Model Personalization in Federated Learning via Client-Specific Prompt Generation
ICCV 2023
Bias-Eliminating Augmentation Learning for Debiased Federated Learning
CVPR 2023
Self-Supervised Pyramid Representation Learning for Multi-Label Visual Analysis and Beyond
WACV 2023
A Pixel-Level Meta-Learner for Weakly Supervised Few-Shot Semantic Segmentation
WACV 2022
Cross-Modal Mutual Learning for Audio-Visual Speech Recognition and Manipulation
AAAI 2022
NeurMiPs: Neural Mixture of Planar Experts for View Synthesis
CVPR 2022
Scene Graph Expansion for Semantics-Guided Image Outpainting
CVPR 2022
Adversarial Teacher-Student Representation Learning for Domain Generalization
NIPS 2021
Exploiting Audio-Visual Consistency with Partial Supervision for Spatial Audio Generation
AAAI 2021
LayoutTransformer: Scene Layout Generation With Conceptual and Spatial Diversity
CVPR 2021
Convolution in the Cloud: Learning Deformable Kernels in 3D Graph Convolution Networks for Point Cloud Analysis
CVPR 2020
Learning Identity-Invariant Motion Representations for Cross-ID Face Reenactment
CVPR 2020
Learning to Learn in a Semi-Supervised Fashion
ECCV 2020
Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation
CVPR 2019
Recover and Identify: A Generative Dual Model for Cross-Resolution Person Re-Identification
ICCV 2019
Cross-Dataset Person Re-Identification via Unsupervised Pose Disentanglement and Adaptation
ICCV 2019
A Closer Look at Few-shot Classification
ICLR 2019
Learning Resolution-Invariant Deep Representations for Person Re-Identification
AAAI 2019
Spot and Learn: A Maximum-Entropy Patch Sampler for Few-Shot Image Classification
CVPR 2019
Deep Generative Models for Weakly-Supervised Multi-Label Classification
ECCV 2018
A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation
NIPS 2018
Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation
CVPR 2018
Summarizing First-Person Videos from Third Persons' Points of View
ECCV 2018
Multi-Label Zero-Shot Learning With Structured Knowledge Graphs
CVPR 2018
No More Discrimination: Cross City Adaptation of Road Scene Segmenters
ICCV 2017
Learning Cross-Domain Landmarks for Heterogeneous Domain Adaptation
CVPR 2016
Propagated Image Filtering
CVPR 2015
Unsupervised Domain Adaptation With Imbalanced Cross-Domain Data
ICCV 2015
Coupled Dictionary and Feature Space Learning with Applications to Cross-Domain Image Synthesis and Recognition
ICCV 2013