Chaoyou Fu
21 papers · 2019–2026 · 8 conferences · across top CS/AI conferences
Achievements
Renaissance Researcher (6)
Interdisciplinary Bridge
Conference Polyglot (6)
Academic Marathon (6)
Taxonomy Completionist (42)
Keyword Pioneer
Cross-Pollinator (15)
Grand Slam
Mega-Team (21)
The Questioner
Unstoppable (7)
Keyword Collector (81)
Century Club (19)
Prolific Year (7)
Conferences
CVPR (8)
NIPS (4)
ICML (3)
ICLR (2)
AAAI (1)
ACL (1)
ICCV (1)
IJCAI (1)
Keywords
image generation (3)
semantic segmentation (2)
disentangled representation (2)
multi-modal learning (2)
video understanding (2)
domain adaptation (2)
heterogeneous face recognition (2)
multimodal large language model (2)
identity swapping (2)
object detection (2)
latent space (2)
unsupervised learning (2)
noisy label learning (1)
video captioning (1)
optimal transport (1)
open-vocabulary detection (1)
transfer learning (1)
information bottleneck (1)
few-shot learning (1)
face recognition (1)
Papers
QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension
AAAI 2026
Scaling Law for Multimodal Large Language Model Supervised Fine-Tuning
ACL 2026
InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption
CVPR 2025
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
CVPR 2025
Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
ICLR 2025
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
ICLR 2025
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
ICML 2025
Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
ICML 2025
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment
ICML 2025
No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation
CVPR 2024
Aligning and Prompting Everything All at Once for Universal Visual Perception
CVPR 2024
Multi-modal Queried Object Detection in the Wild
NIPS 2023
CAPro: Webly Supervised Learning with Cross-modality Aligned Prototypes
NIPS 2023
Rethinking Image Cropping: Exploring Diverse Compositions From Global Views
CVPR 2022
CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification
ICCV 2021
Information Bottleneck Disentanglement for Identity Swapping
CVPR 2021
Pareidolia Face Reenactment
CVPR 2021
AOT: Appearance Optimal Transport Based Identity Swapping for Forgery Detection
NIPS 2020
Cross-Spectral Face Hallucination via Disentangling Independent Factors
CVPR 2020
Dual Variational Generation for Low Shot Heterogeneous Face Recognition
NIPS 2019
Neurons Merging Layer: Towards Progressive Redundancy Reduction for Deep Supervised Hashing
IJCAI 2019