Song Bai
58 papers · 2016–2026 · 9 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge
Academic Marathon (10)
Conference Polyglot (8)
Renaissance Researcher (8)
Taxonomy Completionist (91)
Keyword Pioneer
Hot Topic Early Bird
Conference Loyalist (23)
Keyword Champion (2)
Topic Evolution
Dynamic Duo (17)
Conference Pioneer
Century Club (57)
Unstoppable (11)
Keyword Collector (223)
Trend Setter
The Questioner
Prolific Year (10)
Conferences
CVPR (23)
ICCV (14)
ECCV (11)
ICLR (4)
AAAI (2)
AACL (1)
IJCNLP (1)
NIPS (1)
WACV (1)
Keywords
semantic segmentation (8)
object detection (5)
instance segmentation (4)
video object segmentation (4)
video understanding (3)
person re-identification (3)
neural network (3)
3d object retrieval (3)
diffusion model (3)
contrastive learning (2)
prompt engineering (2)
instruction following (2)
metric learning (2)
object tracking (2)
transfer learning (2)
representation learning (2)
zero-shot learning (2)
video generation (2)
model compression (2)
action recognition (2)
Papers
Learning to Animate Images from A Few Videos to Portray Delicate Human Actions
WACV 2026
TimeExpert: An Expert-Guided Video LLM for Video Temporal Grounding
ICCV 2025
Describe, Adapt and Combine: Empowering CLIP Encoders for Open-set 3D Object Retrieval
ICCV 2025
Structured Outputs in Prompt Engineering: Enhancing LLM Adaptability on Counterintuitive Instructions
IJCNLP 2025
Structured Outputs in Prompt Engineering: Enhancing LLM Adaptability on Counterintuitive Instructions
AACL 2025
Versatile Transition Generation with Image-to-Video Diffusion
ICCV 2025
DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing
CVPR 2024
Discovering Failure Modes of Text-guided Diffusion Models via Adversarial Search
ICLR 2024
PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects
ECCV 2024
Free-ATM: Harnessing Free Attention Masks for Representation Learning on Diffusion-Generated Images
ECCV 2024
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data
CVPR 2024
General Object Foundation Model for Images and Videos at Scale
CVPR 2024
InstMove: Instance Motion for Object-Centric Video Segmentation
CVPR 2023
PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
CVPR 2023
Mixed Samples as Probes for Unsupervised Model Selection in Domain Adaptation
NIPS 2023
MOSE: A New Dataset for Video Object Segmentation in Complex Scenes
ICCV 2023
SRFormer: Permuted Self-Attention for Single Image Super-Resolution
ICCV 2023
PV3D: A 3D Generative Model for Portrait Video Generation
ICLR 2023
Towards Understanding and Mitigating Dimensional Collapse in Heterogeneous Federated Learning
ICLR 2023
Is Synthetic Data from Generative Models Ready for Image Recognition?
ICLR 2023
DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion
CVPR 2022
Explicit Occlusion Reasoning for Multi-Person 3D Human Pose Estimation
ECCV 2022
Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting
ECCV 2022
Contextual Text Block Detection towards Scene Text Understanding
ECCV 2022
SeqFormer: Sequential Transformer for Video Instance Segmentation
ECCV 2022
In Defense of Online Models for Video Instance Segmentation
ECCV 2022
Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning
CVPR 2022
An Empirical Study of End-to-End Temporal Action Detection
CVPR 2022
Knowledge Distillation As Efficient Pre-Training: Faster Convergence, Higher Data-Efficiency, and Better Transferability
CVPR 2022
Fourier Document Restoration for Robust Document Dewarping and Recognition
CVPR 2022
TransMix: Attend To Mix for Vision Transformers
CVPR 2022
YouMVOS: An Actor-Centric Multi-Shot Video Object Segmentation Dataset
CVPR 2022
Multi-Shot Temporal Event Localization: A Benchmark
CVPR 2021
PlaneTR: Structure-Guided Transformers for 3D Plane Recovery
ICCV 2021
SwiftNet: Real-Time Video Object Segmentation
CVPR 2021
Anchor-Free Person Search
CVPR 2021
Holistically-Attracted Wireframe Parsing
CVPR 2020
Neural Architecture Search for Lightweight Non-Local Networks
CVPR 2020
Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against Defenses
ECCV 2020
Importance-Aware Semantic Segmentation in Self-Driving with Discrete Wasserstein Training
AAAI 2020
XingGAN for Person Image Generation
ECCV 2020
Learning Transferable Adversarial Examples via Ghost Networks
AAAI 2020
Corner Proposal Network for Anchor-free, Two-stage Object Detection
ECCV 2020
Learning Attraction Field Representation for Robust Line Segment Detection
CVPR 2019
Improving Transferability of Adversarial Examples With Input Diversity
CVPR 2019
Asymmetric Non-Local Neural Networks for Semantic Segmentation
ICCV 2019
Anchor Diffusion for Unsupervised Video Object Segmentation
ICCV 2019
CenterNet: Keypoint Triplets for Object Detection
ICCV 2019
View N-Gram Network for 3D Object Retrieval
ICCV 2019
Learn to Scale: Generating Multipolar Normalized Density Maps for Crowd Counting
ICCV 2019
Symmetry-Constrained Rectification Network for Scene Text Recognition
ICCV 2019
Prior-Aware Neural Network for Partially-Supervised Multi-Organ Segmentation
ICCV 2019
Re-Ranking via Metric Fusion for Object Retrieval and Person Re-Identification
CVPR 2019
Triplet-Center Loss for Multi-View 3D Object Retrieval
CVPR 2018
Hard-Aware Point-to-Set Deep Metric for Person Re-identification
ECCV 2018
Ensemble Diffusion for Retrieval
ICCV 2017
Scalable Person Re-Identification on Supervised Smoothed Manifold
CVPR 2017
GIFT: A Real-Time and Scalable 3D Shape Search Engine
CVPR 2016