Jianfei Cai
105 papers · 2014–2026 · 11 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Conference Polyglot (11)
Taxonomy Completionist (14)
Interdisciplinary Bridge
Academic Marathon (11)
Conference Loyalist (35)
Dynamic Duo (17)
Triple Crown
Grand Slam
Deep Specialist (14)
Unstoppable (12)
Conference Pioneer
Prolific Year (20)
The Questioner (2)
Keyword Collector (352)
Trend Setter
Century Club (101)
Conferences
CVPR (35)
ECCV (23)
ICCV (16)
NIPS (8)
AAAI (5)
ICLR (5)
IJCAI (4)
ICML (3)
MICCAI (3)
WACV (2)
AISTATS (1)
Keywords
attention mechanism (7)
image generation (6)
object detection (6)
vision transformer (5)
point cloud (4)
semantic segmentation (4)
image classification (4)
convolutional neural network (4)
representation learning (4)
neural network (4)
zero-shot learning (3)
self-supervised learning (3)
unsupervised learning (3)
image captioning (3)
depth estimation (3)
3d reconstruction (3)
domain adaptation (3)
bayesian inference (3)
transfer learning (3)
model compression (3)
Papers
Marginalized Generalized IoU (MGIoU): A Unified Objective Function for Optimizing Convex Parametric Shapes
AAAI 2026
PanFlow: Decoupled Motion Control for Panoramic Video Generation
AAAI 2026
PCGS: Progressive Compression of 3D Gaussian Splatting
AAAI 2026
Where and What Matters: Sensitivity-Aware Task Vectors for Many-Shot Multimodal In-Context Learning
AAAI 2026
DrVideo: Document Retrieval Based Long Video Understanding
CVPR 2025
McCaD: Multi-Contrast MRI Conditioned Adaptive Adversarial Diffusion Model for High-Fidelity MRI Synthesis
WACV 2025
New Multiple Sclerosis Lesion Segmentation via Calibrated Inter-patch Blending
MICCAI 2025
FPN-in-FPN: A Nested Multi-Scale Aggregation Network for Polyp Segmentation
MICCAI 2025
Fast Feedforward 3D Gaussian Splatting Compression
ICLR 2025
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching
ICLR 2025
PaRa: Personalizing Text-to-Image Diffusion via Parameter Rank Reduction
ICLR 2025
VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior
ICCV 2025
Point-Cache: Test-time Dynamic and Hierarchical Cache for Robust and Generalizable Point Cloud Analysis
CVPR 2025
PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting
CVPR 2025
Sharpness-Aware Data Generation for Zero-shot Quantization
ICML 2024
Normal-GS: 3D Gaussian Splatting with Normal-Involved Rendering
NIPS 2024
GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI
NIPS 2024
MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views
NIPS 2024
Point-PRC: A Prompt Learning Based Regulation Framework for Generalizable Point Cloud Analysis
NIPS 2024
How Far Can We Compress Instant-NGP-Based NeRF?
CVPR 2024
Diversified and Personalized Multi-rater Medical Image Segmentation
CVPR 2024
Taming Stable Diffusion for Text to 360 Panorama Image Generation
CVPR 2024
JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments
CVPR 2024
Efficient Stitchable Task Adaptation
CVPR 2024
Generative Region-Language Pretraining for Open-Ended Object Detection
CVPR 2024
Surface Reconstruction for 3D Gaussian Splatting via Local Structural Hints
ECCV 2024
HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression
ECCV 2024
Differentiable Convex Polyhedra Optimization from Multi-view Images
ECCV 2024
MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images
ECCV 2024
Stitched ViTs are Flexible Vision Backbones
ECCV 2024
McGrids: Monte Carlo-Driven Adaptive Grids for Iso-Surface Extraction
ECCV 2024
Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation
ECCV 2024
QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
ICLR 2024
SAM-Med3D-MoE: Towards a Non-Forgetting Segment Anything Model via Mixture of Experts for 3D Medical Image Segmentation
MICCAI 2024
ObjectSDF++: Improved Object-Compositional Neural Implicit Surfaces
ICCV 2023
Dynamic Focus-Aware Positional Queries for Semantic Segmentation
CVPR 2023
JRDB-Pose: A Large-Scale Dataset for Multi-Person Pose Estimation and Tracking
CVPR 2023
Vector Quantized Wasserstein Auto-Encoder
ICML 2023
Adversarial Local Distribution Regularization for Knowledge Distillation
WACV 2023
Transformer Scale Gate for Semantic Segmentation
CVPR 2023
Stitchable Neural Networks
CVPR 2023
MARLIN: Masked Autoencoder for Facial Video Representation LearnINg
CVPR 2023
Learning Object-Language Alignments for Open-Vocabulary Object Detection
ICLR 2023
Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning
ICCV 2023
Dual Adaptive Transformations for Weakly Supervised Point Cloud Segmentation
ECCV 2022
Object-Compositional Neural Implicit Surfaces
ECCV 2022
Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields
ECCV 2022
ExtrudeNet: Unsupervised Inverse Sketch-and-Extrude for Shape Parsing
ECCV 2022
ProposalCLIP: Unsupervised Open-Category Object Proposal Generation via Exploiting CLIP Cues
CVPR 2022
Bridging Global Context Interactions for High-Fidelity Image Completion
CVPR 2022
GMFlow: Learning Optical Flow via Global Matching
CVPR 2022
Fast Vision Transformers with HiLo Attention
NIPS 2022
MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation
NIPS 2022
Less Is More: Pay Less Attention in Vision Transformers
AAAI 2022
Particle-based Adversarial Local Distribution Regularization
AISTATS 2022
EcoFormer: Energy-Saving Attention with Linear Complexity
NIPS 2022
Multimodal Transformer with Variable-Length Memory for Vision-and-Language Navigation
ECCV 2022
The Spatially-Correlative Loss for Various Image Translation Tasks
CVPR 2021
CSG-Stump: A Learning Friendly CSG-Like Representation for Interpretable Shape Parsing
ICCV 2021
Domain-Invariant Disentangled Network for Generalizable Object Detection
ICCV 2021
High-Resolution Optical Flow From 1D Attention and Correlation
ICCV 2021
Learning Meta-Class Memory for Few-Shot Semantic Segmentation
ICCV 2021
A Unified 3D Human Motion Synthesis Model via Conditional Variational Auto-Encoder
ICCV 2021
Scalable Vision Transformers With Hierarchical Pooling
ICCV 2021
Auto-Parsing Network for Image Captioning and Visual Question Answering
ICCV 2021
RSG: A Simple but Effective Module for Learning Imbalanced Datasets
CVPR 2021
Causal Attention for Vision-Language Tasks
CVPR 2021
Exploring Bottom-Up and Top-Down Cues With Attentive Learning for Webly Supervised Object Detection
CVPR 2020
Learning Progressive Joint Propagation for Human Motion Prediction
ECCV 2020
Splitting vs. Merging: Mining Object Regions with Discrepancy and Intersection Loss for Weakly Supervised Semantic Segmentation
ECCV 2020
Self-Supervised Relationship Probing
NIPS 2020
Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning
ECCV 2020
Learning from the Scene and Borrowing from the Rich: Tackling the Long Tail in Scene Graph Generation
IJCAI 2020
End-to-End 3D Point Cloud Instance Segmentation Without Detection
CVPR 2020
Learning to Collocate Neural Modules for Image Captioning
ICCV 2019
Exploiting Spatial-Temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks
ICCV 2019
Region Deformer Networks for Unsupervised Depth Estimation from Unconstrained Monocular Videos
IJCAI 2019
Skeleton-Aware 3D Human Shape Reconstruction From Point Clouds
ICCV 2019
3D Hand Shape and Pose Estimation From a Single RGB Image
CVPR 2019
Auto-Encoding Scene Graphs for Image Captioning
CVPR 2019
Scene Graph Generation With External Knowledge and Image Reconstruction
CVPR 2019
Pluralistic Image Completion
CVPR 2019
Unpaired Image Captioning via Scene Graph Alignments
ICCV 2019
Unpaired Image Captioning by Language Pivoting
ECCV 2018
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval With Generative Models
CVPR 2018
Alive Caricature From 2D to 3D
CVPR 2018
T2Net: Synthetic-to-Realistic Translation for Solving Single-Image Depth Estimation Tasks
ECCV 2018
Zero-Annotation Object Detection with Web Knowledge Transfer
ECCV 2018
VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions
ECCV 2018
Generalized Robust Bayesian Committee Machine for Large-scale Gaussian Process Regression
ICML 2018
Quadtree Convolutional Neural Networks
ECCV 2018
Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment
ECCV 2018
Weakly-supervised 3D Hand Pose Estimation from Monocular RGB Images
ECCV 2018
Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features
ECCV 2018
A Generative Model for Depth-Based Robust 3D Facial Pose Tracking
CVPR 2017
Robust Survey Aggregation with Student-t Distribution and Sparse Representation
IJCAI 2017
Student-t Process Regression with Student-t Likelihood
IJCAI 2017
An Empirical Study of Language CNN for Image Captioning
ICCV 2017
MIML-FCN+: Multi-Instance Multi-Label Learning via Fully Convolutional Networks With Privileged Information
CVPR 2017
Object Co-Skeletonization With Co-Segmentation
CVPR 2017
Exploit Bounding Box Annotations for Multi-Label Object Recognition
CVPR 2016
Modality and Component Aware Feature Fusion For RGB-D Scene Classification
CVPR 2016
MMSS: Multi-Modal Sharable and Specific Feature Learning for RGB-D Object Recognition
ICCV 2015
Recovering Surface Details under General Unknown Illumination Using Shading and Coarse Multi-view Stereo
CVPR 2014
Compact Representation for Image Classification: To Choose or to Compress?
CVPR 2014