Jianfeng Wang
48 papers · 2017–2026 · 11 top CS/AI conferences
Achievements
Conference Polyglot (11)
Academic Marathon (9)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (14)
Renaissance Researcher (7)
Taxonomy Completionist (74)
Keyword Champion (2)
Triple Crown
Deep Specialist (11)
Dynamic Duo (36)
Grand Slam
Century Club (48)
Prolific Year (10)
Unstoppable (8)
Trend Setter
Conference Pioneer
Keyword Collector (175)
Conferences
CVPR (16)
ICLR (7)
ECCV (6)
NIPS (6)
ICML (3)
AAAI (2)
ICCV (2)
IJCAI (2)
WACV (2)
ACL (1)
EMNLP (1)
Keywords
multimodal learning (9)
object detection (7)
image captioning (5)
semi-supervised learning (4)
image segmentation (4)
zero-shot learning (4)
vision-language model (4)
video generation (4)
diffusion model (4)
semantic segmentation (3)
transfer learning (3)
image classification (3)
visual question answering (3)
image generation (2)
convolutional neural network (2)
few-shot learning (2)
weak supervision (2)
open-vocabulary segmentation (2)
in-context learning (2)
autoregressive generation (2)
Papers
Zero-Shot Audio-Visual Editing via Cross-Modal Delta Denoising
WACV 2026
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
ICLR 2025
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
ICLR 2025
GenXD: Generating Any 3D and 4D Scenes
ICLR 2025
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
ICLR 2025
LiVOS: Light Video Object Segmentation with Gated Linear Matching
CVPR 2025
MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos
CVPR 2024
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
ICLR 2024
GRiT: A Generative Region-to-text Transformer for Object Understanding
ECCV 2024
Idea2Img: Iterative Self-Refinement with GPT-4V for Automatic Image Design and Generation
ECCV 2024
IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation
ECCV 2024
Bring Metric Functions into Diffusion Models
IJCAI 2024
Interfacing Foundation Models' Embeddings
NIPS 2024
Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
NIPS 2024
MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning
CVPR 2024
Segment and Caption Anything
CVPR 2024
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
ICML 2024
Prompting GPT-3 To Be Reliable
ICLR 2023
Segment Everything Everywhere All at Once
NIPS 2023
NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation
ACL 2023
ReCo: Region-Controlled Text-to-Image Generation
CVPR 2023
Generalized Decoding for Pixel, Image, and Language
CVPR 2023
Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding
CVPR 2023
NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation
ICML 2023
Learning 3D Photography Videos via Self-supervised Diffusion on Single Images
IJCAI 2023
Rethinking Bayesian Deep Learning Methods for Semi-Supervised Volumetric Medical Image Segmentation
CVPR 2022
Scaling Up Vision-Language Pre-Training for Image Captioning
CVPR 2022
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
ECCV 2022
Injecting Semantic Concepts Into End-to-End Image Captioning
CVPR 2022
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone
NIPS 2022
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
AAAI 2022
NP-Match: When Neural Processes meet Semi-Supervised Learning
ICML 2022
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
NIPS 2022
An Empirical Study of Training End-to-End Vision-and-Language Transformers
CVPR 2022
A Simple Approach and Benchmark for 21,000-Category Object Detection
ECCV 2022
TAP: Text-Aware Pre-Training for Text-VQA and Text-Caption
CVPR 2021
DAP: Detection-Aware Pre-Training With Weak Supervision
CVPR 2021
NICE: Neural Image Commenting with Empathy
EMNLP 2021
Compressing Visual-Linguistic Model via Knowledge Distillation
ICCV 2021
End-to-End Semi-Supervised Object Detection With Soft Teacher
ICCV 2021
SEED: Self-supervised Distillation For Visual Representation
ICLR 2021
RSG: A Simple but Effective Module for Learning Imbalanced Datasets
CVPR 2021
End-to-End Object Detection With Fully Convolutional Network
CVPR 2021
Anchor Box Optimization for Object Detection
WACV 2020
Label Distribution Learning on Auxiliary Label Space Graphs for Facial Expression Recognition
CVPR 2020
Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer
ECCV 2020
Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation
AAAI 2019
Gated Recurrent Convolution Neural Network for OCR
NIPS 2017