conftrace_

Ming-Hsuan Yang

285 papers · 2006–2026 · 13 conferences · across top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🌍 Conference Polyglot (13) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🏃 Academic Marathon (19) 🗺️ Taxonomy Completionist (175) 🏠 Conference Loyalist (27) 🌟 Keyword Trendsetter Combo (14) 🏆 Grand Slam 👑 Triple Crown 🌱 Topic Pioneer 🏆 Keyword Champion 🤝 Dynamic Duo (23) 👥 Mega-Team (31) 🔬 Deep Specialist (40) 💎 Century Club (284) 🚀 Conference Pioneer ❓ The Questioner 📈 Trend Setter ⚡ Prolific Year (37) 🔥 Unstoppable (13) 🗃️ Keyword Collector (892)

Conferences

CVPR (114) ICCV (50) ECCV (39) NIPS (27) ICLR (20) WACV (15) ICML (7) AAAI (6) EMNLP (3) ACL (1) AISTATS (1) MICCAI (1) UAI (1)

Papers

Tracking the Unstable: Appearance-Guided Motion Modeling for Robust Multi-Object Tracking in UAV-Captured Videos AAAI 2026
PocoLoco: A Point Cloud Diffusion Model of Human Shape in Loose Clothing WACV 2025
UniRestore: Unified Perceptual and Task-Oriented Image Restoration Model Using Diffusion Prior CVPR 2025
AutoOcc: Automatic Open-Ended Semantic Occupancy Annotation via Vision-Language Guided Gaussian Splatting ICCV 2025
Frequency Domain-Based Diffusion Model for Unpaired Image Dehazing ICCV 2025
From Prompt to Progression: Taming Video Diffusion Models for Seamless Attribute Transition ICCV 2025
MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh ICCV 2025
FaceLift: Learning Generalizable Single Image 3D Face Reconstruction from Synthetic Heads ICCV 2025
QK-Edit: Revisiting Attention-based Injection in MM-DiT for Image and Video Editing ICCV 2025
Toward Material-Agnostic System Identification from Videos ICCV 2025
Controllable 3D Outdoor Scene Generation via Scene Graphs ICCV 2025
Learning Deblurring Texture Prior from Unpaired Data with Diffusion Model ICCV 2025
CompleteMe: Reference-based Human Image Completion ICCV 2025
Efficient Concertormer for Image Deblurring and Beyond ICCV 2025
MRFD: Multi-Region Fusion Decoding with Self-Consistency for Mitigating Hallucinations in LVLMs EMNLP 2025
Multi-subject Open-set Personalization in Video Generation CVPR 2025
Unified Dense Prediction of Video Diffusion CVPR 2025
Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation CVPR 2025
Move-in-2D: 2D-Conditioned Human Motion Generation CVPR 2025
Efficient Visual State Space Model for Image Deblurring CVPR 2025
Calibrated Multi-Preference Optimization for Aligning Diffusion Models CVPR 2025
Cropper: Vision-Language Model for Image Cropping through In-Context Learning CVPR 2025
DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning EMNLP 2025
DreaMo: Articulated 3D Reconstruction from a Single Casual Video WACV 2025
DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes CVPR 2025
TESLA: Test-time Reference-free Through-plane Super-resolution for Multi-contrast Brain MRI MICCAI 2025
Ranking-aware adapter for text-driven image ordering with CLIP ICLR 2025
Learning Spatial-Semantic Features for Robust Video Object Segmentation ICLR 2025
Three-Dimensional Trajectory Prediction with 3DMoTraj Dataset ICML 2025
Fine-Grained Controllable Video Generation via Object Appearance and Context WACV 2025
Generating Synthetic Data for Unsupervised Federated Learning of Cross-Modal Retrieval AAAI 2025
Generating Long-Take Videos via Effective Keyframes and Guidance WACV 2025
MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion ICLR 2025
Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint ICLR 2025
RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection ICLR 2025
No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images ICLR 2025
HQGS: High-Quality Novel View Synthesis with Gaussian Splatting in Degraded Scenes ICLR 2025
A Simple Approach to Unifying Diffusion-based Conditional Generation ICLR 2025
OmnixR: Evaluating Omni-modality Language Models on Reasoning across Modalities ICLR 2025
Efficient Video Object Segmentation via Modulated Cross-Attention Memory WACV 2025
RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything ICLR 2025
Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence CVPR 2024
Beyond SOT: Tracking Multiple Generic Objects at Once WACV 2024
GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting ICML 2024
VideoPrism: A Foundational Visual Encoder for Video Understanding ICML 2024
VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception ICML 2024
VideoPoet: A Large Language Model for Zero-Shot Video Generation ICML 2024
Sharing Key Semantics in Transformer Makes Efficient Image Restoration NIPS 2024
Extending Video Masked Autoencoders to 128 frames NIPS 2024
SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow NIPS 2024
BEV-MAE: Bird’s Eye View Masked Autoencoders for Point Cloud Pre-training in Autonomous Driving Scenarios AAAI 2024
CSL: Class-Agnostic Structure-Constrained Learning for Segmentation Including the Unseen AAAI 2024
StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing ACL 2024
Structured Video-Language Modeling with Temporal Grouping and Spatial Grounding ICLR 2024
Dual Associated Encoder for Face Restoration ICLR 2024
Language Model Beats Diffusion - Tokenizer is key to visual generation ICLR 2024
Personalized Video Comment Generation EMNLP 2024
Pyramid Diffusion for Fine 3D Large Scene Generation ECCV 2024
Spatial-Temporal Multi-level Association for Video Object Segmentation ECCV 2024
HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras ECCV 2024
Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts ECCV 2024
Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance ECCV 2024
Taming Latent Diffusion Model for Neural Radiance Field Inpainting ECCV 2024
Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance ECCV 2024
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers CVPR 2024
UniGS: Unified Representation for Image Generation and Segmentation CVPR 2024
Weakly Supervised Video Individual Counting CVPR 2024
RTracker: Recoverable Tracking via PN Tree Structured Memory CVPR 2024
PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection CVPR 2024
DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes CVPR 2024
Exploiting Diffusion Prior for Generalizable Dense Prediction CVPR 2024
VidToMe: Video Token Merging for Zero-Shot Video Editing CVPR 2024
GLaMM: Pixel Grounding Large Multimodal Model CVPR 2024
No More Ambiguity in 360° Room Layout via Bi-Layout Estimation CVPR 2024
Text-Driven Image Editing via Learnable Regions CVPR 2024
Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance CVPR 2024
Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring CVPR 2024
VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding CVPR 2024
Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection NIPS 2023
ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections NIPS 2023
A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence NIPS 2023
Video Timeline Modeling For News Story Understanding NIPS 2023
AIMS: All-Inclusive Multi-Level Segmentation for Anything NIPS 2023
Muse: Text-To-Image Generation via Masked Generative Transformers ICML 2023
Unveiling The Mask of Position-Information Pattern Through the Mist of Image Features ICML 2023
High Quality Entity Segmentation ICCV 2023
CiteTracker: Correlating Image and Text for Visual Tracking ICCV 2023
InfiniCity: Infinite-Scale City Synthesis ICCV 2023
SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications ICCV 2023
Generative Multiplane Neural Radiance for 3D-Aware Image Generation ICCV 2023
MiniROAD: Minimal RNN Framework for Online Action Detection ICCV 2023
CLR: Channel-wise Lightweight Reprogramming for Continual Learning ICCV 2023
Self-regulating Prompts: Foundational Model Adaptation without Forgetting ICCV 2023
Delving into Motion-Aware Matching for Monocular 3D Object Tracking ICCV 2023
SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image ICCV 2023
Unified Visual Relationship Detection with Vision and Language Models ICCV 2023
Counting Crowds in Bad Weather ICCV 2023
Module-wise Adaptive Distillation for Multimodality Foundation Models NIPS 2023
Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection CVPR 2023
Burstormer: Burst Image Restoration and Enhancement Transformer CVPR 2023
Self-Supervised Super-Plane for Neural 3D Reconstruction CVPR 2023
Hi-LASSIE: High-Fidelity Articulated Shape and Skeleton Discovery From Sparse Image Ensemble CVPR 2023
Self-Supervised AutoFlow CVPR 2023
Learning To Dub Movies via Hierarchical Prosody Models CVPR 2023
Improving Zero-Shot Generalization and Robustness of Multi-Modal Models CVPR 2023
MAGVIT: Masked Generative Video Transformer CVPR 2023
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs NIPS 2023
Learning Discriminative Shrinkage Deep Networks for Image Deconvolution ECCV 2022
LASSIE: Learning Articulated Shapes from Sparse Image Ensemble via 3D Part Discovery NIPS 2022
Contextualized Spatio-Temporal Contrastive Learning With Self-Supervision CVPR 2022
Video Frame Interpolation Transformer CVPR 2022
Burst Image Restoration and Enhancement CVPR 2022
Restormer: Efficient Transformer for High-Resolution Image Restoration CVPR 2022
Hierarchical Modular Network for Video Captioning CVPR 2022
InOut: Diverse Image Outpainting via GAN Inversion CVPR 2022
Learning Visibility for Robust Dense Human Body Estimation ECCV 2022
Autoregressive 3D Shape Generation via Canonical Mapping ECCV 2022
Class-Agnostic Object Detection with Multi-modal Transformer ECCV 2022
Adaptive Transformers for Robust Few-Shot Cross-Domain Face Anti-Spoofing ECCV 2022
Scraping Textures from Natural Images for Synthesis and Editing ECCV 2022
CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation ECCV 2022
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer ECCV 2022
Learning Continuous Environment Fields via Implicit Functions ICLR 2022
InfinityGAN: Towards Infinite-Pixel Image Synthesis ICLR 2022
ViDT: An Efficient and Effective Fully Transformer-based Object Detector ICLR 2022
Incremental False Negative Detection for Contrastive Learning ICLR 2022
Federated Multi-Target Domain Adaptation WACV 2022
Video Salient Object Detection via Contrastive Features and Attention Modules WACV 2022
Semi-Supervised Multi-Task Learning for Semantics and Depth WACV 2022
Learning 3D Dense Correspondence via Canonical Point Autoencoder NIPS 2021
Exploring Cross-Video and Cross-Modality Signals for Weakly-Supervised Audio-Visual Video Parsing NIPS 2021
Intriguing Properties of Vision Transformers NIPS 2021
End-to-end Multi-modal Video Temporal Grounding NIPS 2021
Discovering 3D Parts From Image Collections ICCV 2021
Semi-Supervised Learning with Meta-Gradient AISTATS 2021
Hybrid Neural Fusion for Full-Frame Video Stabilization ICCV 2021
COMISR: Compression-Informed Video Super-Resolution ICCV 2021
Learning To Stylize Novel Views ICCV 2021
The Road To Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation ICCV 2021
ReMix: Towards Image-to-Image Translation With Limited Data CVPR 2021
Regularizing Generative Adversarial Networks Under Limited Data CVPR 2021
Decoupled Dynamic Filter Networks CVPR 2021
Spatiotemporal Contrastive Video Representation Learning CVPR 2021
D2-Net: Weakly-Supervised Action Localization via Discriminative Embeddings and Denoised Activations ICCV 2021
Video Matting via Consistency-Regularized Graph Neural Networks ICCV 2021
Structured sparsification with joint optimization of group convolution and channel shuffle UAI 2021
Multi-Stage Progressive Image Restoration CVPR 2021
Benchmarking Ultra-High-Definition Image Super-Resolution ICCV 2021
Controllable and Progressive Image Extrapolation WACV 2021
Multi-Path Neural Networks for On-Device Multi-Domain Visual Classification WACV 2021
Modeling Artistic Workflows for Image Generation and Editing ECCV 2020
Generalized Convolutional Forest Networks for Domain Generalization and Visual Recognition ICLR 2020
Learning Enriched Features for Real Image Restoration and Enhancement ECCV 2020
Image Hashing via Linear Discriminant Learning WACV 2020
Fast Video Multi-Style Transfer WACV 2020
Adversarial Training with Bi-directional Likelihood Regularization for Visual Classification ECCV 2020
Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline CVPR 2020
Composing Good Shots by Exploiting Mutual Relations CVPR 2020
CycleISP: Real Image Restoration via Improved Data Synthesis CVPR 2020
Multi-Scale Boosted Dehazing Network With Dense Feature Fusion CVPR 2020
Collaborative Distillation for Ultra-Resolution Universal Style Transfer CVPR 2020
Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition From a Domain Adaptation Perspective CVPR 2020
Weakly-Supervised Semantic Segmentation via Sub-Category Exploration CVPR 2020
Learning to See Through Obstructions CVPR 2020
Visual Question Answering on 360° Images WACV 2020
Progressive Domain Adaptation for Object Detection WACV 2020
Online Adaptation for Consistent Mesh Reconstruction in the Wild NIPS 2020
Adversarial Learning of Privacy-Preserving and Task-Oriented Representations AAAI 2020
Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation ICLR 2020
Neural Design Network: Graphic Layout Generation with Constraints ECCV 2020
Controllable Image Synthesis via SegVAE ECCV 2020
RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval ECCV 2020
Learnable Cost Volume Using the Cayley Representation ECCV 2020
Every Pixel Matters: Center-aware Feature Alignment for Domain Adaptive Object Detector ECCV 2020
Video Object Detection via Object-level Temporal Aggregation ECCV 2020
Self-supervised Single-view 3D Reconstruction via Semantic Consistency ECCV 2020
Inserting Videos Into Videos CVPR 2019
Learning Linear Transformations for Fast Image and Video Style Transfer CVPR 2019
Depth-Aware Video Frame Interpolation CVPR 2019
CrDoCo: Pixel-Level Domain Transfer With Cross-Domain Consistency CVPR 2019
Spatially Variant Linear Representation Models for Joint Filtering CVPR 2019
Im2Pencil: Controllable Pencil Illustration From Photographs CVPR 2019
Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis CVPR 2019
Target-Aware Deep Tracking CVPR 2019
SCOPS: Self-Supervised Co-Part Segmentation CVPR 2019
Eidetic 3D LSTM: A Model for Video Prediction and Beyond ICLR 2019
Learning Attribute-Specific Representations for Visual Tracking AAAI 2019
Quadratic Video Interpolation NIPS 2019
Dancing to Music NIPS 2019
Joint-task Self-supervised Learning for Temporal Correspondence NIPS 2019
Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments CVPR 2019
DFT-based Transformation Invariant Pooling Layer for Visual Classification ECCV 2018
Sub-GAN: An Unsupervised Generative Model via Subspaces ECCV 2018
Deep Regression Tracking with Shrinkage Loss ECCV 2018
Rendering Portraitures from Monocular Camera and Beyond ECCV 2018
Superpixel Sampling Networks ECCV 2018
A Closed-form Solution to Photorealistic Image Stylization ECCV 2018
Deep Semantic Face Deblurring CVPR 2018
Online Multi-Object Tracking with Dual Matching Attention Networks ECCV 2018
Learning Data Terms for Non-blind Deblurring ECCV 2018
Deep Non-Blind Deconvolution via Generalized Low-Rank Approximation NIPS 2018
Deep Attentive Tracking via Reciprocative Learning NIPS 2018
Context-aware Synthesis and Placement of Object Instances NIPS 2018
Unsupervised holistic image generation from key local patches ECCV 2018
Weakly Supervised Coupled Networks for Visual Sentiment Analysis CVPR 2018
Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking CVPR 2018
Learning to Localize Sound Source in Visual Scenes CVPR 2018
Correlation Tracking via Joint Discrimination and Reliability Learning CVPR 2018
Learning Superpixels With Segmentation-Aware Affinity Loss CVPR 2018
Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks CVPR 2018
SPLATNet: Sparse Lattice Networks for Point Cloud Processing CVPR 2018
Learning Dual Convolutional Neural Networks for Low-Level Vision CVPR 2018
PiCANet: Learning Pixel-Wise Contextual Attention for Saliency Detection CVPR 2018
Gated Fusion Network for Single Image Dehazing CVPR 2018
Switchable Temporal Propagation Network ECCV 2018
Learning to Adapt Structured Output Space for Semantic Segmentation CVPR 2018
Learning Blind Video Temporal Consistency ECCV 2018
Learning to Blend Photos ECCV 2018
Fast and Accurate Online Video Object Segmentation via Tracking Parts CVPR 2018
Learning a Discriminative Prior for Blind Image Deblurring CVPR 2018
Diverse Image-to-Image Translation via Disentangled Representations ECCV 2018
Flow-Grounded Spatial-Temporal Video Prediction from Still Images ECCV 2018
Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation CVPR 2018
VITAL: VIsual Tracking via Adversarial Learning CVPR 2018
Learning Spatial-Aware Regressions for Visual Tracking CVPR 2018
Learning Discriminative Data Fitting Functions for Blind Image Deblurring ICCV 2017
Learning to Super-Resolve Blurry Face and Text Images ICCV 2017
Unsupervised Representation Learning by Sorting Sequences ICCV 2017
SegFlow: Joint Learning for Video Object Segmentation and Optical Flow ICCV 2017
Video Deblurring via Semantic Segmentation and Pixel-Wise Non-Linear Kernel ICCV 2017
Blind Image Deblurring With Outlier Handling ICCV 2017
CREST: Convolutional Residual Learning for Visual Tracking ICCV 2017
Scene Parsing With Global Context Embedding ICCV 2017
Unsupervised Domain Adaptation for Face Recognition in Unlabeled Videos ICCV 2017
Referring Expression Generation and Comprehension via Attributes ICCV 2017
Semi-Supervised Learning for Optical Flow with Generative Adversarial Networks NIPS 2017
Learning Fully Convolutional Networks for Iterative Non-Blind Deconvolution CVPR 2017
Deep Image Harmonization CVPR 2017
Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution CVPR 2017
Universal Style Transfer via Feature Transforms NIPS 2017
Learning Affinity via Spatial Propagation Networks NIPS 2017
Multi-Task Correlation Particle Filter for Robust Object Tracking CVPR 2017
Diversified Texture Synthesis With Feed-Forward Networks CVPR 2017
Generative Face Completion CVPR 2017
Soft-Segmentation Guided Object Motion Deblurring CVPR 2016
Object Contour Detection With a Fully Convolutional Encoder-Decoder Network CVPR 2016
Hedged Deep Tracking CVPR 2016
Object Tracking via Dual Linear Structured SVM and Explicit Feature Map CVPR 2016
Video Segmentation via Object Flow CVPR 2016
Weakly Supervised Object Localization With Progressive Domain Adaptation CVPR 2016
Robust Kernel Estimation With Outliers Handling for Image Deblurring CVPR 2016
Image Deblurring Using Smartphone Inertial Sensors CVPR 2016
A Comparative Study for Single Image Blind Deblurring CVPR 2016
Blind Image Deblurring Using Dark Channel Prior CVPR 2016
Online Multi-Object Tracking via Structural Constraint Event Aggregation CVPR 2016
Hierarchical Convolutional Features for Visual Tracking ICCV 2015
Deep Networks for Saliency Detection via Local Estimation and Global Search CVPR 2015
Multi-Objective Convolutional Learning for Face Labeling CVPR 2015
Multi-Instance Object Segmentation With Occlusion Handling CVPR 2015
Long-Term Correlation Tracking CVPR 2015
Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis NIPS 2015
JOTS: Joint Online Tracking and Segmentation CVPR 2015
Fast and Accurate Head Pose Estimation via Random Projection Forests ICCV 2015
What Makes an Object Memorable? ICCV 2015
Structural Sparse Tracking CVPR 2015
Adaptive Region Pooling for Object Detection CVPR 2015
PatchCut: Data-Driven Object Segmentation via Local Shape Transfer CVPR 2015
Salient Object Detection via Bootstrap Learning CVPR 2015
Context Driven Scene Parsing with Attention to Rare Classes CVPR 2014
Deblurring Low-light Images with Light Streaks CVPR 2014
Max-Margin Boltzmann Machines for Object Segmentation CVPR 2014
Joint Depth Estimation and Camera Shake Removal from Single Blurry Image CVPR 2014
Deblurring Text Images via L0-Regularized Intensity and Gradient Prior CVPR 2014
A New Image Quality Metric for Image Auto-denoising ICCV 2013
Fast Direct Super-Resolution by Simple Functions ICCV 2013
Saliency Detection via Dense and Sparse Reconstruction ICCV 2013
Saliency Detection via Absorbing Markov Chain ICCV 2013
Visual Tracking via Locality Sensitive Histograms CVPR 2013
Structured Face Hallucination CVPR 2013
Saliency Detection via Graph-Based Manifold Ranking CVPR 2013
Online Object Tracking: A Benchmark CVPR 2013
Least Soft-Threshold Squares Tracking CVPR 2013
Exemplar Cut ICCV 2013
Multiple Non-rigid Surface Detection and Registration ICCV 2013
Detecting Humans via Their Pose NIPS 2006