conftrace_

Jingdong Wang

153 papers · 2013–2026 · 8 top CS/AI conferences

Achievements

🧭 Keyword Pioneer · 🗺️ Taxonomy Completionist (17) · 🌉 Interdisciplinary Bridge · 🌈 Renaissance Researcher (6) · 🐣 Hot Topic Early Bird · 🌟 Keyword Trendsetter Combo (5) · 🏠 Conference Loyalist (20) · 🌱 Topic Pioneer · 🔬 Deep Specialist (21) · 🧬 Topic Evolution · 🏆 Keyword Champion (3) · 🤝 Dynamic Duo (51) · 🏆 Grand Slam · 💎 Century Club (152) · 📈 Trend Setter · 🚀 Conference Pioneer · ⚡ Prolific Year (30) · 🔥 Unstoppable (13) · ❓ The Questioner (3) · 🗃️ Keyword Collector (526)

Conferences

CVPR (56) ICCV (27) ECCV (22) NIPS (20) AAAI (9) ICLR (7) ICML (6) IJCAI (6)

Papers

EM-KD: Distilling Efficient Multimodal Large Language Model with Unbalanced Vision Tokens · AAAI 2026
SpotActor: Training-Free Layout-Controlled Consistent Image Generation · AAAI 2025
Interpretable Face Anti-Spoofing: Enhancing Generalization with Multimodal Large Language Models · AAAI 2025
DynaMind: Reasoning over Abstract Video Dynamics for Embodied Decision-Making · ICML 2025
OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation · CVPR 2025
Are Images Indistinguishable to Humans Also Indistinguishable to Classifiers? · CVPR 2025
Manifold Constraint Reduces Exposure Bias in Accelerated Diffusion Sampling · ICLR 2025
MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction · ICLR 2025
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation · ICLR 2025
Low-Biased General Annotated Dataset Generation · CVPR 2025
Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model · CVPR 2025
Action Detail Matters: Refining Video Recognition with Local Action Queries · CVPR 2025
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers · CVPR 2025
VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction · CVPR 2025
Continual SFT Matches Multimodal RLHF with Negative Supervision · CVPR 2025
Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer · CVPR 2025
TexGarment: Consistent Garment UV Texture Generation via Efficient 3D Structure-Guided Diffusion Transformer · CVPR 2025
VidEvo: Evolving Video Editing through Exhaustive Temporal Modeling · IJCAI 2025
Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing · NIPS 2024
Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation · AAAI 2024
SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation · AAAI 2024
Multi-Domain Incremental Learning for Face Presentation Attack Detection · AAAI 2024
A Multimodal, Multi-Task Adapting Framework for Video Action Recognition · AAAI 2024
Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers · ICML 2024
Towards Unified Multi-granularity Text Detection with Interactive Attention · ICML 2024
BEVSpread: Spread Voxel Pooling for Bird's-Eye-View Representation in Vision-based Roadside 3D Object Detection · CVPR 2024
GGRt: Towards Generalizable 3D Gaussians without Pose Priors in Real-Time · ECCV 2024
Timestep-Aware Correction for Quantized Diffusion Models · ECCV 2024
Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression · ECCV 2024
ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer · ECCV 2024
OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection · ECCV 2024
Automated Multi-level Preference for MLLMs · NIPS 2024
MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts · NIPS 2024
Dense Connector for MLLMs · NIPS 2024
PLIP: Language-Image Pre-training for Person Representation Learning · NIPS 2024
ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling · NIPS 2024
Flipped Classroom: Aligning Teacher Attention with Student in Generalized Category Discovery · NIPS 2024
Octopus: A Multi-modal LLM with Parallel Recognition and Sequential Understanding · NIPS 2024
SEED: A Simple and Effective 3D DETR in Point Clouds · ECCV 2024
IRGen: Generative Modeling for Image Retrieval · ECCV 2024
Interactive 3D Object Detection with Prompts · ECCV 2024
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction · ECCV 2024
Evaluation of Text-to-Video Generation Models: A Dynamics Perspective · NIPS 2024
Let the Avatar Talk using Texts without Paired Training Data · ECCV 2024
LION: Linear Group RNN for 3D Object Detection in Point Clouds · NIPS 2024
OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding · NIPS 2024
Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection · CVPR 2024
Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection · CVPR 2024
VRP-SAM: SAM with Visual Reference Prompt · CVPR 2024
MS-DETR: Efficient DETR Training with Mixed Supervision · CVPR 2024
GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding · CVPR 2024
Learning to Rematch Mismatched Pairs for Robust Cross-Modal Retrieval · CVPR 2024
Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection · NIPS 2023
HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception · NIPS 2023
DAC-DETR: Divide the Attention Layers and Conquer · NIPS 2023
Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching · ICCV 2023
CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation · ICCV 2023
Augmentation Matters: A Simple-Yet-Effective Approach to Semi-Supervised Semantic Segmentation · CVPR 2023
Instance-Specific and Model-Adaptive Supervision for Semi-Supervised Semantic Segmentation · CVPR 2023
Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection · CVPR 2023
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition With Pre-Trained Vision-Language Models · CVPR 2023
StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-Based Generator · CVPR 2023
CAPE: Camera View Position Embedding for Multi-View 3D Object Detection · CVPR 2023
PSVT: End-to-End Multi-Person 3D Pose and Shape Estimation With Progressive Video Transformers · CVPR 2023
Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers · CVPR 2023
Semi-DETR: Semi-Supervised Object Detection With Detection Transformers · CVPR 2023
Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval? · CVPR 2023
Forward Flow for Novel View Synthesis of Dynamic Scenes · ICCV 2023
Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement · ICCV 2023
StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training · ICLR 2023
Graph Contrastive Learning for Skeleton-based Action Recognition · ICLR 2023
What Can Simple Arithmetic Operations Do for Temporal Modeling? · ICCV 2023
Cyclically Disentangled Feature Translation for Face Anti-spoofing · AAAI 2023
CFCG: Semi-Supervised Semantic Segmentation via Cross-Fusion and Contour Guidance Supervision · ICCV 2023
Task-Oriented Multi-Modal Mutual Leaning for Vision-Language Models · ICCV 2023
UATVR: Uncertainty-Adaptive Text-Video Retrieval · ICCV 2023
Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection · ICCV 2023
Group Pose: A Simple Baseline for End-to-End Multi-Person Pose Estimation · ICCV 2023
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment · ICCV 2023
Robust Video Portrait Reenactment via Personalized Representation Quantization · AAAI 2023
s-Adaptive Decoupled Prototype for Few-Shot Object Detection · ICCV 2023
Unified Pre-Training with Pseudo Texts for Text-To-Image Person Re-Identification · ICCV 2023
Learning Versatile Neural Architectures by Propagating Network Codes · ICLR 2022
Delving into Sequential Patches for Deepfake Detection · NIPS 2022
RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer · NIPS 2022
Singular Value Fine-tuning: Few-shot Segmentation requires Few-parameters Fine-tuning · NIPS 2022
Human-Object Interaction Detection via Disentangled Transformer · CVPR 2022
Few-Shot Head Swapping in the Wild · CVPR 2022
Few-Shot Font Generation by Learning Fine-Grained Local Styles · CVPR 2022
MixFormer: Mixing Features Across Windows and Dimensions · CVPR 2022
Expressive Talking Head Generation With Granular Audio-Visual Control · CVPR 2022
Implicit Sample Extension for Unsupervised Person Re-Identification · CVPR 2022
ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval · CVPR 2022
GitNet: Geometric Prior-Based Transformation for Birds-Eye-View Segmentation · ECCV 2022
Action Quality Assessment with Temporal Parsing Transformer · ECCV 2022
StyleSwap: Style-Based Generator Empowers Robust Face Swapping · ECCV 2022
DaViT: Dual Attention Vision Transformers · ECCV 2022
UFO: Unified Feature Optimization · ECCV 2022
Diverse Learner: Exploring Diverse Supervision for Semi-Supervised Object Detection · ECCV 2022
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval · ECCV 2022
On the Connection between Local Attention and Dynamic Depth-wise Convolution · ICLR 2022
Self-Guided Hard Negative Generation for Unsupervised Person Re-Identification · IJCAI 2022
Conditional DETR for Fast Training Convergence · ICCV 2021
HRFormer: High-Resolution Vision Transformer for Dense Predict · NIPS 2021
Semi-Supervised Semantic Segmentation With Cross Pseudo Supervision · CVPR 2021
Admix: Enhancing the Transferability of Adversarial Attacks · ICCV 2021
Bottom-Up Human Pose Estimation via Disentangled Keypoint Regression · CVPR 2021
SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search · NIPS 2021
Lite-HRNet: A Lightweight High-Resolution Network · CVPR 2021
HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation · CVPR 2020
Weakly-Supervised Action Localization by Generative Attention Modeling · CVPR 2020
SegFix: Model-Agnostic Boundary Refinement for Segmentation · ECCV 2020
Informative Dropout for Robust Representation Learning: A Shape-bias Perspective · ICML 2020
Object-Contextual Representations for Semantic Segmentation · ECCV 2020
Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation · ECCV 2020
Efficient Semantic Video Segmentation with Per-frame Inference · ECCV 2020
Closed-Loop Matters: Dual Regression Networks for Single Image Super-Resolution · CVPR 2020
Cross View Fusion for 3D Human Pose Estimation · ICCV 2019
Disparity-preserved Deep Cross-platform Association for Cross-platform Video Recommendation · IJCAI 2019
Structured Knowledge Distillation for Semantic Segmentation · CVPR 2019
Deep High-Resolution Representation Learning for Human Pose Estimation · CVPR 2019
S4Net: Single Stage Salient-Instance Segmentation · CVPR 2019
Global-Local Temporal Representations for Video Person Re-Identification · ICCV 2019
Interleaved Structured Sparse Convolutional Neural Networks · CVPR 2018
Global Versus Localized Generative Adversarial Nets · CVPR 2018
Weakly Supervised Dense Event Captioning in Videos · NIPS 2018
Deep Convolutional Neural Networks with Merge-and-Run Mappings · IJCAI 2018
Part-Aligned Bilinear Representations for Person Re-Identification · ECCV 2018
Weakly-Supervised Semantic Segmentation Network With Deep Seeded Region Growing · CVPR 2018
Interleaved Group Convolutions · ICCV 2017
Ensemble Diffusion for Retrieval · ICCV 2017
Deeply-Learned Part-Aligned Representations for Person Re-Identification · ICCV 2017
Human Pose Estimation Using Global and Local Normalization · ICCV 2017
Random Shifting for CNN: a Solution to Reduce Information Loss in Down-Sampling Layers · IJCAI 2017
DisturbLabel: Regularizing CNN on the Loss Layer · CVPR 2016
Supervised Quantization for Similarity Search · CVPR 2016
InterActive: Inter-Layer Activeness Propagation · CVPR 2016
Collaborative Quantization for Cross-Modal Similarity Search · CVPR 2016
Co-Saliency Detection via Looking Deep and Wide · CVPR 2015
Similarity Learning on an Explicit Polynomial Kernel Feature Map for Person Re-Identification · CVPR 2015
Person Re-Identification With Correspondence Structure Learning · ICCV 2015
Quantized Correlation Hashing for Fast Cross-Modal Search · IJCAI 2015
Scalable Person Re-Identification: A Benchmark · ICCV 2015
RIDE: Reversal Invariant Descriptor Enhancement · ICCV 2015
Sparse Composite Quantization · CVPR 2015
Orientational Pyramid Matching for Recognizing Indoor Scenes · CVPR 2014
Composite Quantization for Approximate Nearest Neighbor Search · ICML 2014
Online Robust Non-negative Dictionary Learning for Visual Tracking · ICCV 2013
Fixed-Point Model For Structured Labeling · ICML 2013
Fast Neighborhood Graph Search Using Cartesian Concatenation · ICCV 2013
Supervised Kernel Descriptors for Visual Recognition · CVPR 2013
Salient Object Detection: A Discriminative Regional Feature Integration Approach · CVPR 2013
Learning CRFs for Image Parsing with Adaptive Subgradient Descent · ICCV 2013