Dong Wang

183 papers · 2010–2026 · 19 top CS/AI conferences

Achievements


🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (22) 🌈 Renaissance Researcher (6) 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird

🌍 Conference Polyglot (19) 🐝 Cross-Pollinator (11) 🏠 Conference Loyalist (21) 🔬 Deep Specialist (30) 🏆 Keyword Champion 🏆 Grand Slam 🌱 Topic Pioneer 🤝 Dynamic Duo (40) 📈 Trend Setter 🚀 Conference Pioneer 🔥 Unstoppable (16) 💎 Century Club (181) 🗃️ Keyword Collector (68) ⚡ Prolific Year (32)

Conferences

CVPR (37) INTERSPEECH (25) AAAI (23) ICCV (20) IJCAI (15) ACL (12) NIPS (9) EMNLP (9) ECCV (8) ICLR (5) COLING (4) NAACL (4) CORL (3) ICML (2) RSS (2) WACV (2) IJCNLP (1) MICCAI (1) ACML (1)

Top co-authors

Huchuan Lu (41) Bin Zhao (29) Xuelong Li (25) Zhigang Wang (22) Lantian Li (18) Yang Zhang (17) Huimin Zeng (15) Zhenrui Yue (15) Lanyu Shang (12) Bin Yan (9)

Research topics

Robotics (1) Privacy (1)

Keywords

object tracking (15) visual tracking (12) convolutional neural network (11) neural network (9) self-supervised learning (8) visual object tracking (7) attention mechanism (7) contrastive learning (7) model compression (7) speaker recognition (7) data augmentation (6) representation learning (6) image restoration (6) transformer architecture (5) siamese network (5) domain adaptation (5) recommender system (5) large language model (5) few-shot learning (5) object detection (5)

Papers

FreeGaussian: Annotation-free Control of Articulated Objects via 3D Gaussian Splats with Flow Derivatives AAAI 2026
CADTrack: Learning Contextual Aggregation with Deformable Alignment for Robust RGBT Tracking AAAI 2026
Efficient Diffusion as Low Light Enhancer CVPR 2025
DGM: Disentangled Generative Model for Detecting AD Individualized Pathological Changes via Pseudo-Healthy Synthesis MICCAI 2025
Phoenix: A Motion-based Self-Reflection Framework for Fine-grained Robotic Action Correction CVPR 2025
FastUMI: A Scalable and Hardware-Independent Universal Manipulation Interface with Dataset CORL 2025
SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Models RSS 2025
SUTrack: Towards Simple and Unified Single Object Tracking AAAI 2025
Logical DA: Enhancing Data Augmentation for Logical Reasoning via a Multi-Agent System ACL 2025
Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding AAAI 2025
Bidirectional Human–AI Collaboration for Equitable Student Performance Prediction via Deep Uncertainty Learning IJCAI 2025
MPPQ: Enhancing Post-Training Quantization for LLMs via Mixed Supervision, Proxy Rounding, and Pre-Searching IJCAI 2025
Efficient Motion Prompt Learning for Robust Visual Tracking ICML 2025
Forget the Data and Fine-Tuning! Just Fold the Network to Compress ICLR 2025
A Non-Contrastive Learning Framework for Sequential Recommendation with Preference-Preserving Profile Generation ICLR 2025
Inference Scaling for Long-Context Retrieval Augmented Generation ICLR 2025
Exploring Enhanced Contextual Information for Video-Level Object Tracking AAAI 2025
CAT: A Unified Click-and-Track Framework for Realistic Tracking ICCV 2025
AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations ICCV 2025
VehicleMAE: View-asymmetry Mutual Learning for Vehicle Re-identification Pre-training via Masked AutoEncoders ICCV 2025
MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation ICCV 2025
Anti-Tamper Protection for Unauthorized Individual Image Generation ICCV 2025
Open-Vocabulary Octree-Graph for 3D Scene Understanding ICCV 2025
Zero-Shot Cross-Domain Aspect-Based Sentiment Analysis via Domain-Contextualized Chain-of-Thought Reasoning EMNLP 2025
TALON: A Multi-Agent Framework for Long-Table Exploration and Question Answering EMNLP 2025
ProcWorld: Benchmarking Large Model Planning in Reachability-Constrained Environments EMNLP 2025
ConCISE: Confidence-guided Compression in Step-by-step Efficient Reasoning EMNLP 2025
CLIP-driven View-aware Prompt Learning for Unsupervised Vehicle Re-identification AAAI 2025
Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking AAAI 2025
Knowledge Graph Completion with Relation-Aware Anchor Enhancement AAAI 2025
SIDE: Socially Informed Drought Estimation Toward Understanding Societal Impact Dynamics of Environmental Crisis AAAI 2025
DeformCL: Learning Deformable Centerline Representation for Vessel Extraction in 3D Medical Image CVPR 2025
Think Small, Act Big: Primitive Prompt Learning for Lifelong Robot Manipulation CVPR 2025
Improving Transferable Targeted Attacks with Feature Tuning Mixup CVPR 2025
GraphMorph: Tubular Structure Extraction by Morphing Predicted Graphs NIPS 2024
KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance CORL 2024
Learning Manipulation by Predicting Interaction RSS 2024
Open-Vocabulary Federated Learning with Multimodal Prototyping NAACL 2024
Evidence-Driven Retrieval Augmented Response Generation for Online Misinformation NAACL 2024
Implicit Event-RGBD Neural SLAM CVPR 2024
HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation CVPR 2024
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters CVPR 2024
GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting CVPR 2024
LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Control and Rendering NIPS 2024
LLMs Can Evolve Continually on Modality for $\mathbb{X}$-Modal Reasoning NIPS 2024
Box2Poly: Memory-Efficient Polygon Prediction of Arbitrarily Shaped and Rotated Text AAAI 2024
Color Event Enhanced Single-Exposure HDR Imaging AAAI 2024
X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-Modal Knowledge Transfer AAAI 2024
Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models AAAI 2024
Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking AAAI 2024
Few-Shot Keyword Spotting from Mixed Speech INTERSPEECH 2024
Zero-Shot Fake Video Detection by Audio-Visual Consistency INTERSPEECH 2024
UY/CH-CHILD -- A Public Chinese L2 Speech Database of Uyghur Children INTERSPEECH 2024
A Comprehensive Investigation on Speaker Augmentation for Speaker Recognition INTERSPEECH 2024
SE/BN Adapter: Parametric Efficient Domain Adaptation for Speaker Recognition INTERSPEECH 2024
CNVSRC 2023: The First Chinese Continuous Visual Speech Recognition Challenge INTERSPEECH 2024
Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models INTERSPEECH 2024
EvSign: Sign Language Recognition and Translation with Streaming Events ECCV 2024
Leveraging the Power of Data Augmentation for Transformer-Based Tracking WACV 2024
Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments ACL 2024
Fair Federated Learning with Biased Vision-Language Models ACL 2024
Serialized Output Training by Learned Dominance INTERSPEECH 2024
Off-Policy Primal-Dual Safe Reinforcement Learning ICLR 2024
Train Once, Deploy Anywhere: Matryoshka Representation Learning for Multimodal Recommendation EMNLP 2024
Any2Point: Empowering Any-modality Transformers for Efficient 3D Understanding ECCV 2024
CN-Celeb-AV: A Multi-Genre Audio-Visual Dataset for Person Recognition INTERSPEECH 2023
Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning NIPS 2023
Cross-Domain Policy Adaptation via Value-Guided Data Filtering NIPS 2023
Affordance-Driven Next-Best-View Planning for Robotic Grasping CORL 2023
Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation AAAI 2023
Decision-Making Context Interaction Network for Click-Through Rate Prediction AAAI 2023
Direct Heterogeneous Causal Learning for Resource Allocation Problems in Marketing AAAI 2023
A Crowd-AI Collaborative Duo Relational Graph Learning Framework towards Social Impact Aware Photo Classification AAAI 2023
MetaAdapt: Domain Adaptive Few-Shot Misinformation Detection via Meta Learning ACL 2023
Zero- and Few-Shot Event Detection via Prompt-Based Meta Learning ACL 2023
Fully Self-Supervised Depth Estimation From Defocus Clue CVPR 2023
Representation Learning for Visual Object Tracking by Masked Appearance Transfer CVPR 2023
SeqTrack: Sequence to Sequence Learning for Visual Object Tracking CVPR 2023
Universal Instance Perception As Object Discovery and Retrieval CVPR 2023
Towards Benchmarking and Assessing Visual Naturalness of Physical World Adversarial Attacks CVPR 2023
Visual Prompt Multi-Modal Tracking CVPR 2023
One-Shot High-Fidelity Talking-Head Synthesis With Deformable Neural Radiance Field CVPR 2023
Propagate and Calibrate: Real-Time Passive Non-Line-of-Sight Tracking CVPR 2023
KEPL: Knowledge Enhanced Prompt Learning for Chinese Hypernym-Hyponym Extraction EMNLP 2023
HSR-Diff: Hyperspectral Image Super-Resolution via Conditional Diffusion Models ICCV 2023
Towards Nonlinear-Motion-Aware and Occlusion-Robust Rolling Shutter Correction ICCV 2023
ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding ICCV 2023
Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking ICCV 2023
Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement ICCV 2023
Safe Offline Reinforcement Learning with Real-Time Budget Constraints ICML 2023
On Adversarial Robustness of Demographic Fairness in Face Attribute Recognition IJCAI 2023
On Optimizing Model Generality in AI-based Disaster Damage Assessment: A Subjective Logic-driven Crowd-AI Hybrid Learning Approach IJCAI 2023
Spot Keywords From Very Noisy and Mixed Speech INTERSPEECH 2023
FTA-net: A Frequency and Time Attention Network for Speech Depression Detection INTERSPEECH 2023
Visualizing Data Augmentation in Deep Speaker Recognition INTERSPEECH 2023
Ordered and Binary Speaker Embedding INTERSPEECH 2023
A Multi-Scale Attentive Transformer for Multi-Instrument Symbolic Music Generation INTERSPEECH 2023
MultiQuant: Training Once for Multi-bit Quantization of Neural Networks IJCAI 2022
Feature Dense Relevance Network for Single Image Dehazing IJCAI 2022
Balanced Multimodal Learning via On-the-Fly Gradient Modulation CVPR 2022
Show, Deconfound and Tell: Image Captioning With Causal Inference CVPR 2022
Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline CVPR 2022
A Copy-Augmented Generative Model for Open-Domain Question Answering ACL 2022
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training NIPS 2022
Tight Mutual Information Estimation With Contrastive Fenchel-Legendre Optimization NIPS 2022
Learning to Weight Filter Groups for Robust Classification WACV 2022
Oriental Language Recognition (OLR) 2021: Summary and Analysis INTERSPEECH 2022
ECAPA-TDNN Based Depression Detection from Clinical Speech INTERSPEECH 2022
Reliable Visualization for Deep Speaker Recognition INTERSPEECH 2022
Gradient Importance Learning for Incomplete Observations ICLR 2022
QA Domain Adaptation using Hidden Space Augmentation and Self-Supervised Contrastive Adaptation EMNLP 2022
Domain Adaptation for Question Answering via Question Classification COLING 2022
Crowd, Expert & AI: A Human-AI Interactive Approach Towards Natural Language Explanation Based COVID-19 Misinformation Detection IJCAI 2022
On Attacking Out-Domain Uncertainty Estimation in Deep Neural Networks IJCAI 2022
D-DPCC: Deep Dynamic Point Cloud Compression via 3D Motion Prediction IJCAI 2022
Towards Grand Unification of Object Tracking ECCV 2022
Check and Link: Pairwise Lesion Correspondence Guides Mammogram Mass Detection ECCV 2022
PointScatter: Point Set Representation for Tubular Structure Extraction ECCV 2022
User Retention: A Causal Approach with Triple Task Modeling IJCAI 2021
A Streaming End-to-End Framework For Spoken Language Understanding IJCAI 2021
CLINE: Contrastive Learning with Semantic Negative Examples for Natural Language Understanding ACL 2021
LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search CVPR 2021
Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation CVPR 2021
eTREE: Learning Tree-structured Embeddings AAAI 2021
Temporal Relational Modeling with Self-Supervision for Action Segmentation AAAI 2021
CLINE: Contrastive Learning with Semantic Negative Examples for Natural Language Understanding IJCNLP 2021
Wasserstein Contrastive Representation Distillation CVPR 2021
Voting for the Right Answer: Adversarial Defense for Speaker Verification INTERSPEECH 2021
Transformer Tracking CVPR 2021
Pyramid Spatial-Temporal Aggregation for Video-Based Person Re-Identification ICCV 2021
Learning Spatio-Temporal Transformer for Visual Tracking ICCV 2021
Video Annotation for Visual Tracking via Selection and Refinement ICCV 2021
Oriental Language Recognition (OLR) 2020: Summary and Analysis INTERSPEECH 2021
ASR-Free Pronunciation Assessment INTERSPEECH 2020
High-Performance Long-Term Tracking With Meta-Updater CVPR 2020
FocalMix: Semi-Supervised Learning for 3D Medical Image Detection CVPR 2020
Cooling-Shrinking Attack: Blinding the Tracker With Imperceptible Noises CVPR 2020
Integrating User History into Heterogeneous Graph for Dialogue Act Recognition COLING 2020
Knowledge and Cross-Pair Pattern Guided Semantic Matching for Question Answering AAAI 2020
Crowd-Assisted Disaster Scene Assessment with Human-AI Interactive Attention AAAI 2020
Summarize before Aggregate: A Global-to-local Heterogeneous Graph Inference Network for Conversational Emotion Recognition COLING 2020
Neural Discriminant Analysis for Deep Speaker Embedding INTERSPEECH 2020
Domain-Invariant Speaker Vector Projection by Model-Agnostic Meta-Learning INTERSPEECH 2020
Conversational Word Embedding for Retrieval-Based Dialog System ACL 2020
A Mutual Learning Method for Salient Object Detection With Intertwined Multi-Supervision CVPR 2019
VAE-Based Regularization for Deep Speaker Embedding INTERSPEECH 2019
Visual Tracking via Adaptive Spatially-Regularized Correlation Filters CVPR 2019
Memory-Augmented Temporal Dynamic Learning for Action Recognition AAAI 2019
ROI Pooled Correlation Filters for Visual Tracking CVPR 2019
Listen to the Image CVPR 2019
On Fenchel Mini-Max Learning NIPS 2019
GradNet: Gradient-Guided Network for Visual Object Tracking ICCV 2019
'Skimming-Perusal' Tracking: A Framework for Real-Time and Robust Long-Term Tracking ICCV 2019
Exploiting Persona Information for Diverse Generation of Conversational Responses IJCAI 2019
Latent Distribution Preserving Deep Subspace Clustering IJCAI 2019
CFM: Convolutional Factorization Machines for Context-Aware Recommendation IJCAI 2019
Learning Spatial-Aware Regressions for Visual Tracking CVPR 2018
Learning to Navigate for Fine-grained Classification ECCV 2018
Real-time 'Actor-Critic' Tracking ECCV 2018
Structured Siamese Network for Real-Time Visual Tracking ECCV 2018
BRITS: Bidirectional Recurrent Imputation for Time Series NIPS 2018
Correlation Tracking via Joint Discrimination and Reliability Learning CVPR 2018
Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network CVPR 2018
Stepwise Metric Promotion for Unsupervised Video Person Re-Identification ICCV 2017
Deep Speaker Feature Learning for Text-Independent Speaker Verification INTERSPEECH 2017
A Study on Replay Attack and Anti-Spoofing for Automatic Speaker Verification INTERSPEECH 2017
Learning Uncertain Convolutional Features for Accurate Saliency Detection ICCV 2017
Memory-augmented Neural Machine Translation EMNLP 2017
Flexible and Creative Chinese Poetry Generation Using Neural Memory ACL 2017
Learning to Detect Salient Objects With Image-Level Supervision CVPR 2017
Amulet: Aggregating Multi-Level Convolutional Features for Salient Object Detection ICCV 2017
PHD: A Probabilistic Model of Hybrid Deep Collaborative Filtering for Recommender Systems ACML 2017
Discourse Mode Identification in Essays ACL 2017
Chinese Song Iambics Generation with Neural Attention-Based Model IJCAI 2016
Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation NAACL 2015
Low-Rank Matrix Factorization Under General Mixture Noise Distributions ICCV 2015
Stochastic Top-k ListNet EMNLP 2015
Visual Tracking via Probability Continuous Outlier Model CVPR 2014
Least Soft-Threshold Squares Tracking CVPR 2013
Tweet Ranking Based on Heterogeneous Networks COLING 2012
A Two-step Approach to Sentence Compression of Spoken Utterances ACL 2012
A Pilot Study of Opinion Summarization in Conversations ACL 2011
Improving Blog Polarity Classification via Topic Analysis and Adaptive Methods NAACL 2010