Yan Huang

74 papers · 2015–2026 · 17 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (17) 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (17) 🐣 Hot Topic Early Bird 🌟 Keyword Trendsetter Combo (4) 🏆 Grand Slam 🤝 Dynamic Duo (31) 🔥 Unstoppable (12) 💎 Century Club (72) 🚀 Conference Pioneer 📈 Trend Setter 🗃️ Keyword Collector (332) ⚡ Prolific Year (6)

Conferences

CVPR (15) AAAI (12) ICCV (10) NIPS (8) INTERSPEECH (5) ECCV (4) IJCAI (4) ACL (3) AACL (2) COLING (2) ICML (2) MICCAI (2) WACV (1) JMLR (1) IJCNLP (1) ICLR (1) EMNLP (1)

Top co-authors

Liang Wang (32) Tieniu Tan (6) Jinming Xu (5) Heng Fan (5) Wei Wang (5) Yifan Gong (5) Qiang Wu (4) Wanli Ouyang (4) Eduardo Blanco (3) Zehan Zhu (3)

Research topics

Computer Vision (1) Differential Privacy (1)

Keywords

recurrent neural network (5) representation learning (5) person re-identification (5) domain adaptation (4) attention mechanism (4) multimodal learning (4) image sentence matching (3) text classification (3) image-sentence matching (3) reinforcement learning (3) convolutional neural network (3) zero-shot learning (3) long short-term memory (3) metric learning (3) vision-language navigation (3) weakly supervised learning (3) speech recognition (3) acoustic model (3) sparse coding (2) differential privacy (2)

Papers

SymNet: A Multi-Task Network for Joint Radio Map Reconstruction and Transmitter Localization · WACV 2026
Gait Transformer: End-to-End Transformer Backbone for Gait Recognition · AAAI 2026
Enhancing Generalization of Depth Estimation Foundation Model via Weakly-Supervised Adaptation with Regularization · AAAI 2026
DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration · ACL 2025
Open-Vocabulary Octree-Graph for 3D Scene Understanding · ICCV 2025
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding · ICLR 2025
Enhanced Visual-Semantic Interaction with Tailored Prompts for Pedestrian Attribute Recognition · CVPR 2025
PRVQL: Progressive Knowledge-guided Refinement for Robust Egocentric Visual Query Localization · ICCV 2025
Learning Fine-Grained Alignment for Aerial Vision-Dialog Navigation · AAAI 2025
Frequency-domain Multi-modal Fusion for Language-guided Medical Image Segmentation · MICCAI 2025
RoLocMe: A Robust Multi-agent Source Localization System with Learning-based Map Estimation · IJCAI 2025
Dyn-D^2P: Dynamic Differentially Private Decentralized Learning with Provable Utility Guarantee · IJCAI 2025
Distributed Stochastic Bilevel Optimization: Improved Complexity and Heterogeneity Analysis · JMLR 2025
DATA: Domain-And-Time Alignment for High-Quality Feature Fusion in Collaborative Perception · ICCV 2025
EC-Flow: Enabling Versatile Robotic Manipulation from Action-Unlabeled Videos via Embodiment-Centric Flow · ICCV 2025
HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection · AAAI 2025
Zero-Shot Low-Light Image Enhancement via Latent Diffusion Models · AAAI 2025
SPENet: Self-guided Prototype Enhancement Network for Few-shot Medical Image Segmentation · MICCAI 2025
Enhancing Marker Scoring Accuracy through Ordinal Confidence Modelling in Educational Assessments · ACL 2025
Free Lunch for Gait Recognition: A Novel Relation Descriptor · ECCV 2024
Everyday Object Meets Vision-and-Language Navigation Agent via Backdoor · NIPS 2024
TDeLTA: A Light-Weight and Robust Table Detection Method Based on Learning Text Arrangement · AAAI 2024
Selective and Orthogonal Feature Activation for Pedestrian Attribute Recognition · AAAI 2024
Analyzing Large Language Models’ Capability in Location Prediction · COLING 2024
PrivSGP-VR: Differentially Private Variance-Reduced Stochastic Gradient Push with Tight Utility Bounds · IJCAI 2024
Context-Guided Spatio-Temporal Video Grounding · CVPR 2024
Attribute-Guided Pedestrian Retrieval: Bridging Person Re-ID with Internal Attribute Variability · CVPR 2024
Investigating Compositional Challenges in Vision-Language Models for Visual Grounding · CVPR 2024
Achieving Near-Optimal Convergence for Distributed Minimax Optimization with Adaptive Stepsizes · NIPS 2024
Frequency-Enhanced Data Augmentation for Vision-and-Language Navigation · NIPS 2023
Context Helps Determine Spatial Knowledge from Tweets · IJCNLP 2023
Gender-tuning: Empowering Fine-tuning for Debiasing Pre-trained Language Models · ACL 2023
PlanarTrack: A Large-scale Challenging Benchmark for Planar Object Tracking · ICCV 2023
Context Helps Determine Spatial Knowledge from Tweets · AACL 2023
Bag of Tricks for Training Data Extraction from Language Models · ICML 2023
Tackling Data Heterogeneity: A New Unified Framework for Decentralized SGD with Sample-induced Topology · ICML 2022
Regularized Graph Structure Learning with Semantic Knowledge for Multi-variates Time-Series Forecasting · IJCAI 2022
MACK: Multimodal Aligned Conceptual Knowledge for Unpaired Image-text Matching · NIPS 2022
Generalizable Person Re-identification via Self-Supervised Batch Norm Test-Time Adaption · AAAI 2022
Clothing Status Awareness for Long-Term Person Re-Identification · ICCV 2021
Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision · NIPS 2021
Knowledge-aware Leap-LSTM: Integrating Prior Knowledge into Leap-LSTM towards Faster Long Text Classification · AAAI 2021
Rethinking the Heatmap Regression for Bottom-Up Human Pose Estimation · CVPR 2021
Rapid Speaker Adaptation for Conformer Transducer: Attention and Bias Are All You Need · INTERSPEECH 2021
Learning Goal-oriented Dialogue Policy with opposite Agent Awareness · AACL 2020
Unfolding the Alternating Optimization for Blind Super Resolution · NIPS 2020
Part-Level Graph Convolutional Network for Skeleton-Based Action Recognition · AAAI 2020
Relational Prototypical Network for Weakly Supervised Temporal Action Localization · AAAI 2020
Pointing to Select: A Fast Pointer-LSTM for Long Text Classification · COLING 2020
Towards Part-aware Monocular 3D Human Pose Estimation: An Architecture Search Approach · ECCV 2020
Prediction and Recovery for Adaptive Low-Resolution Person Re-Identification · ECCV 2020
Rapid RNN-T Adaptation Using Personalized Speech Synthesis and Neural Language Generator · INTERSPEECH 2020
ACMM: Aligned Cross-Modal Memory for Few-Shot Image and Sentence Matching · ICCV 2019
Local Relationship Learning With Person-Specific Shape Regularization for Facial Action Unit Detection · CVPR 2019
Box-Driven Class-Wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation · CVPR 2019
Language-Driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model · CVPR 2019
Few-Shot Image and Sentence Matching via Gated Visual-Semantic Embedding · AAAI 2019
SBSGAN: Suppression of Inter-Domain Background Shift for Person Re-Identification · ICCV 2019
Cross-Modal Ranking with Soft Consistency and Noisy Labels for Robust RGB-T Tracking · ECCV 2018
M3: Multimodal Memory Modelling for Video Captioning · CVPR 2018
RetGK: Graph Kernels based on Return Probabilities of Random Walks · NIPS 2018
Learning Semantic Concepts and Order for Image and Sentence Matching · CVPR 2018
Aligning Infinite-Dimensional Covariance Matrices in Reproducing Kernel Hilbert Spaces for Domain Adaptation · CVPR 2018
Mask-Guided Contrastive Attention Model for Person Re-Identification · CVPR 2018
Instance-Aware Image and Sentence Matching With Selective Multimodal LSTM · CVPR 2017
Improving Mask Learning Based Speech Enhancement System with Restoration Layers and Residual Connection · INTERSPEECH 2017
Don’t Count on ASR to Transcribe for You: Breaking Bias with Two Crowds · INTERSPEECH 2017
See the Forest for the Trees: Joint Spatial and Temporal Recurrent Neural Networks for Video-Based Person Re-Identification · CVPR 2017
Sparse Coding for Classification via Discrimination Ensemble · CVPR 2016
Anchoring and Agreement in Syntactic Annotations · EMNLP 2016
Semi-Supervised Training in Deep Learning Acoustic Model · INTERSPEECH 2016
Conditional High-Order Boltzmann Machine: A Supervised Learning Model for Relation Learning · ICCV 2015
Dynamic Texture Recognition via Orthogonal Tensor Dictionary Learning · ICCV 2015
Bidirectional Recurrent Convolutional Networks for Multi-Frame Super-Resolution · NIPS 2015