Hao Huang
52 papers · 2016–2026 · 15 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Renaissance Researcher (5)
Interdisciplinary Bridge
Taxonomy Completionist (11)
Conference Polyglot (15)
Academic Marathon (9)
Cross-Pollinator (3)
Conference Pioneer
Trend Setter
Keyword Collector (230)
Unstoppable (7)
Prolific Year (10)
Century Club (45)
Conferences
INTERSPEECH (15)
AAAI (10)
ACL (6)
CVPR (3)
ECCV (3)
NIPS (3)
EMNLP (2)
ICCV (2)
MICCAI (2)
COLING (1)
ICLR (1)
IJCAI (1)
IJCNLP (1)
NAACL (1)
OSDI (1)
Keywords
diffusion network (4)
speech emotion recognition (3)
end-to-end speech recognition (3)
network inference (3)
speech recognition (3)
spoken language understanding (3)
slot filling (2)
black-box attack (2)
entity tracking (2)
intent detection (2)
dialogue state tracking (2)
transfer learning (2)
retrieval-augmented generation (2)
semantic relation (2)
contrastive learning (2)
multimodal learning (2)
data augmentation (2)
object detection (2)
semi-supervised learning (2)
automatic speech recognition (2)
Papers
MVGD-Net: A Novel Motion-aware Video Glass Surface Detection Method
AAAI 2026
Vision-Only Gaussian Splatting for Collaborative Semantic Occupancy Prediction
AAAI 2026
Better Literary Translation: A Multi-Aspect Data Generation and LLM Training Approach
ACL 2026
Don't Corrupt the Fact: A Trustworthy RAG Watermarking Framework based on Dual Factual Shield
ACL 2026
LCMA-SRT: Language-Conditional Mixture-of-Experts Adapters for Joint Multilingual Speech Recognition and Translation
ACL 2026
TRACE: Traversal Retrieval-Augmented Chain of Evidence for Document Understanding
ACL 2026
Introducing Visual Scenes and Reasoning: A More Realistic Benchmark for Spoken Language Understanding
AAAI 2026
Asymmetric Matching in Abdominal Lymph Nodes of Follow-up CT Scans
MICCAI 2025
Wavelet Policy: Lifting Scheme for Policy Learning in Long-Horizon Tasks
ICCV 2025
Fast and Synchronous Crash Consistency with Metadata Write-Once File System
OSDI 2025
OpenVIS: Open-vocabulary Video Instance Segmentation
AAAI 2025
INT: Establishing Information Transfer for Multilingual Intent Detection and Slot Filling
ACL 2025
TReND: Transformer derived features and Regularized NMF for neonatal functional network Delineation
MICCAI 2025
GAMap: Zero-Shot Object Goal Navigation with Multi-Scale Geometric-Affordance Guidance
NIPS 2024
dattri: A Library for Efficient Data Attribution
NIPS 2024
Benchmarking Complex Instruction-Following with Multiple Constraints Composition
NIPS 2024
QI-IRA: Quantum-Inspired Interactive Ranking Aggregation for Person Re-identification
AAAI 2024
Energy Efficient Streaming Time Series Classification with Attentive Power Iteration
AAAI 2024
Learning Diffusions under Uncertainty
AAAI 2024
FairCLIP: Harnessing Fairness in Vision-Language Learning
CVPR 2024
Interleaving One-Class and Weakly-Supervised Models with Adaptive Thresholding for Unsupervised Video Anomaly Detection
ECCV 2024
FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification
ECCV 2024
YOLOPitch: A Time-Frequency Dual-Branch YOLO Model for Pitch Estimation
INTERSPEECH 2024
Cross-modal Features Interaction-and-Aggregation Network with Self-consistency Training for Speech Emotion Recognition
INTERSPEECH 2024
Synthesizing Long-Form Speech merely from Sentence-Level Corpus with Content Extrapolation and LLM Contextual Enrichment
INTERSPEECH 2024
An Uyghur Extension to the MASSIVE Multi-lingual Spoken Language Understanding Corpus with Comprehensive Evaluations
INTERSPEECH 2024
Scalable-DSC: A Structural Template Prompt Approach to Scalable Dialogue State Correction
EMNLP 2023
T-SEA: Transfer-Based Self-Ensemble Attack on Object Detection
CVPR 2023
Self-supervised Learning Representation based Accent Recognition with Persistent Accent Memory
INTERSPEECH 2023
MTANet: Multi-band Time-frequency Attention Network for Singing Melody Extraction from Polyphonic Music
INTERSPEECH 2023
Efficient Decision-based Black-box Patch Attacks on Video Recognition
ICCV 2023
Fine-Grained Predicates Learning for Scene Graph Generation
CVPR 2022
PM-MMUT: Boosted Phone-mask Data Augmentation using Multi-Modeling Unit Training for Phonetic-Reduction-Robust E2E Speech Recognition
INTERSPEECH 2022
A Multi-grained based Attention Network for Semi-supervised Sound Event Detection
INTERSPEECH 2022
A Graph Isomorphism Network with Weighted Multiple Aggregators for Speech Emotion Recognition
INTERSPEECH 2022
CMUA-Watermark: A Cross-Model Universal Adversarial Watermark for Combating Deepfakes
AAAI 2022
Reconstructing Diffusion Networks from Incomplete Data
IJCAI 2022
Correctable-DST: Mitigating Historical Context Mismatch between Training and Inference for Improved Dialogue State Tracking
EMNLP 2022
Understand before Answer: Improve Temporal Reading Comprehension via Precise Question Understanding
NAACL 2022
Manifold Adversarial Learning for Cross-Domain 3D Shape Representation
ECCV 2022
Adaptive Wavelet Transformer Network for 3D Shape Representation Learning
ICLR 2022
Reasoning over Entity-Action-Location Graph for Procedural Text Understanding
IJCNLP 2021
Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition
INTERSPEECH 2021
E2E-Based Multi-Task Learning Approach to Joint Speech and Accent Recognition
INTERSPEECH 2021
End-to-End Speech Separation Using Orthogonal Representation in Complex and Real Time-Frequency Domain
INTERSPEECH 2021
Reasoning over Entity-Action-Location Graph for Procedural Text Understanding
ACL 2021
Diffusion Network Inference from Partial Observations
AAAI 2021
RatE: Relation-Adaptive Translating Embedding for Knowledge Graph Completion
COLING 2020
A Lightweight Model Based on Separable Convolution for Speech Emotion Recognition
INTERSPEECH 2020
Monolingual Data Selection Analysis for English-Mandarin Hybrid Code-Switching Speech Recognition
INTERSPEECH 2020
Learning Diffusions without Timestamps
AAAI 2019
Semi-Supervised and Cross-Lingual Knowledge Transfer Learnings for DNN Hybrid Acoustic Models Under Low-Resource Conditions
INTERSPEECH 2016