conftrace_

Jun Du

58 papers · 2016–2026 · 12 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (21) · 🧭 Keyword Pioneer · 🌈 Renaissance Researcher (5) · 🌉 Interdisciplinary Bridge · 🌍 Conference Polyglot (12) · 🐣 Hot Topic Early Bird · 🏠 Conference Loyalist (40) · 🤝 Dynamic Duo (27) · 🧬 Topic Evolution · 🏆 Grand Slam · 🔬 Deep Specialist (12) · 🏆 Keyword Champion (4) · 🔥 Unstoppable (10) · 🚀 Conference Pioneer · ⚡ Prolific Year (8) · 💎 Century Club (56) · 🗃️ Keyword Collector (64)

Conferences

INTERSPEECH (40) · AAAI (6) · CVPR (2) · IJCAI (2) · ACL (1) · ECCV (1) · EMNLP (1) · ICCV (1) · ICLR (1) · ICML (1) · MICCAI (1) · NIPS (1)

Papers

READ: Real-time and Efficient Asynchronous Diffusion for Audio-driven Talking Head Generation · AAAI 2026
Binary-Gaussian: Compact and Progressive Representation for 3D Gaussian Segmentation · AAAI 2026
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking head Video Generation · ICLR 2025
MISP-Meeting: A Real-World Dataset with Multimodal Cues for Long-form Meeting Transcription and Summarization · ACL 2025
QA-MDT: Quality-aware Masked Diffusion Transformer for Enhanced Music Generation · IJCAI 2025
EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion · CVPR 2025
DocMamba: Efficient Document Pre-training with State Space Model · AAAI 2025
RFL: Simplifying Chemical Structure Recognition with Ring-Free Language · AAAI 2025
Latent Swap Joint Diffusion for 2D Long-Form Latent Generation · ICCV 2025
AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection · INTERSPEECH 2024
SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding · NIPS 2024
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition · CVPR 2024
NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition · ECCV 2024
UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition · EMNLP 2024
SEMv3: A Fast and Robust Approach to Table Separation Line Detection · IJCAI 2024
Topological GCN for Improving Detection of Hip Landmarks from B-Mode Ultrasound Images · MICCAI 2024
Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design · INTERSPEECH 2024
Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement · INTERSPEECH 2023
HRDoc: Dataset and Baseline Method toward Hierarchical Reconstruction of Document Structures · AAAI 2023
AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in the SUPERB Benchmark · INTERSPEECH 2023
A Multiple-Teacher Pruning Based Self-Distillation (MT-PSD) Approach to Model Compression for Audio-Visual Wake Word Spotting · INTERSPEECH 2023
Unsupervised Adaptation with Quality-Aware Masking to Improve Target-Speaker Voice Activity Detection for Speaker Diarization · INTERSPEECH 2023
Online Speaker Diarization with Core Samples Selection · INTERSPEECH 2022
Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis · INTERSPEECH 2022
Deep Segment Model for Acoustic Scene Classification · INTERSPEECH 2022
External Text Based Data Augmentation for Low-Resource Speech Recognition in the Constrained Condition of OpenASR21 Challenge · INTERSPEECH 2022
TDv2: A Novel Tree-Structured Decoder for Offline Mathematical Expression Recognition · AAAI 2022
Audio-Visual Wake Word Spotting in MISP2021 Challenge: Dataset Release and Deep Analysis · INTERSPEECH 2022
End-to-End Audio-Visual Neural Speaker Diarization · INTERSPEECH 2022
Lightweight Causal Transformer with Local Self-Attention for Real-Time Speech Enhancement · INTERSPEECH 2021
Automatic Lip-Reading with Hierarchical Pyramidal Convolution and Self-Attention for Image Sequences with No Word Boundaries · INTERSPEECH 2021
Scenario-Dependent Speaker Diarization for DIHARD-III Challenge · INTERSPEECH 2021
Target-Speaker Voice Activity Detection with Improved i-Vector Estimation for Unknown Number of Speaker · INTERSPEECH 2021
The Third DIHARD Diarization Challenge · INTERSPEECH 2021
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario · INTERSPEECH 2021
Audio-Visual Information Fusion Using Cross-Modal Teacher-Student Learning for Voice Activity Detection in Realistic Environments · INTERSPEECH 2021
A Maximum Likelihood Approach to SNR-Progressive Learning Using Generalized Gaussian Distribution for LSTM-Based Speech Enhancement · INTERSPEECH 2021
A Space-and-Speaker-Aware Iterative Mask Estimation Approach to Multi-Channel Speech Recognition in the CHiME-6 Challenge · INTERSPEECH 2020
An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances · INTERSPEECH 2020
Adaptive Speaker Normalization for CTC-Based Speech Recognition · INTERSPEECH 2020
An Adaptive X-Vector Model for Text-Independent Speaker Verification · INTERSPEECH 2020
Using Speech Enhancement Preprocessing for Speech Emotion Recognition in Realistic Noisy Conditions · INTERSPEECH 2020
A Noise-Aware Memory-Attention Network Architecture for Regression-Based Speech Enhancement · INTERSPEECH 2020
A Tree-Structured Decoder for Image-to-Markup Generation · ICML 2020
Unsupervised Regularization-Based Adaptive Training for Speech Recognition · INTERSPEECH 2020
Multi-Task Learning with High-Order Statistics for x-Vector Based Text-Independent Speaker Verification · INTERSPEECH 2019
Acoustic Model Ensembling Using Effective Data Augmentation for CHiME-5 Challenge · INTERSPEECH 2019
KL-Divergence Regularized Deep Neural Network Adaptation for Low-Resource Speaker-Dependent Speech Enhancement · INTERSPEECH 2019
A Cross-Entropy-Guided (CEG) Measure for Speech Enhancement Front-End Assessing Performances of Back-End Automatic Speech Recognition · INTERSPEECH 2019
A Hybrid Approach to Acoustic Scene Classification Based on Universal Acoustic Models · INTERSPEECH 2019
Neural Text Clustering with Document-Level Attention Based on Dynamic Soft Labels · INTERSPEECH 2019
The Second DIHARD Diarization Challenge: Dataset, Task, and Baselines · INTERSPEECH 2019
Deep Neural Network Embeddings with Gating Mechanisms for Text-Independent Speaker Verification · INTERSPEECH 2019
Error Modeling via Asymmetric Laplace Distribution for Deep Neural Network Based Single-Channel Speech Enhancement · INTERSPEECH 2018
Speaker Diarization with Enhancing Speech for the First DIHARD Challenge · INTERSPEECH 2018
A Maximum Likelihood Approach to Deep Neural Network Based Nonlinear Spectral Mapping for Single-Channel Speech Separation · INTERSPEECH 2017
On Design of Robust Deep Models for CHiME-4 Multi-Channel Speech Recognition with Multiple Configurations of Array Microphones · INTERSPEECH 2017
SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement · INTERSPEECH 2016