conftrace_

Jianhua Tao

82 papers · 2016–2026 · 9 conferences · across top CS/AI conferences

Achievements



🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (36) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🐣 Hot Topic Early Bird

🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (36) 🏠 Conference Loyalist (58) 🤝 Dynamic Duo (35) 🔬 Deep Specialist (15) 🏆 Keyword Champion (2) 📈 Trend Setter 🚀 Conference Pioneer ⚡ Prolific Year (14) 🔥 Unstoppable (10) ❓ The Questioner 🗃️ Keyword Collector (107) 💎 Century Club (76)

Conferences

INTERSPEECH (58) AAAI (8) ACL (6) ICML (3) NIPS (3) COLING (1) CVPR (1) EMNLP (1) NAACL (1)

Top co-authors

Zhengqi Wen (41) Jiangyan Yi (32) Bin Liu (21) Ruibo Fu (17) Shuai Zhang (16) Zheng Lian (15) Cunhang Fan (14) Ye Bai (11) Zhengkun Tian (11) Tao Wang (9)

Keywords

speech synthesis (8) attention mechanism (7) fake audio detection (7) audio deepfake detection (6) model compression (6) speech recognition (5) speech emotion recognition (5) large language model (5) continual learning (4) representation learning (4) automatic speech recognition (4) catastrophic forgetting (4) knowledge distillation (4) end-to-end model (3) end-to-end speech recognition (3) text-to-speech synthesis (3) multimodal sentiment analysis (3) bidirectional lstm (3) deep learning (3) deep clustering (3)

Papers

Beyond Examples: Towards Automated Thought-level In-Context Reasoning for Large Language Models ACL 2026
AStar: Boosting Multimodal Reasoning with Automated Structured Thinking AAAI 2026
SPARK: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning ACL 2026
PSA-MF: Personality-Sentiment Aligned Multi-Level Fusion for Multimodal Sentiment Analysis AAAI 2026
ReFL: Reflective Feedback Learning for Hallucination Detection of Large Language Models ACL 2026
Two-Stage Regularization-Based Structured Pruning for LLMs ACL 2026
ImViD: Immersive Volumetric Videos for Enhanced VR Engagement CVPR 2025
RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing EMNLP 2025
AffectGPT: A New Dataset, Model, and Benchmark for Emotion Understanding with Multimodal Large Language Models ICML 2025
Region-Based Optimization in Continual Learning for Audio Deepfake Detection AAAI 2025
OV-MER: Towards Open-Vocabulary Multimodal Emotion Recognition ICML 2025
BSDB-Net: Band-Split Dual-Branch Network with Selective State Spaces Mechanism for Monaural Speech Enhancement AAAI 2025
Code-switching Mediated Sentence-level Semantic Learning AAAI 2025
Pandora’s Box or Aladdin’s Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models ACL 2025
Listen, Watch, and Learn to Feel: Retrieval-Augmented Emotion Reasoning for Compound Emotion Generation ACL 2025
Residual Speaker Representation for One-Shot Voice Conversion INTERSPEECH 2024
TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking INTERSPEECH 2024
Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio INTERSPEECH 2024
Progressive Distillation Based on Masked Generation Feature Method for Knowledge Graph Completion AAAI 2024
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection AAAI 2024
NLoPT: N-gram Enhanced Low-Rank Task Adaptive Pre-training for Efficient Language Model Adaption COLING 2024
DARNet: Dual Attention Refinement Network with Spatiotemporal Construction for Auditory Attention Detection NIPS 2024
Bilateral Masking with Prompt for Knowledge Graph Completion NAACL 2024
PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation INTERSPEECH 2024
Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection INTERSPEECH 2024
Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy INTERSPEECH 2024
Generalized Fake Audio Detection via Deep Stable Learning INTERSPEECH 2024
Prompt Link Multimodal Fusion in Multimodal Sentiment Analysis INTERSPEECH 2024
RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection INTERSPEECH 2024
VRA: Variational Rectified Activation for Out-of-distribution Detection NIPS 2023
ALIM: Adjusting Label Importance Mechanism for Noisy Partial Label Learning NIPS 2023
Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection ICML 2023
SOT: Self-supervised Learning-Assisted Optimal Transport for Unsupervised Adaptive Speech Emotion Recognition INTERSPEECH 2023
TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection INTERSPEECH 2023
EmotionNAS: Two-stream Neural Architecture Search for Speech Emotion Recognition INTERSPEECH 2023
Detection of Cross-Dataset Fake Audio Based on Prosodic and Pronunciation Features INTERSPEECH 2023
Reducing Multilingual Context Confusion for End-to-End Code-Switching Automatic Speech Recognition INTERSPEECH 2022
Speaker Recognition-Assisted Robust Audio Deepfake Detection INTERSPEECH 2022
Continual Learning for Fake Audio Detection INTERSPEECH 2021
Half-Truth: A Partially Fake Audio Detection Dataset INTERSPEECH 2021
TDCA-Net: Time-Domain Channel Attention Network for Depression Detection INTERSPEECH 2021
End-to-End Spelling Correction Conditioned on Acoustic Feature for Code-Switching Speech Recognition INTERSPEECH 2021
FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization INTERSPEECH 2021
Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition INTERSPEECH 2020
Bi-Level Speaker Supervision for One-Shot Speech Synthesis INTERSPEECH 2020
Learning Utterance-Level Representations with Label Smoothing for Speech Emotion Recognition INTERSPEECH 2020
Comparison of Glottal Source Parameter Values in Emotional Vowels INTERSPEECH 2020
Joint Training for Simultaneous Speech Denoising and Dereverberation with Deep Embedding Representations INTERSPEECH 2020
Dynamic Speaker Representations Adjustment and Decoder Factorization for Speaker Adaptation in End-to-End Speech Synthesis INTERSPEECH 2020
ARVC: An Auto-Regressive Voice Conversion System Without Parallel Training Data INTERSPEECH 2020
Hybrid Network Feature Extraction for Depression Assessment from Speech INTERSPEECH 2020
Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition INTERSPEECH 2020
Non-Autoregressive End-to-End TTS with Coarse-to-Fine Decoding INTERSPEECH 2020
Context-Dependent Domain Adversarial Neural Network for Multimodal Emotion Recognition INTERSPEECH 2020
Focal Loss for Punctuation Prediction INTERSPEECH 2020
Spoken Content and Voice Factorization for Few-Shot Speaker Adaptation INTERSPEECH 2020
Conversational Emotion Recognition Using Self-Attention Mechanisms and Graph Neural Networks INTERSPEECH 2020
Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis INTERSPEECH 2020
Gated Recurrent Fusion of Spatial and Spectral Features for Multi-Channel Speech Separation with Deep Embedding Representations INTERSPEECH 2020
ParamE: Regarding Neural Network Parameters as Relation Embeddings for Knowledge Graph Completion AAAI 2020
Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features INTERSPEECH 2019
Automatic Depression Level Detection via ℓp-Norm Pooling INTERSPEECH 2019
Unsupervised Representation Learning with Future Observation Prediction for Speech Emotion Recognition INTERSPEECH 2019
Forward-Backward Decoding for Regularizing End-to-End TTS INTERSPEECH 2019
Conversational Emotion Analysis via Attention Mechanisms INTERSPEECH 2019
A Time Delay Neural Network with Shared Weight Self-Attention for Small-Footprint Keyword Spotting INTERSPEECH 2019
Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition INTERSPEECH 2019
Self-Attention Transducers for End-to-End Speech Recognition INTERSPEECH 2019
Deep Noise Tracking Network: A Hybrid Signal Processing/Deep Learning Approach to Speech Enhancement INTERSPEECH 2018
Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer INTERSPEECH 2018
On the Application and Compression of Deep Time Delay Neural Network for Embedded Statistical Parametric Speech Synthesis INTERSPEECH 2018
Transfer Learning Based Progressive Neural Networks for Acoustic Modeling in Statistical Parametric Speech Synthesis INTERSPEECH 2018
Sparsity-Constrained Weight Mapping for Head-Related Transfer Functions Individualization from Anthropometric Features INTERSPEECH 2018
BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in a Text-to-Speech Front-End INTERSPEECH 2018
Speech Emotion Recognition from Variable-Length Inputs with Triplet Loss Function INTERSPEECH 2018
Investigating Efficient Feature Representation Methods and Training Objective for BLSTM-Based Phone Duration Prediction INTERSPEECH 2017
Distilling Knowledge from an Ensemble of Models for Punctuation Prediction INTERSPEECH 2017
A Domain Knowledge-Assisted Nonlinear Model for Head-Related Transfer Functions Based on Bottleneck Deep Neural Network INTERSPEECH 2017
A Novel Research to Artificial Bandwidth Extension Based on Deep BLSTM Recurrent Neural Networks and Exemplar-Based Sparse Representation INTERSPEECH 2016
A Sparse Spherical Harmonic-Based Model in Subbands for Head-Related Transfer Functions INTERSPEECH 2016
Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach INTERSPEECH 2016
The Parameterized Phoneme Identity Feature as a Continuous Real-Valued Vector for Neural Network Based Speech Synthesis INTERSPEECH 2016