Zhengqi Wen
43 papers · 2016–2026 · 6 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (19)
Keyword Pioneer
Renaissance Researcher (5)
Interdisciplinary Bridge
Hot Topic Early Bird
Conference Polyglot (5)
Cross-Pollinator (9)
Conference Loyalist (32)
Dynamic Duo (35)
Topic Evolution
Deep Specialist (13)
Keyword Champion (2)
Trend Setter
Conference Pioneer
Unstoppable (6)
Prolific Year (7)
Century Club (36)
Keyword Collector (59)
Conferences
INTERSPEECH (32)
ACL (5)
AAAI (3)
CVPR (1)
EMNLP (1)
IJCAI (1)
Keywords
speech synthesis (7)
large language model (5)
speech recognition (4)
model compression (4)
text-to-speech synthesis (3)
end-to-end model (3)
speaker representation (3)
attention mechanism (3)
deep clustering (3)
automatic speech recognition (3)
bidirectional lstm (2)
recurrent neural network (2)
one-shot learning (2)
word embedding (2)
deep embedding (2)
gradient boosting decision tree (2)
mathematical reasoning (2)
contrastive learning (2)
voice conversion (2)
in-context learning (2)
Papers
SPARK: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning
ACL 2026
ReFL: Reflective Feedback Learning for Hallucination Detection of Large Language Models
ACL 2026
Calibration-Aware Policy Optimization for Reasoning LLMs
ACL 2026
Two-Stage Regularization-Based Structured Pruning for LLMs
ACL 2026
Beyond Examples: Towards Automated Thought-level In-Context Reasoning for Large Language Models
ACL 2026
AStar: Boosting Multimodal Reasoning with Automated Structured Thinking
AAAI 2026
PSA-MF: Personality-Sentiment Aligned Multi-Level Fusion for Multimodal Sentiment Analysis
AAAI 2026
Code-switching Mediated Sentence-level Semantic Learning
AAAI 2025
M3ANet: Multi-scale and Multi-Modal Alignment Network for Brain-Assisted Target Speaker Extraction
IJCAI 2025
RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing
EMNLP 2025
ImViD: Immersive Volumetric Videos for Enhanced VR Engagement
CVPR 2025
Generalized Fake Audio Detection via Deep Stable Learning
INTERSPEECH 2024
Residual Speaker Representation for One-Shot Voice Conversion
INTERSPEECH 2024
TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking
INTERSPEECH 2024
Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy
INTERSPEECH 2024
Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio
INTERSPEECH 2024
PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation
INTERSPEECH 2024
Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection
INTERSPEECH 2024
End-to-End Spelling Correction Conditioned on Acoustic Feature for Code-Switching Speech Recognition
INTERSPEECH 2021
FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization
INTERSPEECH 2021
Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition
INTERSPEECH 2020
Spoken Content and Voice Factorization for Few-Shot Speaker Adaptation
INTERSPEECH 2020
Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis
INTERSPEECH 2020
Gated Recurrent Fusion of Spatial and Spectral Features for Multi-Channel Speech Separation with Deep Embedding Representations
INTERSPEECH 2020
Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition
INTERSPEECH 2020
Non-Autoregressive End-to-End TTS with Coarse-to-Fine Decoding
INTERSPEECH 2020
Bi-Level Speaker Supervision for One-Shot Speech Synthesis
INTERSPEECH 2020
Joint Training for Simultaneous Speech Denoising and Dereverberation with Deep Embedding Representations
INTERSPEECH 2020
Dynamic Speaker Representations Adjustment and Decoder Factorization for Speaker Adaptation in End-to-End Speech Synthesis
INTERSPEECH 2020
ARVC: An Auto-Regressive Voice Conversion System Without Parallel Training Data
INTERSPEECH 2020
Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features
INTERSPEECH 2019
Self-Attention Transducers for End-to-End Speech Recognition
INTERSPEECH 2019
Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition
INTERSPEECH 2019
A Time Delay Neural Network with Shared Weight Self-Attention for Small-Footprint Keyword Spotting
INTERSPEECH 2019
Forward-Backward Decoding for Regularizing End-to-End TTS
INTERSPEECH 2019
Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer
INTERSPEECH 2018
On the Application and Compression of Deep Time Delay Neural Network for Embedded Statistical Parametric Speech Synthesis
INTERSPEECH 2018
Transfer Learning Based Progressive Neural Networks for Acoustic Modeling in Statistical Parametric Speech Synthesis
INTERSPEECH 2018
BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in a Text-to-Speech Front-End
INTERSPEECH 2018
Distilling Knowledge from an Ensemble of Models for Punctuation Prediction
INTERSPEECH 2017
Investigating Efficient Feature Representation Methods and Training Objective for BLSTM-Based Phone Duration Prediction
INTERSPEECH 2017
Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach
INTERSPEECH 2016
The Parameterized Phoneme Identity Feature as a Continuous Real-Valued Vector for Neural Network Based Speech Synthesis
INTERSPEECH 2016