conftrace_

Helen Meng

117 papers · 2005–2026 · 10 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (39) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🐣 Hot Topic Early Bird 🏠 Conference Loyalist (91) 🤝 Dynamic Duo (43) 🏆 Keyword Champion (6) 👥 Mega-Team (20) 🔬 Deep Specialist (18) 📈 Trend Setter 🚀 Conference Pioneer ⚡ Prolific Year (5) 🔥 Unstoppable (11) ❓ The Questioner 🗃️ Keyword Collector (130) 💎 Century Club (116)

Conferences

INTERSPEECH (91) ACL (6) EMNLP (6) NAACL (5) AAAI (2) IJCAI (2) NIPS (2) COLING (1) IJCNLP (1) SEMEVAL (1)

Top co-authors

Zhiyong Wu (44) Xunying Liu (40) Xixin Wu (37) Shoukang Hu (17) Mengzhe Geng (14) Xurong Xie (14) Jianwei Yu (13) Shiyin Kang (12) Jia Jia (11) Shansong Liu (11)

Keywords

automatic speech recognition (13) speech recognition (13) text-to-speech synthesis (10) voice conversion (10) speaker adaptation (10) language model (8) speech synthesis (8) large language model (7) recurrent neural network (7) speaker verification (6) speaker embedding (6) domain adaptation (6) data augmentation (5) long short-term memory (5) disordered speech (5) deep neural network (5) dysarthric speech (5) unsupervised learning (4) bayesian inference (4) speech separation (4)

Papers

DualSpeechLM: Towards Unified Speech Understanding and Generation via Dual Speech Token Modeling with Large Language Models AAAI 2026
Large Language Model-based fMRI Encoding of Language Functions for Subjects with Neurocognitive Disorder INTERSPEECH 2024
Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask INTERSPEECH 2024
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning NAACL 2024
Rethinking Machine Ethics – Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? NAACL 2024
SongCreator: Lyrics-based Universal Song Generation NIPS 2024
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation ACL 2024
COKE: A Cognitive Knowledge Graph for Machine Theory of Mind ACL 2024
Seamless Language Expansion: Enhancing Multilingual Mastery in Self-Supervised Models INTERSPEECH 2024
LoRA-MER: Low-Rank Adaptation of Pre-Trained Speech Models for Multimodal Emotion Recognition Using Mutual Information INTERSPEECH 2024
Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System INTERSPEECH 2024
SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models INTERSPEECH 2024
CoLM-DSR: Leveraging Neural Codec Language Modeling for Multi-Modal Dysarthric Speech Reconstruction INTERSPEECH 2024
UniAudio 1.5: Large Language Model-Driven Audio Codec is A Few-Shot Audio Task Learner NIPS 2024
SimCalib: Graph Neural Network Calibration Based on Similarity between Nodes AAAI 2024
Prompting Large Language Models with Mispronunciation Detection and Diagnosis Abilities INTERSPEECH 2024
Parameter-efficient Fine-tuning of Speaker-Aware Dynamic Prompts for Speaker Verification INTERSPEECH 2024
Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models INTERSPEECH 2024
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition INTERSPEECH 2024
Search Augmented Instruction Learning EMNLP 2023
ConvRGX: Recognition, Generation, and Extraction for Self-trained Conversational Question Answering ACL 2023
SGP-TOD: Building Task Bots Effortlessly via Schema-Guided LLM Prompting EMNLP 2023
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation INTERSPEECH 2023
Integrated and Enhanced Pipeline System to Support Spoken Language Analytics for Screening Neurocognitive Disorders INTERSPEECH 2023
Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition INTERSPEECH 2023
On-the-Fly Feature Based Rapid Speaker Adaptation for Dysarthric and Elderly Speech Recognition INTERSPEECH 2023
PunCantonese: A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts INTERSPEECH 2023
Exploiting Cross-Domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition INTERSPEECH 2023
SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge INTERSPEECH 2023
Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis INTERSPEECH 2023
Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator INTERSPEECH 2023
Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model INTERSPEECH 2023
On Controlling Fallback Responses for Grounded Dialogue Generation ACL 2022
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification INTERSPEECH 2022
Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information INTERSPEECH 2022
Speech Enhancement with Fullband-Subband Cross-Attention Network INTERSPEECH 2022
A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS INTERSPEECH 2022
A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS INTERSPEECH 2022
Context-aware Multimodal Fusion for Emotion Recognition INTERSPEECH 2022
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus INTERSPEECH 2022
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion INTERSPEECH 2022
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis INTERSPEECH 2022
Confidence Score Based Conformer Speaker Adaptation for Speech Recognition INTERSPEECH 2022
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems INTERSPEECH 2022
Exploring linguistic feature and model combination for speech recognition based automatic AD detection INTERSPEECH 2022
Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information INTERSPEECH 2022
Spoofing-Aware Speaker Verification by Multi-Level Fusion INTERSPEECH 2022
Conformer Based Elderly Speech Recognition System for Alzheimer’s Disease Detection INTERSPEECH 2022
Enhancing Word-Level Semantic Representation via Dependency Structure for Expressive Text-to-Speech Synthesis INTERSPEECH 2022
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis INTERSPEECH 2022
Towards Cross-speaker Reading Style Transfer on Audiobook Dataset INTERSPEECH 2022
CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis INTERSPEECH 2022
Unsupervised Multi-scale Expressive Speaking Style Modeling with Hierarchical Context Information for Audiobook Speech Synthesis COLING 2022
Partner Personas Generation for Dialogue Response Generation NAACL 2022
Grounded Dialogue Generation with Cross-encoding Re-ranker, Grounding Span Prediction, and Passage Dropout ACL 2022
Towards Identifying Social Bias in Dialog Systems: Framework, Dataset, and Benchmark EMNLP 2022
COLD: A Benchmark for Chinese Offensive Language Detection EMNLP 2022
Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion INTERSPEECH 2021
Bayesian Parametric and Architectural Domain Adaptation of LF-MMI Trained TDNNs for Elderly and Dysarthric Speech Recognition INTERSPEECH 2021
Adversarially Learning Disentangled Speech Representations for Robust Multi-Factor Voice Conversion INTERSPEECH 2021
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-Shot Voice Conversion INTERSPEECH 2021
Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization INTERSPEECH 2021
VAENAR-TTS: Variational Auto-Encoder Based Non-AutoRegressive Text-to-Speech Synthesis INTERSPEECH 2021
Transformer Based End-to-End Mispronunciation Detection and Diagnosis INTERSPEECH 2021
Channel-Wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks INTERSPEECH 2021
Towards Multi-Scale Style Control for Expressive Speech Synthesis INTERSPEECH 2021
Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition INTERSPEECH 2021
Adversarial Data Augmentation for Disordered Speech Recognition INTERSPEECH 2021
Speech-XLNet: Unsupervised Acoustic Model Pretraining for Self-Attention Networks INTERSPEECH 2020
Transferring Source Style in Non-Parallel Voice Conversion INTERSPEECH 2020
Group Gated Fusion on Attention-Based Bidirectional Alignment for Multimodal Emotion Recognition INTERSPEECH 2020
SpecSwap: A Simple Data Augmentation Method for End-to-End Speech Recognition INTERSPEECH 2020
Investigation of Data Augmentation Techniques for Disordered Speech Recognition INTERSPEECH 2020
Exploiting Cross-Domain Visual Feature Generation for Disordered Speech Recognition INTERSPEECH 2020
Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification INTERSPEECH 2020
Re-Weighted Interval Loss for Handling Data Imbalance Problem of End-to-End Keyword Spotting INTERSPEECH 2020
Speaker-Aware Linear Discriminant Analysis in Speaker Verification INTERSPEECH 2020
Enhancing Monotonicity for Robust Autoregressive Transformer TTS INTERSPEECH 2020
Audio-Visual Multi-Channel Recognition of Overlapped Speech INTERSPEECH 2020
The CUHK Dysarthric Speech Recognition Systems for English and Cantonese INTERSPEECH 2019
Exploiting Visual Features Using Bayesian Gated Neural Networks for Disordered Speech Recognition INTERSPEECH 2019
On the Use of Pitch Features for Disordered Speech Recognition INTERSPEECH 2019
Towards Discriminative Representation Learning for Speech Emotion Recognition IJCAI 2019
Knowledge-Based Linguistic Encoding for End-to-End Mandarin Text-to-Speech Synthesis INTERSPEECH 2019
One-Shot Voice Conversion with Global Speaker Embeddings INTERSPEECH 2019
Jointly Trained Conversion Model and WaveNet Vocoder for Non-Parallel Voice Conversion Using Mel-Spectrograms and Phonetic Posteriorgrams INTERSPEECH 2019
Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-Trained BERT INTERSPEECH 2019
Extract, Adapt and Recognize: An End-to-End Neural Network for Corrupted Monaural Speech Recognition INTERSPEECH 2019
LF-MMI Training of Bayesian and Gaussian Process Time Delay Neural Networks for Speech Recognition INTERSPEECH 2019
Unsupervised Methods for Audio Classification from Lecture Discussion Recordings INTERSPEECH 2019
Comparative Study of Parametric and Representation Uncertainty Modeling for Recurrent Neural Network Language Models INTERSPEECH 2019
Siamese Recurrent Auto-Encoder Representation for Query-by-Example Spoken Term Detection INTERSPEECH 2018
Emotion Recognition from Variable-Length Speech Segments Using Deep Learning on Spectrograms INTERSPEECH 2018
Development of the CUHK Dysarthric Speech Recognition System for the UA Speech Corpus INTERSPEECH 2018
Speech and Language Processing for Learning and Wellbeing INTERSPEECH 2018
Voice Conversion Across Arbitrary Speakers Based on a Single Target-Speaker Utterance INTERSPEECH 2018
Gaussian Process Neural Networks for Speech Recognition INTERSPEECH 2018
Rapid Style Adaptation Using Residual Error Embedding for Expressive Speech Synthesis INTERSPEECH 2018
Unsupervised Discovery of Non-native Phonetic Patterns in L2 English Speech for Mispronunciation Detection and Diagnosis INTERSPEECH 2018
Detection of Glottal Closure Instants from Speech Signals: A Convolutional Neural Network Based Method INTERSPEECH 2018
Speech Emotion Recognition with Emotion-Pair Based Framework Considering Emotion Distribution Information in Dimensional Emotion Space INTERSPEECH 2017
DNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances INTERSPEECH 2017
Spectro-Temporal Modelling with Time-Frequency LSTM and Structured Output Layer for Voice Conversion INTERSPEECH 2017
Multi-Task Learning for Prosodic Structure Generation Using BLSTM RNN with Structured Output Layer INTERSPEECH 2017
Personalized, Cross-Lingual TTS Using Phonetic Posteriorgrams INTERSPEECH 2016
Phoneme Embedding and its Application to Speech Driven Talking Avatar Synthesis INTERSPEECH 2016
Expressive Speech Driven Talking Avatar Synthesis with DBLSTM Using Limited Amount of Emotional Bimodal Data INTERSPEECH 2016
Combining CNN and BLSTM to Extract Textual and Acoustic Features for Recognizing Stances in Mandarin Ideological Debate Competition INTERSPEECH 2016
Analysis on Gated Recurrent Unit Based Question Detection Approach INTERSPEECH 2016
Fine-grained Opinion Mining with Recurrent Neural Networks and Word Embeddings EMNLP 2015
Modelling High-Dimensional Sequences with LSTM-RTRBM: Application to Polyphonic Music Generation IJCAI 2015
SeemGo: Conditional Random Fields Labeling and Maximum Entropy Classification for Aspect Based Sentiment Analysis SEMEVAL 2014
Automatic Story Segmentation using a Bayesian Decision Framework for Statistical Models of Lexical Chain Features IJCNLP 2009
Automatic Story Segmentation using a Bayesian Decision Framework for Statistical Models of Lexical Chain Features ACL 2009
Combined Use of Speaker- and Tone-Normalized Pitch Reset with Pause Duration for Automatic Story Segmentation in Mandarin Broadcast News NAACL 2007
A Maximum Entropy Framework that Integrates Word Dependencies and Grammatical Relations for Reading Comprehension NAACL 2006
The Use of Metadata, Web-derived Answer Patterns and Passage Context to Improve Reading Comprehension Performance EMNLP 2005