conftrace_

Hung-yi Lee

142 papers · 2016–2026 · 11 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (31) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🐣 Hot Topic Early Bird 🌍 Conference Polyglot (11) 🏠 Conference Loyalist (23) 🌟 Keyword Trendsetter Combo (3) 🤝 Dynamic Duo (17) 🧬 Topic Evolution 🏆 Keyword Champion (2) 🏆 Grand Slam 👥 Mega-Team (76) 🔬 Deep Specialist (26) 📈 Trend Setter 🚀 Conference Pioneer 🔥 Unstoppable (10) ❓ The Questioner (8) 💎 Century Club (137) 🗃️ Keyword Collector (84) ⚡ Prolific Year (32)

Conferences

INTERSPEECH (60) ACL (27) EMNLP (23) IJCNLP (9) NAACL (8) NIPS (4) AAAI (3) ICML (3) EACL (2) ICLR (2) AACL (1)

Top co-authors

Cheng-Han Chiang (18) Shang-Wen Li (15) Yun-Nung Chen (14) Lin-shan Lee (12) Guan-Ting Lin (11) Yung-Sung Chuang (10) Haibin Wu (10) Shinji Watanabe (9) Shu-wen Yang (8) Kai-Wei Chang (6)

Research topics

Resources & Methods (1) Speech & Audio (1) Privacy (1) Education (1)

Keywords

self-supervised learning (20) large language model (19) transfer learning (14) automatic speech recognition (12) speech processing (10) domain adaptation (9) speech recognition (8) generative adversarial network (8) speech synthesis (7) unsupervised learning (7) speaker verification (6) representation learning (6) voice conversion (6) model merging (5) speech representation (5) few-shot learning (5) one-shot learning (4) spoken language understanding (4) question answering (4) model compression (4)

Papers

Full-Duplex-Bench-v2: A Multi-Turn Evaluation Framework for Duplex Dialogue Systems with an Automated Examiner ACL 2026
CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks ACL 2026
An Exploration of Mamba for Speech Self-Supervised Models ACL 2026
Shanks: Simultaneous Hearing and Thinking for Spoken Language Models ACL 2026
BILLY: Steering Large Language Models via Merging Persona Vectors for Creative Generation EACL 2026
Hierarchical Speculative Decoding with Dynamic Window NAACL 2025
Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging EMNLP 2025
Gender Bias in Instruction-Guided Speech Synthesis Models NAACL 2025
Generative Audio Language Modeling with Continuous-valued Tokens and Masked Next-Token Prediction ICML 2025
Towards Holistic Evaluation of Large Audio-Language Models: A Comprehensive Survey EMNLP 2025
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks ICLR 2025
Creativity in LLM-based Multi-Agent Systems: A Survey EMNLP 2025
IMPACT: Iterative Mask-based Parallel Decoding for Text-to-Audio Generation with Diffusion Modeling ICML 2025
TRACT: Regression-Aware Fine-tuning Meets Chain-of-Thought Reasoning for LLM-as-a-Judge ACL 2025
Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback ACL 2025
Transferring Textual Preferences to Vision-Language Understanding through Model Merging ACL 2025
InstructionCP: A Simple yet Effective Approach for Transferring Large Language Models to Target Languages ACL 2025
Audio-Aware Large Language Models as Judges for Speaking Styles EMNLP 2025
Merging Facts, Crafting Fallacies: Evaluating the Contradictory Nature of Aggregated Factual Claims in Long-Form Generations ACL 2024
Codec-SUPERB: An In-Depth Analysis of Sound Codec Models ACL 2024
On the Evaluation of Speech Foundation Models for Spoken Language Understanding ACL 2024
Over-Reasoning and Redundant Calculation of Large Language Models EACL 2024
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR NIPS 2024
GSQA: An End-to-End Model for Generative Spoken Question Answering INTERSPEECH 2024
I Need Help! Evaluating LLM’s Ability to Ask for Users’ Support: A Case Study on Text-to-SQL Generation EMNLP 2024
Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course EMNLP 2024
Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition EMNLP 2024
DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging EMNLP 2024
Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech EMNLP 2024
Let Me Speak Freely? A Study On The Impact Of Format Restrictions On Large Language Model Performance. EMNLP 2024
Can LLMs Understand the Implication of Emphasized Sentences in Dialogue? EMNLP 2024
Do Metadata and Appearance of the Retrieved Webpages Affect LLM’s Reasoning in Retrieval-Augmented Generation? EMNLP 2024
Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses EMNLP 2024
Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration NIPS 2024
Exploring In-Context Learning of Textless Speech Language Model for Speech Classification Tasks INTERSPEECH 2024
Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models INTERSPEECH 2024
DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment INTERSPEECH 2024
DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models INTERSPEECH 2024
Emo-bias: A Large Scale Evaluation of Social Bias on Speech Emotion Recognition INTERSPEECH 2024
On the social bias of speech self-supervised models INTERSPEECH 2024
Singing Voice Graph Modeling for SingFake Detection INTERSPEECH 2024
Systematic Analysis for Pretrained Language Model Priming for Parameter-Efficient Fine-tuning NAACL 2024
StreamBench: Towards Benchmarking Continuous Improvement of Language Agents NIPS 2024
Neural Codec-based Adversarial Sample Detection for Speaker Verification INTERSPEECH 2024
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets INTERSPEECH 2024
CodecFake: Enhancing Anti-Spoofing Models Against Deepfake Audios from Codec-Based Speech Synthesis Systems INTERSPEECH 2024
Dataset-Distillation Generative Model for Speech Emotion Recognition INTERSPEECH 2024
Parameter-efficient Fine-tuning of Speaker-Aware Dynamic Prompts for Speaker Verification INTERSPEECH 2024
Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations ACL 2024
Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages ACL 2024
How to Estimate Model Transferability of Pre-Trained Speech Models? INTERSPEECH 2023
SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks ACL 2023
Introducing Semantics into Speech Encoders ACL 2023
Can Large Language Models Be an Alternative to Human Evaluations? ACL 2023
Are Synonym Substitution Attacks Really Synonym Substitution Attacks? ACL 2023
Position Matters! Empirical Study of Order Effect in Knowledge-grounded Dialogue ACL 2023
Revealing the Blind Spot of Sentence Encoder Evaluation by HEROS ACL 2023
A Closer Look into Using Large Language Models for Automatic Evaluation EMNLP 2023
Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs ICML 2023
Why We Should Report the Details in Subjective Evaluation of TTS More Rigorously INTERSPEECH 2023
Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target INTERSPEECH 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark INTERSPEECH 2023
Anticipation-Free Training for Simultaneous Machine Translation ACL 2022
Self-supervised Representation Learning for Speech Processing NAACL 2022
Recent Advances in Pre-trained Language Models: Why Do They Work and How Do They Work IJCNLP 2022
Meta Learning for Natural Language Processing: A Survey NAACL 2022
XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding ACL 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities ACL 2022
Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation INTERSPEECH 2022
DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering INTERSPEECH 2022
Membership Inference Attacks Against Self-supervised Speech Models INTERSPEECH 2022
An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks INTERSPEECH 2022
Few Shot Cross-Lingual TTS Using Transferable Phoneme Embedding INTERSPEECH 2022
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores INTERSPEECH 2022
Spoofing-Aware Speaker Verification by Multi-Level Fusion INTERSPEECH 2022
Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation for Automatic Speech Recognition INTERSPEECH 2022
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation INTERSPEECH 2022
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification INTERSPEECH 2022
On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets AAAI 2022
Recent Advances in Pre-trained Language Models: Why Do They Work and How Do They Work AACL 2022
AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks NAACL 2022
Multi-accent Speech Separation with One Shot Learning ACL 2021
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations INTERSPEECH 2021
Meta Learning and Its Applications to Natural Language Processing IJCNLP 2021
Investigating the Reordering Capability in CTC-based Non-Autoregressive End-to-End Speech Translation IJCNLP 2021
Multi-accent Speech Separation with One Shot Learning IJCNLP 2021
Mitigating Biases in Toxic Language Detection through Invariant Rationalization IJCNLP 2021
SUPERB: Speech Processing Universal PERformance Benchmark INTERSPEECH 2021
Towards Lifelong Learning of End-to-End ASR INTERSPEECH 2021
Put Chatbot into Its Interlocutor’s Shoes: New Framework to Learn Chatbot Responding with Intention NAACL 2021
Utilizing Self-Supervised Representations for MOS Prediction INTERSPEECH 2021
Stabilizing Label Assignment for Speech Separation by Self-Supervised Pre-Training INTERSPEECH 2021
Mitigating Biases in Toxic Language Detection through Invariant Rationalization ACL 2021
Auto-KWS 2021 Challenge: Task, Datasets, and Baselines INTERSPEECH 2021
Multi-modal User Intent Classification Under the Scenario of Smart Factory (Student Abstract) AAAI 2021
Unsupervised Multiple Choices Question Answering: Start Learning from Basic Knowledge EMNLP 2021
Is BERT a Cross-Disciplinary Knowledge Learner? A Surprising Finding of Pre-trained Models’ Transferability EMNLP 2021
Voting for the Right Answer: Adversarial Defense for Speaker Verification INTERSPEECH 2021
Meta Learning and Its Applications to Natural Language Processing ACL 2021
Investigating the Reordering Capability in CTC-based Non-Autoregressive End-to-End Speech Translation ACL 2021
WG-WaveNet: Real-Time High-Fidelity Speech Synthesis Without GPU INTERSPEECH 2020
Order-Free Learning Alleviating Exposure Bias in Multi-Label Classification AAAI 2020
Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation ACL 2020
Pretrained Language Model Embryology: The Birth of ALBERT EMNLP 2020
LAMOL: LAnguage MOdeling for Lifelong Language Learning ICLR 2020
TaylorGAN: Neighbor-Augmented Policy Update Towards Sample-Efficient Natural Language Generation NIPS 2020
DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation INTERSPEECH 2020
Semi-Supervised Learning for Multi-Speaker Text-to-Speech Synthesis Using Discrete Speech Representation INTERSPEECH 2020
Defense for Black-Box Attacks on Anti-Spoofing Models by Self-Supervised Learning INTERSPEECH 2020
Understanding Self-Attention of Self-Supervised Audio Transformers INTERSPEECH 2020
SpeechBERT: An Audio-and-Text Jointly Learned Language Model for End-to-End Spoken Question Answering INTERSPEECH 2020
VQVC+: One-Shot Voice Conversion by Vector Quantization and U-Net Architecture INTERSPEECH 2020
Tree Transformer: Integrating Tree Structures into Self-Attention IJCNLP 2019
Polly Want a Cracker: Analyzing Performance of Parroting on Paraphrase Generation Datasets EMNLP 2019
Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model EMNLP 2019
Completely Unsupervised Phoneme Recognition by a Generative Adversarial Network Harmonized with Iteratively Refined Hidden Markov Models INTERSPEECH 2019
Improved Speech Separation with Time-and-Frequency Cross-Domain Joint Embedding and Clustering INTERSPEECH 2019
Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion INTERSPEECH 2019
Generative Adversarial Networks for Unpaired Voice Transformation on Impaired Speech INTERSPEECH 2019
One-Shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization INTERSPEECH 2019
Code-Switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation INTERSPEECH 2019
Personalized Dialogue Response Generation Learned from Monologues INTERSPEECH 2019
Tree Transformer: Integrating Tree Structures into Self-Attention EMNLP 2019
DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs EMNLP 2019
Polly Want a Cracker: Analyzing Performance of Parroting on Paraphrase Generation Datasets IJCNLP 2019
Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model IJCNLP 2019
DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs IJCNLP 2019
Noise Adaptive Speech Enhancement Using Domain Adversarial Training INTERSPEECH 2019
End-to-End Text-to-Speech for Low-Resource Languages by Cross-Lingual Transfer Learning INTERSPEECH 2019
Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings INTERSPEECH 2018
Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension INTERSPEECH 2018
Joint Learning of Interactive Spoken Content Retrieval and Trainable User Simulator INTERSPEECH 2018
Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations INTERSPEECH 2018
Supervised and Unsupervised Transfer Learning for Question Answering NAACL 2018
Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks EMNLP 2018
Learning Chinese Word Representations From Glyphs Of Characters EMNLP 2017
Gate Activation Signal Analysis for Gated Recurrent Neural Networks and its Correlation with Phoneme Boundaries INTERSPEECH 2017
Order-Preserving Abstractive Summarization for Spoken Content Based on Connectionist Temporal Classification INTERSPEECH 2017
Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder INTERSPEECH 2016
Interactive Spoken Content Retrieval by Deep Reinforcement Learning INTERSPEECH 2016
Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection INTERSPEECH 2016
Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine INTERSPEECH 2016