conftrace_

Xie Chen

50 papers · 2016–2026 · 8 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (17) 🌈 Renaissance Researcher (6) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (8) 🏃 Academic Marathon (9) 🏠 Conference Loyalist (22) 🔬 Deep Specialist (12) 🧬 Topic Evolution 🏆 Keyword Champion (4) 🤝 Dynamic Duo (22) 🚀 Conference Pioneer ⚡ Prolific Year (14) 💎 Century Club (42) 🗃️ Keyword Collector (54) 🔥 Unstoppable (5)

Conferences

INTERSPEECH (22) ACL (13) AAAI (7) EMNLP (3) ICML (2) ICCV (1) IJCAI (1) NAACL (1)

Papers

MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows · ACL 2026
Less Languages, Less Tokens: An Efficient Unified Logic Cross-lingual Chain-of-Thought Reasoning Framework · ACL 2026
FineLAP: Taming Heterogeneous Supervision for Fine-grained Language-Audio Pretraining · ACL 2026
Evaluating the Expressive Appropriateness of Speech in Rich Contexts · ACL 2026
Towards Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training · ACL 2026
SAC: Neural Speech Codec with Semantic-Acoustic Dual-Stream Quantization · ACL 2026
WaveEx: Accelerating Flow Matching-based Speech Generation via Wavelet-guided Extrapolation · AAAI 2026
AHAMask: Reliable Task Specification for Large Audio Language Models Without Instructions · AAAI 2026
SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation · ACL 2025
Towards Reliable Large Audio Language Model · ACL 2025
SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training · ACL 2025
Enhancing Speech-to-Speech Dialogue Modeling with End-to-End Retrieval-Augmented Generation · EMNLP 2025
URO-Bench: Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models · EMNLP 2025
Bitrate-Controlled Diffusion for Disentangling Motion and Content in Video · ICCV 2025
MUZO: Leveraging Multiple Queries and Momentum for Zeroth-Order Fine-Tuning of Large Language Models · EMNLP 2025
VQTalker: Towards Multilingual Talking Avatars Through Facial Motion Tokenization · AAAI 2025
Language Model Can Listen While Speaking · AAAI 2025
Speech Recognition Meets Large Language Model: Benchmarking, Models, and Exploration · AAAI 2025
ELLA-V: Stable Neural Codec Language Modeling with Alignment-Guided Sequence Reordering · AAAI 2025
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement · ACL 2025
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching · ACL 2025
Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning · ACL 2025
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation · ACL 2024
AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection · INTERSPEECH 2024
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer · INTERSPEECH 2024
Improved Factorized Neural Transducer Model For Text-only Domain Adaptation · INTERSPEECH 2024
EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark · INTERSPEECH 2024
MaLa-ASR: Multimedia-Assisted LLM-Based ASR · INTERSPEECH 2024
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units · INTERSPEECH 2024
LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR · INTERSPEECH 2024
On the Effectiveness of Acoustic BPE in Decoder-Only TTS · INTERSPEECH 2024
TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers · INTERSPEECH 2024
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer · IJCAI 2024
BAT: Learning to Reason about Spatial Sounds with Large Language Models · ICML 2024
UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding · AAAI 2024
Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems · INTERSPEECH 2023
Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition · INTERSPEECH 2023
Blank-regularized CTC for Frame Skipping in Neural Transducer · INTERSPEECH 2023
Improving Code-Switching and Name Entity Recognition in ASR with Speech Editing based Data Augmentation · INTERSPEECH 2023
Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation · INTERSPEECH 2023
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets · INTERSPEECH 2023
DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech · INTERSPEECH 2023
VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature · INTERSPEECH 2022
Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition · INTERSPEECH 2022
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition · INTERSPEECH 2021
Improving RNN-T for Domain Scaling Using Semi-Supervised Training with Neural TTS · INTERSPEECH 2021
Memory-Efficient Pipeline-Parallel DNN Training · ICML 2021
The Effect of Adding Authorship Knowledge in Automated Text Scoring · NAACL 2018
Active Memory Networks for Language Modeling · INTERSPEECH 2018
Multi-Language Neural Network Language Models · INTERSPEECH 2016