conftrace_

Jinyu Li

59 papers · 2016–2026 · 7 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (26) 🌈 Renaissance Researcher (5) 🐣 Hot Topic Early Bird 🐝 Cross-Pollinator (13) 🏠 Conference Loyalist (46) 🏆 Keyword Champion (2) 🔬 Deep Specialist (10) 🧬 Topic Evolution 🤝 Dynamic Duo (21) 🗃️ Keyword Collector (90) 💎 Century Club (57) 📈 Trend Setter 🚀 Conference Pioneer ⚡ Prolific Year (9) 🔥 Unstoppable (10) ❓ The Questioner

Conferences

INTERSPEECH (46) ACL (6) EMNLP (2) NIPS (2) CVPR (1) ICLR (1) ICML (1)

Papers

Closing the Modality Reasoning Gap for Speech Large Language Models ACL 2026
SpeechLLM-as-Judges: Towards General and Interpretable Speech Quality Evaluation ACL 2026
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation ICLR 2025
Autoregressive Speech Synthesis without Vector Quantization ACL 2025
SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training ACL 2025
Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation ACL 2025
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models ICML 2024
COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning INTERSPEECH 2024
An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS INTERSPEECH 2024
Total-Duration-Aware Duration Modeling for Text-to-Speech Systems INTERSPEECH 2024
A multimodal approach to study the nature of coordinative patterns underlying speech rhythm INTERSPEECH 2024
Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation INTERSPEECH 2024
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations NIPS 2024
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation NIPS 2024
WavLLM: Towards Robust and Adaptive Speech Large Language Model EMNLP 2024
Accurate and Structured Pruning for Efficient Automatic Speech Recognition INTERSPEECH 2023
LAMASSU: A Streaming Language-Agnostic Multilingual Speech Recognition and Translation Model Using Neural Transducers INTERSPEECH 2023
PillarNeXt: Rethinking Network Designs for 3D Object Detection in LiDAR Point Clouds CVPR 2023
Accelerating Transducers through Adjacent Token Merging INTERSPEECH 2023
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training EMNLP 2022
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing ACL 2022
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings INTERSPEECH 2022
Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition INTERSPEECH 2022
Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training INTERSPEECH 2022
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data INTERSPEECH 2022
Large-Scale Streaming End-to-End Speech Translation with Neural Transducers INTERSPEECH 2022
Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition? INTERSPEECH 2022
Streaming Multi-Talker ASR with Token-Level Serialized Output Training INTERSPEECH 2022
Separating Long-Form Speech with Group-wise Permutation Invariant Training INTERSPEECH 2022
On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer INTERSPEECH 2021
Improving Multilingual Transformer Transducer Models by Reducing Language Confusions INTERSPEECH 2021
Improving RNN-T for Domain Scaling Using Semi-Supervised Training with Neural TTS INTERSPEECH 2021
Rapid Speaker Adaptation for Conformer Transducer: Attention and Bias Are All You Need INTERSPEECH 2021
Multiple Softmax Architecture for Streaming Multilingual End-to-End ASR Systems INTERSPEECH 2021
Streaming Multi-Talker Speech Recognition with Joint Speaker Identification INTERSPEECH 2021
A Light-Weight Contextual Spelling Correction Model for Customizing Transducer-Based Speech Recognition Systems INTERSPEECH 2021
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition INTERSPEECH 2021
Ultra Fast Speech Separation Model with Teacher Student Learning INTERSPEECH 2021
Investigation of Practical Aspects of Single Channel Speech Separation for ASR INTERSPEECH 2021
Sequence-Level Self-Learning with Multiple Hypotheses INTERSPEECH 2020
Exploring Transformers for Large-Scale Speech Recognition INTERSPEECH 2020
On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition INTERSPEECH 2020
An End-to-End Architecture of Online Multi-Channel Speech Separation INTERSPEECH 2020
Semantic Mask for Transformer Based End-to-End Speech Recognition INTERSPEECH 2020
Rapid RNN-T Adaptation Using Personalized Speech Synthesis and Neural Language Generator INTERSPEECH 2020
Combination of End-to-End and Hybrid Models for Speech Recognition INTERSPEECH 2020
Low Latency End-to-End Streaming Speech Recognition with a Scout Network INTERSPEECH 2020
Transfer Learning Approaches for Streaming End-to-End Speech Recognition System INTERSPEECH 2020
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability INTERSPEECH 2020
Acoustic-to-Phrase Models for Speech Recognition INTERSPEECH 2019
Layer Trajectory BLSTM INTERSPEECH 2019
Speaker Adaptation for Attention-Based End-to-End Speech Recognition INTERSPEECH 2019
Adversarial Feature-Mapping for Speech Enhancement INTERSPEECH 2018
Improved Training for Online End-to-end Speech Recognition Systems INTERSPEECH 2018
Layer Trajectory LSTM INTERSPEECH 2018
Cycle-Consistent Speech Enhancement INTERSPEECH 2018
Improving Mask Learning Based Speech Enhancement System with Restoration Layers and Residual Connection INTERSPEECH 2017
Large-Scale Domain Adaptation via Teacher-Student Learning INTERSPEECH 2017
Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention INTERSPEECH 2016