Yi Ren

65 papers · 2017–2025 · 10 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (10) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (18) 🧭 Keyword Pioneer 🌈 Renaissance Researcher (6) 🔬 Deep Specialist (15) 🤝 Dynamic Duo (35) 👑 Triple Crown 🏆 Grand Slam 🗃️ Keyword Collector (242) ⚡ Prolific Year (9) 📈 Trend Setter 💎 Century Club (65) 🔥 Unstoppable (7)

Conferences

ICLR (15) ACL (11) NIPS (11) AAAI (8) IJCAI (7) ICML (5) INTERSPEECH (5) AISTATS (1) L4DC (1) NAACL (1)

Top co-authors

Zhou Zhao (35) Jinglin Liu (22) Rongjie Huang (14) Xu Tan (12) Xiang Yin (11) Tao Qin (11) Tie-yan Liu (10) Zhenhui Ye (10) Jinzheng He (9) Ziyue Jiang (9)

Research topics

Probability (1)

Keywords

speech synthesis (14) prosody modeling (7) diffusion model (7) text to speech (5) generative adversarial network (4) neural machine translation (4) generative model (3) knowledge distillation (3) variational autoencoder (3) speech generation (3) neural network (3) second-order optimization (2) automatic speech recognition (2) contrastive learning (2) connectionist temporal classification (2) normalizing flow (2) multimodal learning (2) emotion recognition (2) attention mechanism (2) adversarial learning (2)

Papers

Chain-Talker: Chain Understanding and Rendering for Empathetic Conversational Speech Synthesis · ACL 2025
Learning Dynamics of LLM Finetuning · ICLR 2025
Hacking Task Confounder in Meta-Learning · IJCAI 2024
State-Constrained Zero-Sum Differential Games with One-Sided Information · ICML 2024
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis · ICLR 2024
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis · ICLR 2024
Pontryagin neural operator for solving general-sum differential games with parametric state constraints · L4DC 2024
lpNTK: Better Generalisation with Less Data via Sample Interaction During Learning · ICLR 2024
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes · NIPS 2024
Bias Amplification in Language Model Evolution: An Iterated Learning Perspective · NIPS 2024
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling · AAAI 2024
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head · AAAI 2024
AMD: Autoregressive Motion Diffusion · AAAI 2024
Improving Compositional Generalization using Iterated Learning and Simplicial Embeddings · NIPS 2023
Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective · NIPS 2023
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation · ACL 2023
CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training · ACL 2023
FastDiff 2: Revisiting and Incorporating GANs and Diffusion Models in High-Fidelity Speech Synthesis · ACL 2023
Prosody-TTS: Improving Prosody with Masked Autoencoder and Conditional Diffusion Model For Expressive Text-to-Speech · ACL 2023
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models · ACL 2023
A Mini-Block Fisher Method for Deep Neural Networks · AISTATS 2023
How to prepare your task head for finetuning · ICLR 2023
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation · ICLR 2023
Bag of Tricks for Unsupervised Text-to-Speech · ICLR 2023
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis · ICLR 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models · ICML 2023
Attributing Image Generative Models using Latent Fingerprints · ICML 2023
FREDIS: A Fusion Framework of Refinement and Disambiguation for Unreliable Partial Label Learning · ICML 2023
StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation · INTERSPEECH 2023
GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech · INTERSPEECH 2023
M4Singer: A Multi-Style, Multi-Singer and Musical Score Provided Mandarin Singing Corpus · NIPS 2022
Flow-Based Unconstrained Lip to Speech Generation · AAAI 2022
Expressivity of Emergent Languages is a Trade-off between Contextual Complexity and Unpredictability · ICLR 2022
Parallel and High-Fidelity Text-to-Lip Generation · AAAI 2022
A Study of Syntactic Multi-Modality in Non-Autoregressive Machine Translation · NAACL 2022
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism · AAAI 2022
Better Supervisory Signals by Observing Learning Paths · ICLR 2022
Pseudo Numerical Methods for Diffusion Models on Manifolds · ICLR 2022
EditSinger: Zero-Shot Text-Based Singing Voice Editing System with Diverse Prosody Modeling · IJCAI 2022
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech · IJCAI 2022
Revisiting Over-Smoothness in Text to Speech · ACL 2022
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis · IJCAI 2022
Learning the Beauty in Songs: Neural Singing Voice Beautifier · ACL 2022
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech · NIPS 2022
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech · NIPS 2022
SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint · AAAI 2021
Decentralized Attribution of Generative Models · ICLR 2021
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech · ICLR 2021
PortaSpeech: Portable and High-Quality Generative Text-to-Speech · NIPS 2021
WSRGlow: A Glow-Based Waveform Generative Model for Audio Super-Resolution · INTERSPEECH 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model · INTERSPEECH 2021
UWSpeech: Speech to Speech Translation for Unwritten Languages · AAAI 2021
Tensor Normal Training for Deep Learning Models · NIPS 2021
FedSpeech: Federated Text-to-Speech with Continual Learning · IJCAI 2021
A Study of Non-autoregressive Model for Sequence Generation · ACL 2020
SimulSpeech: End-to-End Simultaneous Speech to Text Translation · ACL 2020
Deep Blue Sonics’ Submission to IWSLT 2020 Open Domain Translation Task · ACL 2020
MultiSpeech: Multi-Speaker Text to Speech with Transformer · INTERSPEECH 2020
Compositional languages emerge in a neural iterated learning model · ICLR 2020
Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation · IJCAI 2020
Practical Quasi-Newton Methods for Training Deep Neural Networks · NIPS 2020
FastSpeech: Fast, Robust and Controllable Text to Speech · NIPS 2019
Almost Unsupervised Text to Speech and Automatic Speech Recognition · ICML 2019
Multilingual Neural Machine Translation with Knowledge Distillation · ICLR 2019
Sense Beauty by Label Distribution Learning · IJCAI 2017