Siyuan Feng

25 papers · 2017–2026 · 8 top CS/AI conferences

Achievements

🌍 Conference Polyglot (8) 🗺️ Taxonomy Completionist (10) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (8)

🐝 Cross-Pollinator (12) 🌈 Renaissance Researcher (9) 🏆 Keyword Champion (3) 🗃️ Keyword Collector (119) 💎 Century Club (24) ⚡ Prolific Year (5) 🔥 Unstoppable (9)

Conferences

INTERSPEECH (11) RSS (5) ACL (3) NIPS (2) CORL (1) ICLR (1) ICML (1) OSDI (1)

Top co-authors

Tan Lee (6) Benjamin Burchfiel (5) Shuran Song (5) Eric Cousineau (5) Yuxuan Wang (4) Cheng Chi (4) Odette Scharenborg (3) Rui Xia (3) Zhenjia Xu (3) Ming Tu (3)

Research topics

Robotics (1)

Keywords

automatic speech recognition (6) low-resource language (3) phoneme recognition (3) subword modeling (3) self-supervised learning (3) policy learning (2) multi-task learning (2) diffusion model (2) phonetic representation (2) factorized hierarchical variational autoencoder (2) disentangled representation (2) representation learning (2) imitation learning (2) speech recognition (2) generative adversarial network (2) neural network (2) speech synthesis (1) voice conversion (1) sequence generation (1) unsupervised clustering (1)

Papers

SciPedia: Unlocking the Value of Scientific Data for Pre-training · ACL 2026
Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets · RSS 2025
One Demo is Worth a Thousand Trajectories: Action-View Augmentation for Visuomotor Policies · CORL 2025
Understanding the Impact of Confidence in Retrieval Augmented Generation: A Case Study in the Medical Domain · ACL 2025
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots · RSS 2024
PolyVoice: Language Models for Speech to Speech Translation · ICLR 2024
Effectively Scheduling Computational Graphs of Deep Neural Networks toward Their Domain-Specific Accelerators · OSDI 2023
Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition · INTERSPEECH 2023
Efficient Neural Music Generation · NIPS 2023
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion · RSS 2023
Language-universal Phonetic Encoder for Low-resource Speech Recognition · INTERSPEECH 2023
Tensor Program Optimization with Probabilistic Programs · NIPS 2022
Self-supervised Semantic-driven Phoneme Discovery for Zero-resource Speech Recognition · ACL 2022
The Effectiveness of Time Stretching for Enhancing Dysarthric Speech for Improved Dysarthric Speech Recognition · INTERSPEECH 2022
Iterative Residual Policy for Goal-Conditioned Dynamic Manipulation of Deformable Objects · RSS 2022
DextAIRity: Deformable Manipulation Can be a Breeze · RSS 2022
Unsupervised Acoustic Unit Discovery by Leveraging a Language-Independent Subword Discriminative Feature Representation · INTERSPEECH 2021
Unsupervised Subword Modeling Using Autoregressive Pretraining and Cross-Lingual Phone-Aware Modeling · INTERSPEECH 2020
CoT: Cooperative Training for Generative Modeling of Discrete Data · ICML 2019
Combining Adversarial Training and Disentangled Speech Representation for Robust Zero-Resource Subword Modeling · INTERSPEECH 2019
Improving Unsupervised Subword Modeling via Disentangled Speech Representation Learning and Transformation · INTERSPEECH 2019
Exploiting Speaker and Phonetic Diversity of Mismatched Language Resources for Unsupervised Subword Modeling · INTERSPEECH 2018
Automatic Speech Assessment for People with Aphasia Using TDNN-BLSTM with Multi-Task Learning · INTERSPEECH 2018
Improving Cross-Lingual Knowledge Transferability Using Multilingual TDNN-BLSTM with Language-Dependent Pre-Final Layer · INTERSPEECH 2018
On the Linguistic Relevance of Speech Units Learned by Unsupervised Acoustic Modeling · INTERSPEECH 2017