Siqi Zheng

20 papers · 2019–2025 · 6 top CS/AI conferences

Achievements

🌍 Conference Polyglot (6) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (12) 🧭 Keyword Pioneer 🏃 Academic Marathon (6)

🐝 Cross-Pollinator (10) 🌈 Renaissance Researcher (7) 🧬 Topic Evolution 🤝 Dynamic Duo (13) ⚡ Prolific Year (6) 🔥 Unstoppable (7) 🗃️ Keyword Collector (100) 💎 Century Club (20)

Conferences

INTERSPEECH (8) ACL (5) ICLR (3) EMNLP (2) AAAI (1) NIPS (1)

Top co-authors

Qian Chen (13) Wen Wang (7) Luyao Cheng (6) Shiliang Zhang (5) Hongbin Suo (5) Zhou Zhao (5) Yafeng Chen (5) Hui Wang (5) Rongjie Huang (4) Zehan Wang (4)

Keywords

speaker verification (7) speaker diarization (4) multi-party meeting (3) neural network (3) speech processing (2) attention pooling (2) feature fusion (2) large language model (2) cross-modal alignment (2) curriculum learning (1) multimodal learning (1) semi-supervised clustering (1) multi-modal learning (1) automatic speech recognition (1) zero-shot learning (1) speech synthesis (1) neural architecture (1) in-context learning (1) semi-supervised learning (1) modality alignment (1)

Papers

Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization on Multi-party Conversation · ACL 2025
Speech Recognition Meets Large Language Model: Benchmarking, Models, and Exploration · AAAI 2025
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling · ICLR 2025
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup · ICLR 2025
ControlSpeech: Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control · ACL 2025
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation · ACL 2025
ERes2NetV2: Boosting Short-Duration Speaker Verification Performance with Computational Efficiency · INTERSPEECH 2024
Extending Multi-modal Contrastive Representations · NIPS 2024
CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking · INTERSPEECH 2023
An Enhanced Res2Net with Local and Global Feature Fusion for Speaker Verification · INTERSPEECH 2023
Exploring Speaker-Related Information in Spoken Language Understanding for Better Speaker Diarization · ACL 2023
DopplerBAS: Binaural Audio Synthesis Addressing Doppler Effect · ACL 2023
Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings · EMNLP 2023
Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis · EMNLP 2022
PoNet: Pooling Network for Efficient Token Mixing in Long Sequences · ICLR 2022
PRISM: Pre-trained Indeterminate Speaker Representation Model for Speaker Diarization and Speaker Verification · INTERSPEECH 2022
Investigation of Spatial-Acoustic Features for Overlapping Speech Detection in Multiparty Meetings · INTERSPEECH 2021
Phonetically-Aware Coupled Network For Short Duration Text-Independent Speaker Verification · INTERSPEECH 2020
Towards a Fault-Tolerant Speaker Verification System: A Regularization Approach to Reduce the Condition Number · INTERSPEECH 2019
Autoencoder-Based Semi-Supervised Curriculum Learning for Out-of-Domain Speaker Verification · INTERSPEECH 2019