conftrace_

Atsushi Ando

18 papers · 2017–2024 · 2 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (2)
🌉 Interdisciplinary Bridge
🗺️ Taxonomy Completionist (10)
🧭 Keyword Pioneer
🏃 Academic Marathon (7)
🌈 Renaissance Researcher (5)
🤝 Dynamic Duo (15)
🏆 Keyword Champion (3)
❓ The Questioner
🗃️ Keyword Collector (91)
💎 Century Club (18)

Conferences

INTERSPEECH (17) ICCV (1)

Papers

Unified Multi-Talker ASR with and without Target-speaker Enrollment INTERSPEECH 2024
SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling INTERSPEECH 2024
SOMSRED: Sequential Output Modeling for Joint Multi-talker Overlapped Speech Recognition and Speaker Diarization INTERSPEECH 2024
Factor-Conditioned Speaking-Style Captioning INTERSPEECH 2024
Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff ICCV 2023
End-to-End Joint Target and Non-Target Speakers ASR INTERSPEECH 2023
Joint Autoregressive Modeling of End-to-End Multi-Talker Overlapped Speech Recognition and Utterance-level Timestamp Prediction INTERSPEECH 2023
Speaker consistency loss and step-wise optimization for semi-supervised joint training of TTS and ASR using unpaired text data INTERSPEECH 2022
End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training INTERSPEECH 2022
Interactive Co-Learning with Cross-Modal Transformer for Audio-Visual Emotion Recognition INTERSPEECH 2022
Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture INTERSPEECH 2021
Phoneme Duration Modeling Using Speech Rhythm-Based Speaker Embeddings for Multi-Speaker Speech Synthesis INTERSPEECH 2021
Does the Lombard Effect Improve Emotional Communication in Noise? — Analysis of Emotional Speech Acted in Noise INTERSPEECH 2019
Improving Conversation-Context Language Models with Multiple Spoken Language Understanding Models INTERSPEECH 2019
Speech Emotion Recognition Based on Multi-Label Emotion Existence Model INTERSPEECH 2019
Automatic Question Detection from Acoustic and Phonetic Features Using Feature-wise Pre-training INTERSPEECH 2018
Role Play Dialogue Aware Language Models Based on Conditional Hierarchical Recurrent Encoder-Decoder INTERSPEECH 2018
Hierarchical LSTMs with Joint Learning for Estimating Customer Satisfaction from Contact Center Calls INTERSPEECH 2017