Atsushi Ando
18 papers · 2017–2024 · 2 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (2) · Interdisciplinary Bridge · Taxonomy Completionist (10) · Keyword Pioneer · Academic Marathon (7)
Renaissance Researcher
(5)
Interdisciplinary Bridge
Taxonomy Completionist
(10)
Dynamic Duo
(15)
Keyword Champion
(3)
The Questioner
Keyword Collector
(91)
Century Club
(18)
Conferences
INTERSPEECH (17)
ICCV (1)
Research topics
Keywords
automatic speech recognition
(7)
overlapped speech
(3)
speech recognition
(2)
autoregressive modeling
(2)
autoregressive model
(2)
speaker embedding
(2)
hierarchical encoder-decoder
(2)
recurrent neural network
(2)
multi-talker speech recognition
(2)
multi-talker speech
(2)
acoustic feature
(2)
spoken language understanding
(1)
attention mechanism
(1)
joint learning
(1)
adversarial training
(1)
model robustness
(1)
semi-supervised learning
(1)
multi-label classification
(1)
model complexity
(1)
language modeling
(1)
Papers
Unified Multi-Talker ASR with and without Target-speaker Enrollment
INTERSPEECH 2024
SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
INTERSPEECH 2024
SOMSRED: Sequential Output Modeling for Joint Multi-talker Overlapped Speech Recognition and Speaker Diarization
INTERSPEECH 2024
Factor-Conditioned Speaking-Style Captioning
INTERSPEECH 2024
Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff
ICCV 2023
End-to-End Joint Target and Non-Target Speakers ASR
INTERSPEECH 2023
Joint Autoregressive Modeling of End-to-End Multi-Talker Overlapped Speech Recognition and Utterance-level Timestamp Prediction
INTERSPEECH 2023
Speaker consistency loss and step-wise optimization for semi-supervised joint training of TTS and ASR using unpaired text data
INTERSPEECH 2022
End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training
INTERSPEECH 2022
Interactive Co-Learning with Cross-Modal Transformer for Audio-Visual Emotion Recognition
INTERSPEECH 2022
Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture
INTERSPEECH 2021
Phoneme Duration Modeling Using Speech Rhythm-Based Speaker Embeddings for Multi-Speaker Speech Synthesis
INTERSPEECH 2021
Does the Lombard Effect Improve Emotional Communication in Noise? – Analysis of Emotional Speech Acted in Noise
INTERSPEECH 2019
Improving Conversation-Context Language Models with Multiple Spoken Language Understanding Models
INTERSPEECH 2019
Speech Emotion Recognition Based on Multi-Label Emotion Existence Model
INTERSPEECH 2019
Automatic Question Detection from Acoustic and Phonetic Features Using Feature-wise Pre-training
INTERSPEECH 2018
Role Play Dialogue Aware Language Models Based on Conditional Hierarchical Recurrent Encoder-Decoder
INTERSPEECH 2018
Hierarchical LSTMs with Joint Learning for Estimating Customer Satisfaction from Contact Center Calls
INTERSPEECH 2017