Naoki Makishima
16 papers · 2020–2026 · 3 conferences · across top CS/AI conferences
Achievements
🏃 Academic Marathon (5)
🌉 Interdisciplinary Bridge
🌍 Conference Polyglot (3)
🧭 Keyword Pioneer
🐝 Cross-Pollinator (5)
🌈 Renaissance Researcher (6)
🤝 Dynamic Duo (15)
🏆 Keyword Champion (2)
🔥 Unstoppable (6)
💎 Century Club (15)
⚡ Prolific Year (5)
🗃️ Keyword Collector (77)
Conferences
INTERSPEECH (13)
AAAI (2)
ICCV (1)
Top co-authors
Research topics
Keywords
automatic speech recognition (7)
autoregressive modeling (3)
overlapped speech (3)
speaker verification (2)
autoregressive model (2)
joint modeling (2)
end-to-end automatic speech recognition (2)
multi-talker speech recognition (2)
multi-talker speech (2)
semi-supervised learning (2)
model robustness (1)
speaker embedding (1)
fine-grained classification (1)
multi-task learning (1)
contrastive learning (1)
adversarial training (1)
fine-grained recognition (1)
geometric structure (1)
multimodal learning (1)
speaker diarization (1)
Papers
Difference Vector Equalization for Robust Fine-tuning of Vision-Language Models
AAAI 2026
Multimodal Fine-Grained Apparent Personality Trait Recognition: Joint Modeling of Big Five and Questionnaire Item-level Scores
AAAI 2025
SOMSRED: Sequential Output Modeling for Joint Multi-talker Overlapped Speech Recognition and Speaker Diarization
INTERSPEECH 2024
Unified Multi-Talker ASR with and without Target-speaker Enrollment
INTERSPEECH 2024
End-to-End Joint Target and Non-Target Speakers ASR
INTERSPEECH 2023
Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff
ICCV 2023
Joint Autoregressive Modeling of End-to-End Multi-Talker Overlapped Speech Recognition and Utterance-level Timestamp Prediction
INTERSPEECH 2023
Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations
INTERSPEECH 2022
Speaker consistency loss and step-wise optimization for semi-supervised joint training of TTS and ASR using unpaired text data
INTERSPEECH 2022
End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training
INTERSPEECH 2022
Enrollment-Less Training for Personalized Voice Activity Detection
INTERSPEECH 2021
End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning
INTERSPEECH 2021
Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition
INTERSPEECH 2021
Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation
INTERSPEECH 2021
Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks Using Switching Tokens
INTERSPEECH 2021
Phoneme-to-Grapheme Conversion Based Large-Scale Pre-Training for End-to-End Automatic Speech Recognition
INTERSPEECH 2020