conftrace_

Samuel Thomas

30 papers · 2013–2025 · 4 conferences · across top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (13) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (4) 🌈 Renaissance Researcher (6) 🏠 Conference Loyalist (24) 🤝 Dynamic Duo (13) 🧬 Topic Evolution 🔬 Deep Specialist (10) 📈 Trend Setter 🚀 Conference Pioneer 🔥 Unstoppable (11) ⚡ Prolific Year (6) ❓ The Questioner 🗃️ Keyword Collector (139) 💎 Century Club (30)

Conferences

INTERSPEECH (24) CVPR (3) IJCAI (2) ICCV (1)

Papers

CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment · CVPR 2025
What When and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions · CVPR 2024
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation · INTERSPEECH 2024
ConvKT: Conversation-Level Knowledge Transfer for Context Aware End-to-End Spoken Language Understanding · INTERSPEECH 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages · INTERSPEECH 2023
Everything at Once - Multi-Modal Fusion Transformer for Video Retrieval · CVPR 2022
Extending RNN-T-based speech recognition systems with emotion and language classification · INTERSPEECH 2022
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems · INTERSPEECH 2022
Global RNN Transducer Models For Multi-dialect Speech Recognition · INTERSPEECH 2022
Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs · INTERSPEECH 2021
Multimodal Clustering Networks for Self-Supervised Learning From Unlabeled Videos · ICCV 2021
Integrating Dialog History into End-to-End Spoken Language Understanding Systems · INTERSPEECH 2021
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos · INTERSPEECH 2021
Cascaded Multilingual Audio-Visual Learning from Videos · INTERSPEECH 2021
Knowledge Distillation Based Training of Universal ASR Source Models for Cross-Lingual Transfer · INTERSPEECH 2021
Implicit Transfer of Privileged Acoustic Information in a Generalized Knowledge Distillation Framework · INTERSPEECH 2020
End-to-End Spoken Language Understanding Without Full Transcripts · INTERSPEECH 2020
Resource-Adaptive Deep Learning for Visual Speech Recognition · INTERSPEECH 2020
Transliteration Based Data Augmentation for Training Multilingual ASR Acoustic Models in Low Resource Settings · INTERSPEECH 2020
Detection and Recovery of OOVs for Improved English Broadcast News Captioning · INTERSPEECH 2019
Learning Speaker Aware Offsets for Speaker Adaptation of Neural Networks · INTERSPEECH 2019
Data Augmentation Improves Recognition of Foreign Accented Speech · INTERSPEECH 2018
Inference-Invariant Transformation of Batch Normalization for Domain Adaptation of Acoustic Models · INTERSPEECH 2018
English Conversational Telephone Speech Recognition by Humans and Machines · INTERSPEECH 2017
Efficient Knowledge Distillation from an Ensemble of Teachers · INTERSPEECH 2017
Domain Adaptation of CNN Based Acoustic Models Under Limited Resource Settings · INTERSPEECH 2016
An Investigation on the Use of i-Vectors for Robust ASR · INTERSPEECH 2016
Multilingual Data Selection for Low Resource Speech Recognition · INTERSPEECH 2016
Compiling Constraint Networks into Multivalued Decomposable Decision Graphs · IJCAI 2015
Knowledge Compilation for Model Counting: Affine Decision Trees · IJCAI 2013