Mark Hasegawa-Johnson
61 papers · 2012–2025 · 11 top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Interdisciplinary Bridge
Taxonomy Completionist (19)
Conference Polyglot (11)
Cross-Pollinator (13)
Renaissance Researcher (11)
Conference Loyalist (37)
Mega-Team (20)
Deep Specialist (16)
Keyword Collector (251)
Prolific Year (9)
Conference Pioneer
Trend Setter
Century Club (61)
Unstoppable (10)
The Questioner
Conferences
INTERSPEECH (37)
ACL (6)
ICML (5)
COLING (3)
EMNLP (3)
NAACL (2)
AAAI (1)
AISTATS (1)
CVPR (1)
ICCV (1)
WACV (1)
Research topics
Keywords
speech recognition (12)
automatic speech recognition (11)
transfer learning (8)
unsupervised learning (7)
multi-task learning (6)
deep neural network (5)
neural network (4)
cross-lingual transfer (4)
zero-shot learning (4)
probabilistic transcription (4)
self-supervised learning (4)
speech processing (4)
multimodal learning (3)
voice conversion (3)
speech synthesis (3)
under-resourced language (3)
representation learning (3)
convolutional neural network (3)
spoken language understanding (2)
speech enhancement (2)
Papers
SyncDiff: Diffusion-Based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization
WACV 2025
Fine-Tuning Automatic Speech Recognition for People with Parkinson's: An Effective Strategy for Enhancing Speech Technology Accessibility
INTERSPEECH 2024
TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback
ACL 2024
Visualization for improving foreign language pronunciation
INTERSPEECH 2024
Finding Spoken Identifications: Using GPT-4 Annotation for an Efficient and Fast Dataset Creation Pipeline
COLING 2024
Enhancing Child Vocalization Classification with Phonetically-Tuned Embeddings for Assisting Autism Diagnosis
INTERSPEECH 2024
LI-TTA: Language Informed Test-Time Adaptation for Automatic Speech Recognition
INTERSPEECH 2024
A Theory of Unsupervised Speech Recognition
ACL 2023
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
INTERSPEECH 2023
Wav2ToBI: a new approach to automatic ToBI transcription
INTERSPEECH 2023
Listen, Decipher and Sign: Toward Unsupervised Speech-to-Sign Language Recognition
ACL 2023
Mitigating the Exposure Bias in Sentence-Level Grapheme-to-Phoneme (G2P) Transduction
INTERSPEECH 2023
INTapt: Information-Theoretic Adversarial Prompt Tuning for Enhanced Non-Native Speech Recognition
ACL 2023
One-Shot Exemplification Modeling via Latent Sense Representations
ACL 2023
Towards Robust Family-Infant Audio Analysis Based on Unsupervised Pretraining of Wav2vec 2.0 on Large-Scale Unlabeled Family Audio
INTERSPEECH 2023
One-Shot and Few-Shot Exemplification Modeling
EMNLP 2023
Equivariance Discovery by Learned Parameter-Sharing
AISTATS 2022
Self-supervised Semantic-driven Phoneme Discovery for Zero-resource Speech Recognition
ACL 2022
Fast and Efficient MMD-Based Fair PCA via Optimization over Stiefel Manifold
AAAI 2022
SMSMix: Sense-Maintained Sentence Mixup for Word Sense Disambiguation
EMNLP 2022
Forget-free Continual Learning with Winning Subnetworks
ICML 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
ICML 2022
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition
INTERSPEECH 2022
Cross-lingual articulatory feature information transfer for speech recognition using recurrent progressive neural networks
INTERSPEECH 2022
WavPrompt: Towards Few-Shot Spoken Language Understanding with Frozen Language Models
INTERSPEECH 2022
Frame-Level Stutter Detection
INTERSPEECH 2022
Syn2Vec: Synset Colexification Graphs for Lexical Semantic Similarity
NAACL 2022
Zero-Shot Cross-Lingual Phonetic Recognition with External Language Embedding
INTERSPEECH 2021
Classification of COVID-19 from Cough Using Autoregressive Predictive Coding Pretraining and Spectral Data Augmentation
INTERSPEECH 2021
Worldly Wise (WoW) - Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering
NAACL 2021
Global Prosody Style Transfer Without Text Transcriptions
ICML 2021
Interpretable Visual Reasoning via Induced Symbolic Space
ICCV 2021
Context-Aware Automatic Text Simplification of Health Materials in Low-Resource Domains
EMNLP 2020
Unsupervised Speech Decomposition via Triple Information Bottleneck
ICML 2020
Autosegmental Neural Nets: Should Phones and Tones be Synchronous or Asynchronous?
INTERSPEECH 2020
Automatic Estimation of Intelligibility Measure for Consonants in Speech
INTERSPEECH 2020
A DNN-HMM-DNN Hybrid Model for Discovering Word-Like Units from Spoken Captions and Image Regions
INTERSPEECH 2020
Deep F-Measure Maximization for End-to-End Speech Understanding
INTERSPEECH 2020
Evaluating Automatically Generated Phoneme Captions for Images
INTERSPEECH 2020
Identify Speakers in Cocktail Parties with End-to-End Attention
INTERSPEECH 2020
That Sounds Familiar: An Analysis of Phonetic Representations Transfer Across Languages
INTERSPEECH 2020
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
ICML 2019
Improved ASR for Under-resourced Languages through Multi-task Learning with Acoustic Landmarks
INTERSPEECH 2018
Speaker Adaptive Audio-Visual Fusion for the Open-Vocabulary Section of AVICAR
INTERSPEECH 2018
Improving DNNs Trained with Non-Native Transcriptions Using Knowledge Distillation and Target Interpolation
INTERSPEECH 2018
Infant Emotional Outbursts Detection in Infant-parent Spoken Interactions
INTERSPEECH 2018
Topic and Keyword Identification for Low-resourced Speech Using Cross-Language Transfer Learning
INTERSPEECH 2018
Visualizing Phoneme Category Adaptation in Deep Neural Networks
INTERSPEECH 2018
Speech Enhancement Using Bayesian Wavenet
INTERSPEECH 2017
Deep Auto-Encoder Based Multi-Task Learning Using Probabilistic Transcriptions
INTERSPEECH 2017
Glottal Model Based Speech Beamforming for ad-hoc Microphone Arrays
INTERSPEECH 2017
Mismatched Crowdsourcing from Multiple Annotator Languages for Recognizing Zero-Resourced Languages: A Nullspace Clustering Approach
INTERSPEECH 2017
Semantic Image Inpainting With Deep Generative Models
CVPR 2017
Multi-Task Learning Using Mismatched Transcription for Under-Resourced Speech Recognition
INTERSPEECH 2017
Using Approximated Auditory Roughness as a Pre-Filtering Feature for Human Screaming and Affective Speech AED
INTERSPEECH 2017
Team ELISA System for DARPA LORELEI Speech Evaluation 2016
INTERSPEECH 2017
An Investigation on Training Deep Neural Networks Using Probabilistic Transcriptions
INTERSPEECH 2016
Analysis of Mismatched Transcriptions Generated by Humans and Machines for Under-Resourced Languages
INTERSPEECH 2016
Automatic Speech Recognition Using Probabilistic Transcriptions in Swahili, Amharic, and Dinka
INTERSPEECH 2016
A PAC-Bayesian Approach to Minimum Perplexity Language Modeling
COLING 2014
Detection of Acoustic-Phonetic Landmarks in Mismatched Conditions using a Biomimetic Model of Human Auditory Processing
COLING 2012