James Glass
120 papers · 2004–2025 · 15 top CS/AI conferences
Achievements
Taxonomy Completionist
(28)
Keyword Pioneer
Hot Topic Early Bird
Renaissance Researcher
(5)
Interdisciplinary Bridge
Conference Polyglot
(15)
Conference Loyalist
(42)
Keyword Trendsetter Combo
(10)
Dynamic Duo
(15)
Topic Evolution
Keyword Champion
Topic Pioneer
Deep Specialist
(22)
Trend Setter
Conference Pioneer
Unstoppable
(11)
Prolific Year
(24)
The Questioner
(5)
Century Club
(120)
Keyword Collector
(96)
Conferences
INTERSPEECH (42)
ACL (22)
EMNLP (13)
NAACL (12)
IJCNLP (5)
CVPR (4)
EACL (4)
NIPS (4)
AAAI (3)
ICLR (3)
COLING (2)
ICCV (2)
SEMEVAL (2)
AACL (1)
ECCV (1)
Research topics
Keywords
self-supervised learning
(13)
representation learning
(12)
unsupervised learning
(10)
speech recognition
(9)
text classification
(8)
multimodal learning
(8)
transfer learning
(8)
automatic speech recognition
(8)
stance detection
(6)
speaker verification
(6)
convolutional neural network
(6)
video retrieval
(6)
deep neural network
(5)
language model
(5)
contrastive learning
(5)
recurrent neural network
(4)
zero-shot learning
(4)
domain adaptation
(4)
attention mechanism
(4)
neural machine translation
(4)
Papers
Teaching VLMs to Localize Specific Objects from In-context Examples
ICCV 2025
What When and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
CVPR 2024
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning
NAACL 2024
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation
INTERSPEECH 2024
Automatic Prediction of Amyotrophic Lateral Sclerosis Progression using Longitudinal Speech Transformer
INTERSPEECH 2024
Self-Specialization: Uncovering Latent Expertise within Large Language Models
ACL 2024
Joint Inference of Retrieval and Generation for Passage Re-ranking
EACL 2024
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
NAACL 2024
Found in the middle: Calibrating Positional Attention Bias Improves Long Context Utilization
ACL 2024
Revealing the Blind Spot of Sentence Encoder Evaluation by HEROS
ACL 2023
Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering
ACL 2023
ConvRGX: Recognition, Generation, and Extraction for Self-trained Conversational Question Answering
ACL 2023
Search Augmented Instruction Learning
EMNLP 2023
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation
ACL 2023
Entailment as Robust Self-Learner
ACL 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages
INTERSPEECH 2023
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers
INTERSPEECH 2023
Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering
INTERSPEECH 2023
Logic Against Bias: Textual Entailment Mitigates Stereotypical Sentence Reasoning
EACL 2023
PCFG-Based Natural Language Interface Improves Generalization for Controlled Text Generation
ACL 2023
Cross-Modal Discrete Representation Learning
ACL 2022
DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings
NAACL 2022
Cooperative Self-training of Machine Reading Comprehension
NAACL 2022
Controlling the Focus of Pretrained Language Generation Models
ACL 2022
Everything at Once - Multi-Modal Fusion Transformer for Video Retrieval
CVPR 2022
Simple and Effective Unsupervised Speech Synthesis
INTERSPEECH 2022
Detecting Dementia from Long Neuropsychological Interviews
EMNLP 2022
SSAST: Self-Supervised Audio Spectrogram Transformer
AAAI 2022
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
IJCNLP 2021
Mitigating Biases in Toxic Language Detection through Invariant Rationalization
IJCNLP 2021
Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset
INTERSPEECH 2021
Joint Retrieval-Extraction Training for Evidence-Aware Dialog Response Selection
INTERSPEECH 2021
Cascaded Multilingual Audio-Visual Learning from Videos
INTERSPEECH 2021
Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation?
EMNLP 2021
Mitigating Biases in Toxic Language Detection through Invariant Rationalization
ACL 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
ACL 2021
CLAC: A Speech Corpus of Healthy English Speakers
INTERSPEECH 2021
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
INTERSPEECH 2021
AST: Audio Spectrogram Transformer
INTERSPEECH 2021
Multimodal Clustering Networks for Self-Supervised Learning From Unlabeled Videos
ICCV 2021
Spoken Moments: Learning Joint Audio-Visual Representations From Video Descriptions
CVPR 2021
Analyzing the Forgetting Problem in Pretrain-Finetuning of Open-domain Dialogue Response Models
EACL 2021
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies
INTERSPEECH 2021
Negative Training for Neural Dialogue Response Generation
ACL 2020
What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context
ACL 2020
Prototypical Q Networks for Automatic Conversational Diagnosis and Few-Shot New Disease Adaption
INTERSPEECH 2020
Vector-Quantized Autoregressive Predictive Coding
INTERSPEECH 2020
A Systematic Characterization of Sampling Algorithms for Open-ended Language Generation
AACL 2020
Multimodal Association for Speaker Verification
INTERSPEECH 2020
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning
INTERSPEECH 2020
We Can Detect Your Bias: Predicting the Political Ideology of News Articles
EMNLP 2020
Knowledge Grounded Conversational Symptom Detection with Graph Memory Networks
EMNLP 2020
Similarity Analysis of Contextual Word Representation Models
ACL 2020
Improved Speech Representations with Multi-Target Autoregressive Predictive Coding
ACL 2020
Pair Expansion for Learning Multilingual Semantic Embeddings Using Disjoint Visually-Grounded Speech Audio Datasets
INTERSPEECH 2020
What Does an End-to-End Dialect Identification Model Learn About Non-Dialectal Information?
INTERSPEECH 2020
Unsupervised Methods for Evaluating Speech Representations
INTERSPEECH 2020
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech
ICLR 2020
Tanbih: Get To Know What You Are Reading
IJCNLP 2019
Improving Neural Language Models by Segmenting, Attending, and Predicting the Future
ACL 2019
Learning Words by Drawing Images
CVPR 2019
Towards Bilingual Lexicon Discovery From Visually Grounded Speech Audio
INTERSPEECH 2019
Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media
NAACL 2019
FAKTA: An Automatic End-to-End Fact Checking System
NAACL 2019
Team QCRI-MIT at SemEval-2019 Task 4: Propaganda Analysis Meets Hyperpartisan News Detection
SEMEVAL 2019
MCE 2018: The 1st Multi-Target Speaker Detection and Identification Challenge Evaluation
INTERSPEECH 2019
What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models
AAAI 2019
Contrastive Language Adaptation for Cross-Lingual Stance Detection
EMNLP 2019
Tanbih: Get To Know What You Are Reading
EMNLP 2019
Neural Multi-Task Learning for Stance Prediction
EMNLP 2019
Integrating Video Retrieval and Moment Detection in a Unified Corpus for Video Question Answering
INTERSPEECH 2019
Transfer Learning from Audio-Visual Grounding to Speech Recognition
INTERSPEECH 2019
VoiceID Loss: Speech Enhancement for Speaker Verification
INTERSPEECH 2019
Detecting Egregious Responses in Neural Sequence-to-sequence Models
ICLR 2019
Identifying and Controlling Important Neurons in Neural Machine Translation
ICLR 2019
Contrastive Language Adaptation for Cross-Lingual Stance Detection
IJCNLP 2019
NeuroX: A Toolkit for Analyzing Individual Neurons in Neural Networks
AAAI 2019
A Comparison of Deep Learning Methods for Language Understanding
INTERSPEECH 2019
Multiple Sound Source Localization with SVD-PHAT
INTERSPEECH 2019
Analyzing Phonetic and Graphemic Representations in End-to-End Automatic Speech Recognition
INTERSPEECH 2019
An Unsupervised Autoregressive Model for Speech Representation Learning
INTERSPEECH 2019
A Deep Residual Network for Large-Scale Acoustic Scene Analysis
INTERSPEECH 2019
A Study of Enhancement, Augmentation and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition
INTERSPEECH 2018
Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces
NIPS 2018
Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign
COLING 2018
Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input
ECCV 2018
Predicting Factuality of Reporting and Bias of News Media Sources
EMNLP 2018
Speech2Vec: A Sequence-to-Sequence Framework for Learning Word Embeddings from Speech
INTERSPEECH 2018
Scalable Factorized Hierarchical Variational Autoencoder Training
INTERSPEECH 2018
Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition
INTERSPEECH 2018
Detecting Depression with Audio/Text Sequence Modeling of Interviews
INTERSPEECH 2018
Automatic Stance Detection Using End-to-End Memory Networks
NAACL 2018
Supervised and Unsupervised Transfer Learning for Question Answering
NAACL 2018
Integrating Stance Detection and Fact Checking in a Unified Corpus
NAACL 2018
On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference
NAACL 2018
Role-specific Language Models for Processing Recorded Neuropsychological Exams
NAACL 2018
Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks
IJCNLP 2017
What do Neural Machine Translation Models Learn about Morphology?
ACL 2017
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data
NIPS 2017
Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems
NIPS 2017
Learning Latent Representations for Speech Generation and Transformation
INTERSPEECH 2017
QMDIS: QCRI-MIT Advanced Dialect Identification System
INTERSPEECH 2017
An Environmental Feature Representation for Robust Speech Recognition and for Environment Identification
INTERSPEECH 2017
Character-Based Embedding Models and Reranking Strategies for Understanding Natural Language Meal Descriptions
INTERSPEECH 2017
Learning Word-Like Units from Joint Audio-Visual Analysis
ACL 2017
Automatic Dialect Detection in Arabic Broadcast Speech
INTERSPEECH 2016
Memory-Efficient Modeling and Search Techniques for Hardware ASR Decoders
INTERSPEECH 2016
Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition
INTERSPEECH 2016
Neural Attention for Learning to Rank Questions in Community Question Answering
COLING 2016
Unsupervised Learning of Spoken Language with Visual Context
NIPS 2016
VectorSLU: A Continuous Word Vector Approach to Answer Selection in Community Question Answering Systems
SEMEVAL 2015
Arabic Diacritization with Recurrent Neural Networks
EMNLP 2015
Joint Learning of Phonetic Units and Word Pronunciations for ASR
EMNLP 2013
A Nonparametric Bayesian Approach to Acoustic Model Discovery
ACL 2012
Syntactic Phrase Reordering for English-to-Arabic Statistical Machine Translation
EACL 2009
Segmentation for English-to-Arabic Statistical Machine Translation
ACL 2008
N-gram Weighting: Reducing Training Data Mismatch in Cross-Domain Language Model Estimation
EMNLP 2008
Making Sense of Sound: Unsupervised Topic Segmentation over Acoustic Input
ACL 2007
Style & Topic Language Model Adaptation Using HMM-LDA
EMNLP 2006
Feature-based Pronunciation Modeling for Speech Recognition
NAACL 2004