Abdelrahman Mohamed
28 papers · 2015–2026 · 9 top CS/AI conferences
Achievements
🌍 Conference Polyglot (8)
🧭 Keyword Pioneer
🐣 Hot Topic Early Bird
🌉 Interdisciplinary Bridge
🏃 Academic Marathon (10)
🧬 Topic Evolution
👥 Mega-Team (27)
🔬 Deep Specialist (12)
🗃️ Keyword Collector (97)
💎 Century Club (27)
🔥 Unstoppable (6)
⚡ Prolific Year (11)
Conferences
INTERSPEECH (10)
ACL (7)
EMNLP (4)
NAACL (2)
EACL (1)
ICLR (1)
ICML (1)
IJCNLP (1)
NIPS (1)
Keywords
self-supervised learning (10)
automatic speech recognition (6)
speech recognition (4)
vision-language model (3)
speech representation (3)
representation learning (3)
speech processing (3)
speech synthesis (3)
zero-shot learning (3)
arabic language (2)
contrastive learning (2)
neural vocoder (2)
speech resynthesis (2)
multilingual speech (2)
transfer learning (2)
discrete representation (2)
image captioning (2)
generative model (2)
speaker identity (2)
voice conversion (1)
Papers
JEEM: Vision-Language Understanding in Four Arabic Dialects
EACL 2026
LLMs Can Compensate for Deficiencies in Visual Representations
EMNLP 2025
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild
ACL 2024
Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks
ACL 2024
Casablanca: Data and Models for Multidialectal Arabic Speech Recognition
EMNLP 2024
Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
INTERSPEECH 2023
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder
EMNLP 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
INTERSPEECH 2023
Scaling ASR Improves Zero and Few Shot Learning
INTERSPEECH 2022
Unified Speech-Text Pre-training for Speech Translation and Recognition
ACL 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
ACL 2022
Text-Free Prosody-Aware Generative Spoken Language Modeling
ACL 2022
Textless Speech Emotion Conversion using Discrete & Decomposed Representations
EMNLP 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
ICLR 2022
Self-supervised Representation Learning for Speech Processing
NAACL 2022
textless-lib: a Library for Textless Spoken Language Processing
NAACL 2022
Robust Self-Supervised Audio-Visual Speech Recognition
INTERSPEECH 2022
Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT
INTERSPEECH 2022
DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering
INTERSPEECH 2022
SUPERB: Speech Processing Universal PERformance Benchmark
INTERSPEECH 2021
Unsupervised Cross-Lingual Representation Learning for Speech Recognition
INTERSPEECH 2021
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
INTERSPEECH 2021
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
NIPS 2020
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
ACL 2020
Large Scale Weakly and Semi-Supervised Learning for Low-Resource Video ASR
INTERSPEECH 2020
Sequence Modeling via Segmentations
ICML 2017
Learning Lexical Embeddings with Syntactic and Lexicographic Knowledge
IJCNLP 2015
Learning Lexical Embeddings with Syntactic and Lexicographic Knowledge
ACL 2015