Abdelrahman Mohamed

28 papers · 2015–2026 · 9 top CS/AI conferences

Achievements

🌍 Conference Polyglot (8) 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (10)

🧬 Topic Evolution 👥 Mega-Team (27) 🔬 Deep Specialist (12) 🗃️ Keyword Collector (97) 💎 Century Club (27) 🔥 Unstoppable (6) ⚡ Prolific Year (11)

Conferences

INTERSPEECH (10) ACL (7) EMNLP (4) NAACL (2) EACL (1) ICLR (1) ICML (1) IJCNLP (1) NIPS (1)

Top co-authors

Wei-Ning Hsu (8) Shang-Wen Li (7) Kushal Lakhotia (6) Hung-yi Lee (5) Shu-wen Yang (4) Shinji Watanabe (4) Jade Copet (4) Emmanuel Dupoux (4) Yossi Adi (4) Eugene Kharitonov (4)

Research topics

Speech & Audio (1) Analysis (1)

Keywords

self-supervised learning (10) automatic speech recognition (6) speech recognition (4) vision-language model (3) speech representation (3) representation learning (3) speech processing (3) speech synthesis (3) zero-shot learning (3) arabic language (2) contrastive learning (2) neural vocoder (2) speech resynthesis (2) multilingual speech (2) transfer learning (2) discrete representation (2) image captioning (2) generative model (2) speaker identity (2) voice conversion (1)

Papers

JEEM: Vision-Language Understanding in Four Arabic Dialects EACL 2026
LLMs Can Compensate for Deficiencies in Visual Representations EMNLP 2025
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild ACL 2024
Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks ACL 2024
Casablanca: Data and Models for Multidialectal Arabic Speech Recognition EMNLP 2024
Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model INTERSPEECH 2023
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder EMNLP 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark INTERSPEECH 2023
Scaling ASR Improves Zero and Few Shot Learning INTERSPEECH 2022
Unified Speech-Text Pre-training for Speech Translation and Recognition ACL 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities ACL 2022
Text-Free Prosody-Aware Generative Spoken Language Modeling ACL 2022
Textless Speech Emotion Conversion using Discrete & Decomposed Representations EMNLP 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction ICLR 2022
Self-supervised Representation Learning for Speech Processing NAACL 2022
textless-lib: a Library for Textless Spoken Language Processing NAACL 2022
Robust Self-Supervised Audio-Visual Speech Recognition INTERSPEECH 2022
Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT INTERSPEECH 2022
DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering INTERSPEECH 2022
SUPERB: Speech Processing Universal PERformance Benchmark INTERSPEECH 2021
Unsupervised Cross-Lingual Representation Learning for Speech Recognition INTERSPEECH 2021
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations INTERSPEECH 2021
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations NIPS 2020
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension ACL 2020
Large Scale Weakly and Semi-Supervised Learning for Low-Resource Video ASR INTERSPEECH 2020
Sequence Modeling via Segmentations ICML 2017
Learning Lexical Embeddings with Syntactic and Lexicographic Knowledge IJCNLP 2015
Learning Lexical Embeddings with Syntactic and Lexicographic Knowledge ACL 2015