Stavros Petridis

21 papers · 2019–2025 · 6 top CS/AI conferences

Achievements

🏃 Academic Marathon (6) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (6) 🐣 Hot Topic Early Bird 🧬 Topic Evolution 🤝 Dynamic Duo (19) 🗃️ Keyword Collector (88) ⚡ Prolific Year (6) 💎 Century Club (21) 🔥 Unstoppable (7)

Conferences

INTERSPEECH (10) CVPR (5) ICLR (2) WACV (2) ICCV (1) NeurIPS (1)

Top co-authors

Maja Pantic (19) Pingchuan Ma (10) Konstantinos Vougioukas (7) Rodrigo Mira (6) Alexandros Haliassos (6) Honglie Chen (5) Zoe Landgraf (3) Adriana Fernandez-Lopez (3) Nikita Drobyshev (2) Qiao Xiao (2)

Keywords

lip reading (4) self-supervised learning (4) automatic speech recognition (4) speech recognition (3) audio-visual speech recognition (3) visual speech recognition (3) semi-supervised learning (2) face forgery detection (2) end-to-end model (2) audiovisual speech recognition (2) diffusion model (2) facial expression (2) neural vocoder (2) face animation (1) talking face generation (1) domain generalization (1) video classification (1) multi-task learning (1) representation learning (1) video generation (1)

Papers

KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation · CVPR 2025
Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations · ICCV 2025
Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs · NeurIPS 2024
EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars · CVPR 2024
RT-LA-VocE: Real-Time Low-SNR Audio-Visual Speech Enhancement · INTERSPEECH 2024
MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization · INTERSPEECH 2024
Dynamic Data Pruning for Automatic Speech Recognition · INTERSPEECH 2024
Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation · WACV 2024
Jointly Learning Visual and Auditory Speech Representations from Raw Data · ICLR 2023
Streaming Audio-Visual Speech Recognition with Alignment Regularization · INTERSPEECH 2023
SparseVSR: Lightweight and Noise Robust Visual Speech Recognition · INTERSPEECH 2023
SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision · CVPR 2023
Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection · CVPR 2022
SVTS: Scalable Video-to-Speech Synthesis · INTERSPEECH 2022
DINO: A Conditional Energy-Based GAN for Domain Translation · ICLR 2021
Lip-Reading With Densely Connected Temporal Convolutional Networks · WACV 2021
LiRA: Learning Visual Speech Representations from Audio Through Self-Supervision · INTERSPEECH 2021
Lips Don't Lie: A Generalisable and Robust Approach To Face Forgery Detection · CVPR 2021
Domain Adversarial Neural Networks for Dysarthric Speech Recognition · INTERSPEECH 2020
Investigating the Lombard Effect Influence on End-to-End Audio-Visual Speech Recognition · INTERSPEECH 2019
Video-Driven Speech Reconstruction Using Generative Adversarial Networks · INTERSPEECH 2019