Alexander Richard
24 papers · 2016–2025 · 8 conferences · across top CS/AI conferences
Achievements
🏃 Academic Marathon (9)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🌍 Conference Polyglot (8)
🐝 Cross-Pollinator (6)
🐣 Hot Topic Early Bird
🏆 Keyword Champion (2)
🔥 Unstoppable (5)
⚡ Prolific Year (5)
💎 Century Club (24)
❓ The Questioner
📈 Trend Setter
🗃️ Keyword Collector (128)
Conferences
CVPR (11)
INTERSPEECH (4)
ECCV (2)
ICCV (2)
ICLR (2)
ICML (1)
NeurIPS (1)
WACV (1)
Keywords
multimodal learning (4)
speech synthesis (4)
speech enhancement (3)
weakly supervised learning (3)
action recognition (3)
multi-modal learning (2)
temporal segmentation (2)
recurrent neural network (2)
action segmentation (2)
spatial audio (2)
neural radiance field (2)
diffusion model (2)
novel view synthesis (2)
novel-view synthesis (2)
acoustic synthesis (2)
audio-visual learning (2)
point cloud (1)
source separation (1)
speech processing (1)
video segmentation (1)
Papers
AV-Flow: Transforming Text to Audio-Visual Human-like Interactions
ICCV 2025
REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning
CVPR 2025
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
CVPR 2025
BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models
ICML 2025
FlowDec: A flow-based full-band general audio codec with high perceptual quality
ICLR 2025
EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation
INTERSPEECH 2024
Modeling and Driving Human Body Soundfields through Acoustic Primitives
ECCV 2024
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
CVPR 2024
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
CVPR 2024
Spatialization Quality Metric for Binaural Speech
INTERSPEECH 2023
Novel-View Acoustic Synthesis
CVPR 2023
Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio
NeurIPS 2023
LiP-Flow: Learning Inference-Time Priors for Codec Avatars via Normalizing Flows in Latent Space
ECCV 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
CVPR 2022
End-to-End Binaural Speech Synthesis
INTERSPEECH 2022
Implicit Neural Spatial Filtering for Multichannel Source Separation in the Waveform Domain
INTERSPEECH 2022
Neural Synthesis of Binaural Speech From Mono Audio
ICLR 2021
Audio- and Gaze-Driven Facial Animation of Codec Avatars
WACV 2021
MeshTalk: 3D Face Animation From Speech Using Cross-Modality Disentanglement
ICCV 2021
NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning
CVPR 2018
Action Sets: Weakly Supervised Action Segmentation Without Ordering Constraints
CVPR 2018
When Will You Do What? - Anticipating Temporal Occurrences of Activities
CVPR 2018
Weakly Supervised Action Learning With RNN Based Fine-To-Coarse Modeling
CVPR 2017
Temporal Action Detection Using a Statistical Language Model
CVPR 2016