Adam Polyak

21 papers · 2017–2025 · 9 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (9) 🏃 Academic Marathon (8) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (5) 🌈 Renaissance Researcher (7) 🗺️ Taxonomy Completionist (33) 🤝 Dynamic Duo (15) 👑 Triple Crown 🧬 Topic Evolution 🔥 Unstoppable (9) 💎 Century Club (21) 🗃️ Keyword Collector (67) ⚡ Prolific Year (5) 🚀 Conference Pioneer

Conferences

ICLR (5) ICML (3) INTERSPEECH (3) ACL (2) CVPR (2) ECCV (2) EMNLP (2) ICCV (1) NIPS (1)

Top co-authors

Yaniv Taigman (15) Uriel Singer (8) Yossi Adi (8) Lior Wolf (7) Shelly Sheynin (7) Devi Parikh (6) Yuval Kirstain (5) Wei-Ning Hsu (5) Oron Ashual (5) Jade Copet (4)

Keywords

speech synthesis (3) discrete representation (3) speaker identity (3) self-supervised learning (2) autoregressive model (2) generative model (2) neural vocoder (2) diffusion model (2) waveform generation (2) video generation (2) voice conversion (1) object tracking (1) acoustic model (1) image inpainting (1) text-to-image generation (1) few-shot learning (1) automatic speech recognition (1) disentangled representation (1) image-to-video synthesis (1) motion trajectories (1)

Papers

Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation · CVPR 2025
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models · ICML 2025
Video Editing via Factorized Diffusion Distillation · ECCV 2024
Emu Edit: Precise Image Editing via Recognition and Generation Tasks · CVPR 2024
Make-A-Video: Text-to-Video Generation without Text-Video Data · ICLR 2023
kNN-Diffusion: Image Generation via Large-Scale Retrieval · ICLR 2023
Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation · NIPS 2023
Text-To-4D Dynamic Scene Generation · ICML 2023
AudioGen: Textually Guided Audio Generation · ICLR 2023
Text-Free Prosody-Aware Generative Spoken Language Modeling · ACL 2022
Make-a-Scene: Scene-Based Text-to-Image Generation with Human Priors · ECCV 2022
Direct Speech-to-Speech Translation With Discrete Units · ACL 2022
Textless Speech Emotion Conversion using Discrete & Decomposed Representations · EMNLP 2022
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations · INTERSPEECH 2021
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit · EMNLP 2021
TTS Skins: Speaker Conversion via ASR · INTERSPEECH 2020
Unsupervised Cross-Domain Singing Voice Conversion · INTERSPEECH 2020
A Universal Music Translation Network · ICLR 2019
Fitting New Speakers Based on a Short Untranscribed Sample · ICML 2018
VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop · ICLR 2018
Unsupervised Creation of Parameterized Avatars · ICCV 2017