Idan Schwartz

16 papers · 2017–2026 · 9 conferences · across top CS/AI conferences

Achievements

🌈 Renaissance Researcher (6) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (8) 🏃 Academic Marathon (9) 🗺️ Taxonomy Completionist (33)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird ⚡ Prolific Year (5) 💎 Century Club (15) ❓ The Questioner 📈 Trend Setter 🔥 Unstoppable (6) 🗃️ Keyword Collector (76)

Conferences

NIPS (4) CVPR (3) AAAI (2) WACV (2) ACL (1) EMNLP (1) ICCV (1) INTERSPEECH (1) NAACL (1)

Top co-authors

Lior Wolf (8) Tamir Hazan (6) Itai Gat (5) Sagie Benaim (3) Yossi Adi (3) Guy Yariv (3) Alexander G. Schwing (2) Hila Chefer (2) Alexander Schwing (2) Seunghak Yu (1)

Keywords

multimodal learning (4) attention mechanism (3) diffusion model (3) visual question answering (3) dialogue system (2) text-to-image generation (2) visual dialog (2) multi-modal learning (2) latent space (2) vision-language model (2) object detection (1) question answering (1) model evaluation (1) text generation (1) principal component analysis (1) vision transformer (1) self-supervised learning (1) video understanding (1) video synthesis (1) domain generalization (1)

Papers

LaMI: Augmenting Large Language Models via Late Multi-Image Fusion · ACL 2026
Detection-Driven Object Count Optimization for Text-to-Image Diffusion Models · WACV 2026
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation · AAAI 2024
Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation · INTERSPEECH 2023
Discriminative Class Tokens for Text-to-Image Diffusion Models · ICCV 2023
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic · CVPR 2022
Optimizing Relevance Maps of Vision Transformers Improves Robustness · NIPS 2022
Latent Space Explanation by Intervention · AAAI 2022
Describing Sets of Images with Textual-PCA · EMNLP 2022
Video and Text Matching With Conditioned Embeddings · WACV 2022
Ensemble of MRR and NDCG models for Visual Dialog · NAACL 2021
Perceptual Score: What Data Modalities Does Your Model Perceive? · NIPS 2021
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies · NIPS 2020
A Simple Baseline for Audio-Visual Scene-Aware Dialog · CVPR 2019
Factor Graph Attention · CVPR 2019
High-Order Attention Models for Visual Question Answering · NIPS 2017