Idan Schwartz
16 papers · 2017–2026 · 9 top CS/AI conferences
Achievements
Renaissance Researcher (6)
Interdisciplinary Bridge
Conference Polyglot (8)
Academic Marathon (9)
Taxonomy Completionist (33)
Keyword Pioneer
Hot Topic Early Bird
Prolific Year (5)
Century Club (15)
The Questioner
Trend Setter
Unstoppable (6)
Keyword Collector (76)
Conferences
NIPS (4)
CVPR (3)
AAAI (2)
WACV (2)
ACL (1)
EMNLP (1)
ICCV (1)
INTERSPEECH (1)
NAACL (1)
Keywords
multimodal learning (4)
attention mechanism (3)
diffusion model (3)
visual question answering (3)
dialogue system (2)
text-to-image generation (2)
visual dialog (2)
multi-modal learning (2)
latent space (2)
vision-language model (2)
object detection (1)
question answering (1)
model evaluation (1)
text generation (1)
principal component analysis (1)
vision transformer (1)
self-supervised learning (1)
video understanding (1)
video synthesis (1)
domain generalization (1)
Papers
LaMI: Augmenting Large Language Models via Late Multi-Image Fusion
ACL 2026
Detection-Driven Object Count Optimization for Text-to-Image Diffusion Models
WACV 2026
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
AAAI 2024
Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation
INTERSPEECH 2023
Discriminative Class Tokens for Text-to-Image Diffusion Models
ICCV 2023
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
CVPR 2022
Optimizing Relevance Maps of Vision Transformers Improves Robustness
NIPS 2022
Latent Space Explanation by Intervention
AAAI 2022
Describing Sets of Images with Textual-PCA
EMNLP 2022
Video and Text Matching With Conditioned Embeddings
WACV 2022
Ensemble of MRR and NDCG models for Visual Dialog
NAACL 2021
Perceptual Score: What Data Modalities Does Your Model Perceive?
NIPS 2021
Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
NIPS 2020
A Simple Baseline for Audio-Visual Scene-Aware Dialog
CVPR 2019
Factor Graph Attention
CVPR 2019
High-Order Attention Models for Visual Question Answering
NIPS 2017