Leonid Sigal

74 papers · 2007–2026 · 11 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (15) 🌍 Conference Polyglot (11) 🌟 Keyword Trendsetter Combo (8) 🏠 Conference Loyalist (30) 🔬 Deep Specialist (13) 🧬 Topic Evolution 🏆 Keyword Champion (3) 🌱 Topic Pioneer 🏆 Grand Slam ⚡ Prolific Year (8) 🗃️ Keyword Collector (356) ❓ The Questioner (2) 💎 Century Club (74) 📈 Trend Setter 🔥 Unstoppable (14) 🚀 Conference Pioneer

Conferences

CVPR (30) NIPS (12) ICCV (9) ECCV (7) WACV (6) ACL (2) EMNLP (2) ICLR (2) ICML (2) AAAI (1) UAI (1)

Top co-authors

Mohammed Suhail (6) Greg Mori (6) Siddhesh Khandelwal (5) Tanzila Rahman (4) Raghav Goyal (4) Andreas Lehrmann (4) Aditya Chinchure (4) Michalis Raptis (4) Gunhee Kim (4) Renjie Liao (4)

Keywords

object detection (10) weakly supervised learning (6) few-shot learning (5) video understanding (5) multimodal learning (5) zero-shot learning (5) action recognition (4) scene graph generation (4) neural network (4) image generation (4) semantic segmentation (4) weakly-supervised learning (3) relation prediction (3) latent variable model (3) visual reasoning (3) attention mechanism (3) image segmentation (3) multi-modal learning (3) temporal modeling (3) visual grounding (3)

Papers

Test-Time Consistency in Vision Language Models WACV 2026
Mitigate One, Skew Another? Tackling Intersectional Biases in Text-to-Image Models EMNLP 2025
ChartGaze: Enhancing Chart Understanding in LVLMs with Eye-Tracking Guided Attention Refinement EMNLP 2025
Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attack on Breast Ultrasound Images CVPR 2025
LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion CVPR 2025
TAM-VT: Transformation-Aware Multi-Scale Video Transformer for Segmentation and Tracking WACV 2025
Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events CVPR 2025
Response Wide Shut? Surprising Observations in Basic Vision Language Model Capabilities ACL 2025
MM-R3: On (In-)Consistency of Vision-Language Models (VLMs) ACL 2025
Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation ICML 2025
Revealing Weaknesses in Text Watermarking Through Self-Information Rewrite Attacks ICML 2025
Locality Sensitive Avatars From Video ICLR 2025
Framework-Agnostic Semantically-Aware Global Reasoning for Segmentation WACV 2024
Preventing Catastrophic Forgetting through Memory Networks in Continuous Detection ECCV 2024
Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach CVPR 2024
Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models CVPR 2024
Self-Supervised Relation Alignment for Scene Graph Generation WACV 2024
TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models ECCV 2024
Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models CVPR 2024
Extending Video Masked Autoencoders to 128 frames NIPS 2024
Mitigating the Effect of Incidental Correlations on Part-based Learning NIPS 2023
VLC-BERT: Visual Question Answering With Contextualized Commonsense Knowledge WACV 2023
Self-supervision through Random Segments with Autoregressive Coding (RandSAC) ICLR 2023
Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching ICCV 2023
Make-a-Story: Visual Memory Conditioned Consistent Story Generation CVPR 2023
Omnimatte3D: Associating Objects and Their Effects in Unconstrained Monocular Video CVPR 2023
DINN360: Deformable Invertible Neural Network for Latitude-Aware 360° Image Rescaling CVPR 2023
Iterative Scene Graph Generation NIPS 2022
Layered Controllable Video Generation ECCV 2022
Light Field Neural Rendering CVPR 2022
Generalizable Patch-Based Neural Rendering ECCV 2022
Segmentation-Grounded Scene Graph Generation ICCV 2021
Referring Transformer: A One-step Approach to Multi-task Visual Grounding NIPS 2021
Energy-Based Learning for Scene Graph Generation CVPR 2021
UniT: Unified Knowledge Transfer for Any-Shot Object Detection and Segmentation CVPR 2021
PROVIDE: a probabilistic framework for unsupervised video decomposition UAI 2021
TriBERT: Human-centric Audio-visual Representation Learning NIPS 2021
Person-in-Context Synthesis With Compositional Structural Space WACV 2021
Saliency-Guided Image Translation CVPR 2021
Front2Back: Single View 3D Shape Reconstruction via Front to Back Prediction CVPR 2020
Generating Videos of Zero-Shot Compositions of Actions and Objects ECCV 2020
Improved Few-Shot Visual Classification CVPR 2020
Neural Sequential Phrase Grounding (SeqGROUND) CVPR 2019
Multilevel Language and Vision Integration for Text-to-Clip Retrieval AAAI 2019
A Variational Auto-Encoder Model for Stochastic Point Processes CVPR 2019
Image Generation From Layout CVPR 2019
AttentionRNN: A Structured Spatial Attention Mechanism ICCV 2019
G3raphGround: Graph-Based Language Grounding ICCV 2019
Watch, Listen and Tell: Multi-Modal Weakly Supervised Dense Event Captioning ICCV 2019
LayoutVAE: Stochastic Scene Layout Generation From a Label Set ICCV 2019
Mixture-Kernel Graph Attention Network for Situation Recognition ICCV 2019
Show Me a Story: Towards Coherent Neural Story Illustration CVPR 2018
A Neural Multi-Sequence Alignment TeCHnique (NeuMATCH) CVPR 2018
Middle-Out Decoding NIPS 2018
Probabilistic Video Generation using Holistic Attribute Control ECCV 2018
Modular Generative Adversarial Networks ECCV 2018
Weakly-Supervised Visual Grounding of Phrases With Linguistic Structures CVPR 2017
Visual Reference Resolution using Attention Memory for Visual Dialog NIPS 2017
Non-parametric Structured Output Networks NIPS 2017
Semi-Supervised Vocabulary-Informed Learning CVPR 2016
Learning Activity Progression in LSTMs for Activity Detection and Early Detection CVPR 2016
Harnessing Object and Scene Semantics for Large-Scale Video Understanding CVPR 2016
Ranking and Retrieval of Image Sequences From Multiple Paragraph Queries CVPR 2015
Expanding Object Detector's Horizon: Incremental Learning Framework for Object Detection in Videos CVPR 2015
Joint Photo Stream and Blog Post Summarization and Exploration CVPR 2015
Space-Time Tree Ensemble for Action Recognition CVPR 2015
Storyline Representation of Egocentric Videos With an Applications to Story-Based Search ICCV 2015
Joint Summarization of Large-scale Collections of Web Images and Videos for Storyline Reconstruction CVPR 2014
A Unified Semantic Embedding: Relating Taxonomies and Attributes NIPS 2014
Poselet Key-Framing: A Model for Human Activity Recognition CVPR 2013
From Subcategories to Visual Composites: A Multi-level Framework for Object Detection ICCV 2013
Action is in the Eye of the Beholder: Eye-gaze Driven Model for Spatio-Temporal Action Localization NIPS 2013
Facial Expression Transfer with Input-Output Temporal Restricted Boltzmann Machines NIPS 2011
Combined discriminative and generative articulated pose and non-rigid shape estimation NIPS 2007