Kate Saenko

124 papers · 2008–2025 · across 15 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (16) 🧭 Keyword Pioneer 🌈 Renaissance Researcher (6) 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird

🏃 Academic Marathon (17) 🏠 Conference Loyalist (29) 🌟 Keyword Trendsetter Combo (11) 🤝 Dynamic Duo (32) 👑 Triple Crown 🌱 Topic Pioneer 🏆 Grand Slam 👥 Mega-Team (20) 🔬 Deep Specialist (27) 🧬 Topic Evolution 🏆 Keyword Champion (2) 🗃️ Keyword Collector (418) ⚡ Prolific Year (19) 💎 Century Club (124) ❓ The Questioner (4) 📈 Trend Setter 🚀 Conference Pioneer 🔥 Unstoppable (13)

Conferences

CVPR (29) ICCV (24) NIPS (19) ECCV (16) ICLR (10) WACV (9) EMNLP (5) ACL (3) AAAI (2) ICML (2) COLING (1) CORL (1) JMLR (1) NAACL (1) RSS (1)

Top co-authors

Trevor Darrell (32) Bryan A. Plummer (18) Kuniaki Saito (15) Rogerio Feris (15) Rameswar Panda (14) Stan Sclaroff (14) Judy Hoffman (12) Donghyun Kim (10) Ximeng Sun (9) Marcus Rohrbach (9)

Keywords

domain adaptation (17) transfer learning (13) multimodal learning (11) zero-shot learning (10) object detection (10) unsupervised learning (9) contrastive learning (9) vision-language model (8) representation learning (8) image captioning (7) semi-supervised learning (6) weakly supervised learning (6) self-supervised learning (6) visual grounding (5) adversarial learning (5) convolutional neural network (5) video understanding (5) action recognition (5) semantic segmentation (4) image translation (4)

Papers

KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models ICLR 2025
ERM++: An Improved Baseline for Domain Generalization WACV 2025
Scaling Up Temporal Domain Generalization via Temporal Experts Averaging EMNLP 2025
SPARC: Score Prompting and Adaptive Fusion for Zero-Shot Multi-Label Recognition in Vision-Language Models CVPR 2025
Web Artifact Attacks Disrupt Vision Language Models ICCV 2025
Is Large-scale Pretraining the Secret to Good Domain Generalization? ICLR 2025
Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models ECCV 2024
Tell Me What’s Next: Textual Foresight for Generic UI Representations ACL 2024
Improved Techniques for Quantizing Deep Networks With Adaptive Bit-Widths WACV 2024
Learning To Compose SuperWeights for Neural Parameter Allocation Search WACV 2024
From Fake to Real: Pretraining on Balanced Synthetic Images to Prevent Spurious Correlations in Image Recognition ECCV 2024
Koala: Key Frame-Conditioned Long Video-LLM CVPR 2024
RIFT: Disentangled Unsupervised Image Translation via Restricted Information Flow WACV 2023
A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding EMNLP 2023
Cola: A Benchmark for Compositional Text-to-image Retrieval NIPS 2023
DIME-FM: DIstilling Multimodal and Efficient Foundation Models ICCV 2023
Prefix Conditioning Unifies Language and Label Supervision CVPR 2023
Language-Guided Audio-Visual Source Separation via Trimodal Consistency CVPR 2023
Bias Mimicking: A Simple Sampling Approach for Bias Mitigation CVPR 2023
Pic2Word: Mapping Pictures to Words for Zero-Shot Composed Image Retrieval CVPR 2023
MaskSketch: Unpaired Structure-Guided Masked Image Generation CVPR 2023
Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation WACV 2023
Neural Parameter Allocation Search ICLR 2022
NewsStories: Illustrating Articles with Visual Summaries ECCV 2022
ZeroWaste Dataset: Towards Deformable Object Segmentation in Cluttered Scenes CVPR 2022
Task2Sim: Towards Effective Pre-Training and Transfer From Synthetic Data CVPR 2022
Unsupervised Domain Generalization by Learning a Bridge Across Domains CVPR 2022
Many-to-Many Splatting for Efficient Video Frame Interpolation CVPR 2022
MetaPose: Fast 3D Pose From Multiple Views Without 3D Supervision CVPR 2022
Learning to Detect Every Thing in an Open World ECCV 2022
The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning ECCV 2022
A Broad Study of Pre-training for Domain Generalization and Adaptation ECCV 2022
A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility ECCV 2022
Finding Differences Between Transformers and ConvNets Using Counterfactual Simulation Testing NIPS 2022
FETA: Towards Specializing Foundational Models for Expert Task Applications NIPS 2022
DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations NIPS 2022
How Transferable are Video Representations Based on Synthetic Data? NIPS 2022
A Unified Framework for Domain Adaptive Pose Estimation ECCV 2022
Evaluation of Correctness in Unsupervised Many-to-Many Image Translation WACV 2022
Extending the WILDS Benchmark for Unsupervised Adaptation ICLR 2022
Multi-Critic Actor Learning: Teaching RL Policies to Act with Style ICLR 2022
OpenMatch: Open-Set Semi-supervised Learning with Open-set Consistency Regularization NIPS 2021
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition ICLR 2021
VA-RED$^2$: Video Adaptive Redundancy Reduction ICLR 2021
Dynamic Network Quantization for Efficient Video Inference ICCV 2021
Tune It the Right Way: Unsupervised Validation of Domain Adaptation via Soft Neighborhood Density ICCV 2021
Learning Cross-Modal Contrastive Features for Video Domain Adaptation ICCV 2021
CDS: Cross-Domain Self-Supervised Pre-Training ICCV 2021
Temporal Action Detection With Multi-Level Supervision ICCV 2021
OVANet: One-vs-All Network for Universal Domain Adaptation ICCV 2021
Active Domain Adaptation via Clustering Uncertainty-Weighted Embeddings ICCV 2021
Separating Skills and Concepts for Novel Visual Question Answering CVPR 2021
Semi-Supervised Action Recognition With Temporal Contrastive Learning CVPR 2021
Fine-Grained Angular Contrastive Learning With Coarse Labels CVPR 2021
Black-Box Explanation of Object Detectors via Saliency Maps CVPR 2021
Detector-Free Weakly Supervised Grounding by Separation ICCV 2021
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition ICCV 2021
Look at What I’m Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos NIPS 2021
LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval WACV 2021
Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing NIPS 2021
A Broader Study of Cross-Domain Few-Shot Learning ECCV 2020
Auxiliary Task Reweighting for Minimum-data Learning NIPS 2020
AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning NIPS 2020
Universal Domain Adaptation through Self Supervision NIPS 2020
Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment NIPS 2020
Uncertainty-Aware Learning for Zero-Shot Semantic Segmentation NIPS 2020
MULE: Multimodal Universal Language Embedding AAAI 2020
COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder ECCV 2020
Learning to Scale Multilingual Representations for Vision-Language Tasks ECCV 2020
Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation ECCV 2020
AR-Net: Adaptive Frame Resolution for Efficient Action Recognition ECCV 2020
Why do These Match? Explaining the Behavior of Image Similarity Models ECCV 2020
Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News EMNLP 2020
Federated Adversarial Domain Adaptation ICLR 2020
DIPNet: Dynamic Identity Propagation Network for Video Object Segmentation WACV 2020
TwoStreamVAN: Improving Motion Modeling in Video Generation WACV 2020
Domain Agnostic Learning with Disentangled Representations ICML 2019
Semi-Supervised Domain Adaptation via Minimax Entropy ICCV 2019
PuppetGAN: Cross-Domain Image Manipulation by Demonstration ICCV 2019
Language-Conditioned Graph Networks for Relational Reasoning ICCV 2019
Learning Similarity Conditions Without Explicit Supervision ICCV 2019
Strong-Weak Distribution Alignment for Adaptive Object Detection CVPR 2019
Adversarial Self-Defense for Cycle-Consistent GANs NIPS 2019
Multilevel Language and Vision Integration for Text-to-Clip Retrieval AAAI 2019
Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation ACL 2019
Learning Multi-Level Hierarchies with Hindsight ICLR 2019
Moment Matching for Multi-Source Domain Adaptation ICCV 2019
Language Features Matter: Effective Language Representations for Vision-Language Tasks ICCV 2019
Speaker-Follower Models for Vision-and-Language Navigation NIPS 2018
CyCADA: Cycle-Consistent Adversarial Domain Adaptation ICML 2018
Women also Snowboard: Overcoming Bias in Captioning Models ECCV 2018
Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning CVPR 2018
Adversarial Dropout Regularization ICLR 2018
Object Hallucination in Image Captioning EMNLP 2018
Explainable Neural Computation via Stack Neural Module Networks ECCV 2018
Adversarial Discriminative Domain Adaptation CVPR 2017
Learning to Reason: End-To-End Module Networks for Visual Question Answering ICCV 2017
R-C3D: Region Convolutional 3D Network for Temporal Activity Detection ICCV 2017
Top-Down Visual Saliency Guided by Captions CVPR 2017
Captioning Images With Diverse Objects CVPR 2017
Modeling Relationships in Referential Expressions With Compositional Modular Networks CVPR 2017
Learning a visuomotor controller for real world robotic grasping using simulated depth images CORL 2017
MUTT: Metric Unit TesTing for Language Generation Tasks ACL 2016
Deep Compositional Captioning: Describing Novel Object Categories Without Paired Training Data CVPR 2016
Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text EMNLP 2016
Large Scale Visual Recognition through Adaptation using Joint Representation and Multiple Instance Learning JMLR 2016
Natural Language Object Retrieval CVPR 2016
Detector Discovery in the Wild: Joint Multiple Instance and Representation Learning CVPR 2015
Long-Term Recurrent Convolutional Networks for Visual Recognition and Description CVPR 2015
Translating Videos to Natural Language Using Deep Recurrent Neural Networks NAACL 2015
Sequence to Sequence - Video to Text ICCV 2015
Simultaneous Deep Transfer Across Domains and Tasks ICCV 2015
Spatial Semantic Regularisation for Large Scale Object Detection ICCV 2015
Learning Deep Object Detectors From 3D Models ICCV 2015
LSDA: Large Scale Detection through Adaptation NIPS 2014
Confidence-Rated Multiple Instance Boosting for Object Detection CVPR 2014
Integrating Language and Vision to Generate Natural Language Descriptions of Videos in the Wild COLING 2014
Continuous Manifold Based Adaptation for Evolving Visual Domains CVPR 2014
Open-vocabulary Object Retrieval RSS 2014
YouTube2Text: Recognizing and Describing Arbitrary Activities Using Semantic Hierarchies and Zero-Shot Recognition ICCV 2013
Semi-supervised Domain Adaptation with Instance Constraints CVPR 2013
Size Matters: Metric Visual Search Constraints from Monocular Metadata NIPS 2010
Filtering Abstract Senses From Image Search Results NIPS 2009
Unsupervised Learning of Visual Sense Models for Polysemous Words NIPS 2008