Leonid Karlinsky

53 papers · 2010–2025 · 10 conferences · across top CS/AI conferences

Achievements


🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (11) 🌈 Renaissance Researcher (6) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (10) 🏃 Academic Marathon (15) 🌟 Keyword Trendsetter Combo (3) 🧬 Topic Evolution 🔬 Deep Specialist (15) 🏆 Keyword Champion (2) 🤝 Dynamic Duo (37) 🗃️ Keyword Collector (183) ❓ The Questioner 📈 Trend Setter 🔥 Unstoppable (9) ⚡ Prolific Year (6) 💎 Century Club (53)

Conferences

NIPS (13) CVPR (11) ICCV (7) ICLR (7) ECCV (6) INTERSPEECH (3) ACL (2) EMNLP (2) AAAI (1) WACV (1)

Top co-authors

Rogerio Feris (37) Rameswar Panda (18) Assaf Arbelle (15) Hilde Kuehne (12) Raja Giryes (12) Eli Schwartz (11) Sivan Doveh (10) Roei Herzig (10) Kate Saenko (9) Sivan Harary (8)

Keywords

few-shot learning (9) transfer learning (9) vision language model (6) self-supervised learning (6) vision-language model (6) zero-shot learning (6) synthetic data (5) action recognition (4) multimodal learning (4) contrastive learning (4) compositional reasoning (3) weakly supervised learning (3) large language model (3) domain adaptation (3) representation learning (3) domain generalization (2) image classification (2) one-shot learning (2) vision transformer (2) object recognition (2)

Papers

REAL-MM-RAG: A Real-World Multi-Modal Retrieval Benchmark · ACL 2025
LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content · ICLR 2025
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts · ICLR 2025
Teaching VLMs to Localize Specific Objects from In-context Examples · ICCV 2025
Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features · ICCV 2025
BATCLIP: Bimodal Online Test-Time Adaptation for CLIP · ICCV 2025
Sample- and Parameter-Efficient Auto-Regressive Image Models · CVPR 2025
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment · CVPR 2025
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers Using Synthetic Scene Data · WACV 2024
Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning · NIPS 2024
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs · NIPS 2024
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning · NIPS 2024
Self-Specialization: Uncovering Latent Expertise within Large Language Models · ACL 2024
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs · ECCV 2024
NumeroLogic: Number Encoding for Enhanced LLMs’ Numerical Reasoning · EMNLP 2024
Listen, Think, and Understand · ICLR 2024
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation · INTERSPEECH 2024
Learning to Grow Pretrained Models for Efficient Transformer Training · ICLR 2023
CODA-Prompt: COntinual Decomposed Attention-Based Prompting for Rehearsal-Free Continual Learning · CVPR 2023
ConStruct-VL: Data-Free Continual Structured VL Concepts Learning · CVPR 2023
Teaching Structured Vision & Language Concepts to Vision & Language Models · CVPR 2023
LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections · NIPS 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages · INTERSPEECH 2023
Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs · EMNLP 2023
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers · INTERSPEECH 2023
Going Beyond Nouns With Vision & Language Models Using Synthetic Data · ICCV 2023
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge · ICCV 2023
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning · ICLR 2023
Contrastive Audio-Visual Masked Autoencoder · ICLR 2023
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models · NIPS 2023
Learning Human Action Recognition Representations Without Real Humans · NIPS 2023
Unsupervised Domain Generalization by Learning a Bridge Across Domains · CVPR 2022
Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens · NIPS 2022
Self-Supervised Classification Network · ECCV 2022
FETA: Towards Specializing Foundational Models for Expert Task Applications · NIPS 2022
Task2Sim: Towards Effective Pre-Training and Transfer From Synthetic Data · CVPR 2022
How Transferable are Video Representations Based on Synthetic Data? · NIPS 2022
A Broad Study on the Transferability of Visual Representations With Contrastive Learning · ICCV 2021
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition · ICLR 2021
StarNet: towards Weakly Supervised Few-Shot Object Detection · AAAI 2021
Fine-Grained Angular Contrastive Learning With Coarse Labels · CVPR 2021
Detector-Free Weakly Supervised Grounding by Separation · ICCV 2021
Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data · NIPS 2021
AR-Net: Adaptive Frame Resolution for Efficient Action Recognition · ECCV 2020
OnlineAugment: Online Data Augmentation with Less Domain Knowledge · ECCV 2020
TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification · ECCV 2020
A Broader Study of Cross-Domain Few-Shot Learning · ECCV 2020
RepMet: Representative-Based Metric Learning for Classification and Few-Shot Object Detection · CVPR 2019
LaSO: Label-Set Operations Networks for Multi-Label Few-Shot Learning · CVPR 2019
Co-regularized Alignment for Unsupervised Domain Adaptation · NIPS 2018
Delta-encoder: an effective sample synthesis method for few-shot object recognition · NIPS 2018
Fine-Grained Recognition of Thousands of Object Categories With Single-Example Training · CVPR 2017
Using body-anchored priors for identifying actions in single images · NIPS 2010