Leonid Karlinsky
53 papers · 2010–2025 · 10 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (11)
Renaissance Researcher (6)
Interdisciplinary Bridge
Conference Polyglot (10)
Academic Marathon (15)
Keyword Trendsetter Combo (3)
Topic Evolution
Deep Specialist (15)
Keyword Champion (2)
Dynamic Duo (37)
Keyword Collector (183)
The Questioner
Trend Setter
Unstoppable (9)
Prolific Year (6)
Century Club (53)
Conferences
NIPS (13)
CVPR (11)
ICCV (7)
ICLR (7)
ECCV (6)
INTERSPEECH (3)
ACL (2)
EMNLP (2)
AAAI (1)
WACV (1)
Keywords
few-shot learning (9)
transfer learning (9)
vision language model (6)
self-supervised learning (6)
vision-language model (6)
zero-shot learning (6)
synthetic data (5)
action recognition (4)
multimodal learning (4)
contrastive learning (4)
compositional reasoning (3)
weakly supervised learning (3)
large language model (3)
domain adaptation (3)
representation learning (3)
domain generalization (2)
image classification (2)
one-shot learning (2)
vision transformer (2)
object recognition (2)
Papers
REAL-MM-RAG: A Real-World Multi-Modal Retrieval Benchmark
ACL 2025
LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content
ICLR 2025
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
ICLR 2025
Teaching VLMs to Localize Specific Objects from In-context Examples
ICCV 2025
Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features
ICCV 2025
BATCLIP: Bimodal Online Test-Time Adaptation for CLIP
ICCV 2025
Sample- and Parameter-Efficient Auto-Regressive Image Models
CVPR 2025
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment
CVPR 2025
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers Using Synthetic Scene Data
WACV 2024
Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning
NIPS 2024
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs
NIPS 2024
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning
NIPS 2024
Self-Specialization: Uncovering Latent Expertise within Large Language Models
ACL 2024
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
ECCV 2024
NumeroLogic: Number Encoding for Enhanced LLMs' Numerical Reasoning
EMNLP 2024
Listen, Think, and Understand
ICLR 2024
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation
INTERSPEECH 2024
Learning to Grow Pretrained Models for Efficient Transformer Training
ICLR 2023
CODA-Prompt: COntinual Decomposed Attention-Based Prompting for Rehearsal-Free Continual Learning
CVPR 2023
ConStruct-VL: Data-Free Continual Structured VL Concepts Learning
CVPR 2023
Teaching Structured Vision & Language Concepts to Vision & Language Models
CVPR 2023
LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections
NIPS 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages
INTERSPEECH 2023
Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs
EMNLP 2023
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers
INTERSPEECH 2023
Going Beyond Nouns With Vision & Language Models Using Synthetic Data
ICCV 2023
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge
ICCV 2023
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning
ICLR 2023
Contrastive Audio-Visual Masked Autoencoder
ICLR 2023
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models
NIPS 2023
Learning Human Action Recognition Representations Without Real Humans
NIPS 2023
Unsupervised Domain Generalization by Learning a Bridge Across Domains
CVPR 2022
Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens
NIPS 2022
Self-Supervised Classification Network
ECCV 2022
FETA: Towards Specializing Foundational Models for Expert Task Applications
NIPS 2022
Task2Sim: Towards Effective Pre-Training and Transfer From Synthetic Data
CVPR 2022
How Transferable are Video Representations Based on Synthetic Data?
NIPS 2022
A Broad Study on the Transferability of Visual Representations With Contrastive Learning
ICCV 2021
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition
ICLR 2021
StarNet: towards Weakly Supervised Few-Shot Object Detection
AAAI 2021
Fine-Grained Angular Contrastive Learning With Coarse Labels
CVPR 2021
Detector-Free Weakly Supervised Grounding by Separation
ICCV 2021
Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data
NIPS 2021
AR-Net: Adaptive Frame Resolution for Efficient Action Recognition
ECCV 2020
OnlineAugment: Online Data Augmentation with Less Domain Knowledge
ECCV 2020
TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification
ECCV 2020
A Broader Study of Cross-Domain Few-Shot Learning
ECCV 2020
RepMet: Representative-Based Metric Learning for Classification and Few-Shot Object Detection
CVPR 2019
LaSO: Label-Set Operations Networks for Multi-Label Few-Shot Learning
CVPR 2019
Co-regularized Alignment for Unsupervised Domain Adaptation
NIPS 2018
Delta-encoder: an effective sample synthesis method for few-shot object recognition
NIPS 2018
Fine-Grained Recognition of Thousands of Object Categories With Single-Example Training
CVPR 2017
Using body-anchored priors for identifying actions in single images
NIPS 2010