Rogerio Feris
73 papers · 2013–2025 · 12 top CS/AI conferences
Achievements
🐝 Cross-Pollinator (11)
🏃 Academic Marathon (12)
🌍 Conference Polyglot (12)
🌉 Interdisciplinary Bridge
🌈 Renaissance Researcher (7)
🐣 Hot Topic Early Bird
🌟 Keyword Trendsetter Combo (4)
🏠 Conference Loyalist (21)
🏆 Keyword Champion (6)
🏆 Grand Slam
🧬 Topic Evolution
🔬 Deep Specialist (19)
🤝 Dynamic Duo (37)
🗃️ Keyword Collector (252)
❓ The Questioner (2)
🚀 Conference Pioneer
💎 Century Club (73)
📈 Trend Setter
🔥 Unstoppable (9)
⚡ Prolific Year (16)
Conferences
CVPR (21)
NIPS (14)
ICCV (11)
ECCV (7)
ICLR (6)
INTERSPEECH (4)
WACV (3)
AAAI (2)
ACL (2)
EMNLP (1)
ICML (1)
NAACL (1)
Keywords
transfer learning (11)
self-supervised learning (10)
few-shot learning (7)
contrastive learning (7)
neural network (7)
zero-shot learning (7)
vision language model (6)
multimodal learning (6)
synthetic data (6)
video understanding (5)
representation learning (5)
domain adaptation (5)
vision-language model (5)
video retrieval (4)
knowledge distillation (4)
action recognition (4)
video recognition (3)
image classification (3)
image retrieval (3)
convolutional neural network (3)
Papers
Teaching VLMs to Localize Specific Objects from In-context Examples
ICCV 2025
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment
CVPR 2025
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
ICLR 2025
M+: Extending MemoryLLM with Scalable Long-Term Memory
ICML 2025
BATCLIP: Bimodal Online Test-Time Adaptation for CLIP
ICCV 2025
Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features
ICCV 2025
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs
NIPS 2024
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning
NIPS 2024
Self-Specialization: Uncovering Latent Expertise within Large Language Models
ACL 2024
LangNav: Language as a Perceptual Representation for Navigation
NAACL 2024
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation
INTERSPEECH 2024
Improved Techniques for Quantizing Deep Networks With Adaptive Bit-Widths
WACV 2024
What When and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
CVPR 2024
CDAC: Cross-domain Attention Consistency in Transformer for Domain Adaptive Semantic Segmentation
ICCV 2023
Addressing Feature Suppression in Unsupervised Visual Representations
WACV 2023
CODA-Prompt: COntinual Decomposed Attention-Based Prompting for Rehearsal-Free Continual Learning
CVPR 2023
ConStruct-VL: Data-Free Continual Structured VL Concepts Learning
CVPR 2023
Teaching Structured Vision & Language Concepts to Vision & Language Models
CVPR 2023
Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs
EMNLP 2023
Going Beyond Nouns With Vision & Language Models Using Synthetic Data
ICCV 2023
Learning to Grow Pretrained Models for Efficient Transformer Training
ICLR 2023
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning
ICLR 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages
INTERSPEECH 2023
Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation
WACV 2023
Learning Human Action Recognition Representations Without Real Humans
NIPS 2023
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models
NIPS 2023
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge
ICCV 2023
Synthetic Pre-Training Tasks for Neural Machine Translation
ACL 2023
LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections
NIPS 2023
Procedural Image Programs for Representation Learning
NIPS 2022
FETA: Towards Specializing Foundational Models for Expert Task Applications
NIPS 2022
How Transferable are Video Representations Based on Synthetic Data?
NIPS 2022
Dynamic Network Quantization for Efficient Video Inference
ICCV 2021
StarNet: towards Weakly Supervised Few-Shot Object Detection
AAAI 2021
NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search
AAAI 2021
A Broad Study on the Transferability of Visual Representations With Contrastive Learning
ICCV 2021
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition
ICCV 2021
Cascaded Multilingual Audio-Visual Learning from Videos
INTERSPEECH 2021
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
INTERSPEECH 2021
Detector-Free Weakly Supervised Grounding by Separation
ICCV 2021
Separating Skills and Concepts for Novel Visual Question Answering
CVPR 2021
Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback
CVPR 2021
Deep Analysis of CNN-Based Spatio-Temporal Representations for Action Recognition
CVPR 2021
Semi-Supervised Action Recognition With Temporal Contrastive Learning
CVPR 2021
Fine-Grained Angular Contrastive Learning With Coarse Labels
CVPR 2021
Spoken Moments: Learning Joint Audio-Visual Representations From Video Descriptions
CVPR 2021
Multimodal Clustering Networks for Self-Supervised Learning From Unlabeled Videos
ICCV 2021
Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data
NIPS 2021
IA-RED²: Interpretability-Aware Redundancy Reduction for Vision Transformers
NIPS 2021
VA-RED²: Video Adaptive Redundancy Reduction
ICLR 2021
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition
ICLR 2021
A Broader Study of Cross-Domain Few-Shot Learning
ECCV 2020
AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning
NIPS 2020
Video Instance Segmentation Tracking With a Modified VAE Architecture
CVPR 2020
Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation
CVPR 2020
AR-Net: Adaptive Frame Resolution for Efficient Action Recognition
ECCV 2020
OnlineAugment: Online Data Augmentation with Less Domain Knowledge
ECCV 2020
TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification
ECCV 2020
We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos
ECCV 2020
RepMet: Representative-Based Metric Learning for Classification and Few-Shot Object Detection
CVPR 2019
SpotTune: Transfer Learning Through Adaptive Fine-Tuning
CVPR 2019
LaSO: Label-Set Operations Networks for Multi-Label Few-Shot Learning
CVPR 2019
Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition
ICLR 2019
BlockDrop: Dynamic Inference Paths in Residual Networks
CVPR 2018
Co-regularized Alignment for Unsupervised Domain Adaptation
NIPS 2018
Delta-encoder: an effective sample synthesis method for few-shot object recognition
NIPS 2018
Learning to Separate Object Sounds by Watching Unlabeled Video
ECCV 2018
Revisiting RCNN: On Awakening the Classification Power of Faster RCNN
ECCV 2018
Dialog-based Interactive Image Retrieval
NIPS 2018
Fully-Adaptive Feature Sharing in Multi-Task Networks With Applications in Person Attribute Classification
CVPR 2017
S3Pool: Pooling With Stochastic Spatial Sampling
CVPR 2017
Deep Domain Adaptation for Describing People Based on Fine-Grained Clothing Attributes
CVPR 2015
Efficient Maximum Appearance Search for Large-Scale Object Detection
CVPR 2013