AJ Piergiovanni
24 papers · 2018–2025 · 9 top CS/AI conferences
Achievements
Interdisciplinary Bridge
Academic Marathon (7)
Conference Polyglot (9)
Renaissance Researcher (7)
Taxonomy Completionist (49)
Keyword Pioneer
Hot Topic Early Bird
Dynamic Duo (17)
Grand Slam
Topic Evolution
Keyword Champion (2)
Mega-Team (43)
Keyword Collector (96)
Unstoppable (8)
Trend Setter
Prolific Year (8)
Conference Pioneer
Century Club (24)
Conferences
CVPR (8)
ECCV (5)
ICLR (3)
ICCV (2)
NeurIPS (2)
AAAI (1)
CoRL (1)
ICML (1)
WACV (1)
Keywords
video understanding (4)
action recognition (4)
representation learning (3)
transfer learning (3)
multimodal learning (3)
video classification (2)
self-supervised learning (2)
video representation (2)
vision-language model (2)
activity detection (2)
temporal alignment (2)
activity recognition (2)
convolutional neural network (2)
preference learning (1)
visual question answering (1)
few-shot learning (1)
zero-shot learning (1)
object detection (1)
image captioning (1)
grammar learning (1)
Papers
VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models
CVPR 2025
Mirasol3B: A Multimodal Autoregressive Model for Time-Aligned and Contextual Modalities
CVPR 2024
On Scaling Up a Multilingual Vision and Language Model
CVPR 2024
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning
CVPR 2023
Open-Vocabulary Object Detection upon Frozen Vision and Language Models
ICLR 2023
PaLI: A Jointly-Scaled Multilingual Language-Image Model
ICLR 2023
FindIt: Generalized Localization with Natural Language Queries
ECCV 2022
Video Question Answering with Iterative Video-Text Co-Tokenization
ECCV 2022
Recognizing Actions in Videos From Unseen Viewpoints
CVPR 2021
4D-Net for Learned Multi-Modal Alignment
ICCV 2021
TokenLearner: Adaptive Space-Time Tokenization for Videos
NeurIPS 2021
Learning Multimodal Representations for Unseen Activities
WACV 2020
Differentiable Grammars for Videos
AAAI 2020
Evolving Losses for Unsupervised Video Representation Learning
CVPR 2020
Adversarial Generative Grammars for Human Activity Prediction
ECCV 2020
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification
ECCV 2020
AssembleNet++: Assembling Modality Representations via Attention Connections
ECCV 2020
AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures
ICLR 2020
AViD Dataset: Anonymized Videos from Diverse Countries
NeurIPS 2020
Temporal Gaussian Mixture Layer for Videos
ICML 2019
Model-based Behavioral Cloning with Future Image Similarity Learning
CoRL 2019
Evolving Space-Time Neural Architectures for Videos
ICCV 2019
Representation Flow for Action Recognition
CVPR 2019
Learning Latent Super-Events to Detect Multiple Activities in Videos
CVPR 2018