Haoqi Fan

28 papers · 2016–2025 · 5 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🌍 Conference Polyglot (5) 🏃 Academic Marathon (9) 🗺️ Taxonomy Completionist (52)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🤝 Dynamic Duo (15) 🧬 Topic Evolution ⚡ Prolific Year (7) 💎 Century Club (28) 🔥 Unstoppable (8) 🗃️ Keyword Collector (122)

Conferences

CVPR (15) ICCV (8) NIPS (3) ICLR (1) ICML (1)

Top co-authors

Christoph Feichtenhofer (15) Yanghao Li (9) Jitendra Malik (7) Bo Xiong (6) Kaiming He (6) Karttikeya Mangalam (5) Chao-Yuan Wu (5) Ross Girshick (4) Qinghao Ye (3) Po-Yao Huang (3)

Keywords

vision transformer (7) self-supervised learning (6) video recognition (5) contrastive learning (5) image classification (5) masked autoencoder (4) representation learning (4) video understanding (4) temporal modeling (3) convolutional neural network (3) video classification (3) visual representation (3) unsupervised learning (2) momentum contrast (2) computer vision (2) model scaling (2) transfer learning (2) memory efficiency (2) visual question answering (1) object detection (1)

Papers

LLaVA-Critic: Learning to Evaluate Multimodal Models CVPR 2025
Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning ICLR 2025
Classification Done Right for Vision-Language Pre-Training NIPS 2024
The Effectiveness of MAE Pre-Pretraining for Billion-Scale Pretraining ICCV 2023
Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference CVPR 2023
MAViL: Masked Audio-Video Learners NIPS 2023
Scaling Language-Image Pre-Training via Masking CVPR 2023
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles ICML 2023
Diffusion Models as Masked Autoencoders ICCV 2023
On the Importance of Asymmetry for Siamese Representation Learning CVPR 2022
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition CVPR 2022
Reversible Vision Transformers CVPR 2022
Unified Transformer Tracker for Object Tracking CVPR 2022
Masked Feature Prediction for Self-Supervised Visual Pre-Training CVPR 2022
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection CVPR 2022
Masked Autoencoders As Spatiotemporal Learners NIPS 2022
A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning CVPR 2021
Beyond Short Clips: End-to-End Video-Level Learning With Collaborative Memories CVPR 2021
Multiscale Vision Transformers ICCV 2021
HiT: Hierarchical Transformer With Momentum Contrast for Video-Text Retrieval ICCV 2021
Multiview Pseudo-Labeling for Semi-Supervised Learning From Video ICCV 2021
Momentum Contrast for Unsupervised Visual Representation Learning CVPR 2020
Long-Term Feature Banks for Detailed Video Understanding CVPR 2019
Order-Aware Generative Modeling Using the 3D-Craft Dataset ICCV 2019
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution ICCV 2019
SlowFast Networks for Video Recognition ICCV 2019
Stacked Latent Attention for Multimodal Reasoning CVPR 2018
Going Deeper into First-Person Activity Recognition CVPR 2016