Xiaohua Zhai

31 papers · 2019–2024 · 7 conferences · across top CS/AI conferences

Achievements


🌈 Renaissance Researcher (6) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (5) 🌍 Conference Polyglot (7) 🗺️ Taxonomy Completionist (50)

👥 Mega-Team (43) 👑 Triple Crown 🤝 Dynamic Duo (22) 📈 Trend Setter 🔥 Unstoppable (6) 💎 Century Club (31) 🗃️ Keyword Collector (96) ⚡ Prolific Year (8) ❓ The Questioner

Conferences

NIPS (9) CVPR (8) ECCV (4) ICML (4) ICLR (3) ICCV (2) JMLR (1)

Top co-authors

Lucas Beyer (22) Alexander Kolesnikov (17) Neil Houlsby (14) Xiao Wang (9) Mario Lucic (9) Matthias Minderer (7) Michael Tschannen (7) Ibrahim Alabdulmohsin (6) Basil Mustafa (6) Andreas Steiner (5)

Research topics

Reinforcement Learning (1)

Keywords

image classification (7) vision transformer (5) contrastive learning (5) model scaling (4) self-supervised learning (4) transfer learning (4) vision-language model (4) convolutional neural network (3) image captioning (3) generative adversarial network (3) representation learning (3) domain generalization (2) unsupervised learning (2) zero-shot transfer (2) neural network (2) few-shot learning (2) computer vision (2) semi-supervised learning (2) distribution shift (2) policy optimization (1)

Papers

LocCa: Visual Pretraining with Location-aware Captioners · NIPS 2024
On Scaling Up a Multilingual Vision and Language Model · CVPR 2024
CLIP the Bias: How Useful is Balancing Data in Multimodal Learning? · ICLR 2024
No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models · NIPS 2024
SILC: Improving Vision Language Pretraining with Self-Distillation · ECCV 2024
Scaling Vision Transformers to 22 Billion Parameters · ICML 2023
PaLI: A Jointly-Scaled Multilingual Language-Image Model · ICLR 2023
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design · NIPS 2023
Tuning Computer Vision Models With Task Rewards · ICML 2023
Three Towers: Flexible Contrastive Learning with Pretrained Image Models · NIPS 2023
FlexiViT: One Model for All Patch Sizes · CVPR 2023
Sigmoid Loss for Language Image Pre-Training · ICCV 2023
Image Captioners Are Scalable Vision Learners Too · NIPS 2023
Underspecification Presents Challenges for Credibility in Modern Machine Learning · JMLR 2022
Revisiting Neural Scaling Laws in Language and Vision · NIPS 2022
UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes · NIPS 2022
LiT: Zero-Shot Transfer With Locked-Image Text Tuning · CVPR 2022
Knowledge Distillation: A Good Teacher Is Patient and Consistent · CVPR 2022
Scaling Vision Transformers · CVPR 2022
A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation · ECCV 2022
Simple Open-Vocabulary Object Detection with Vision Transformers · ECCV 2022
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale · ICLR 2021
On Robustness and Transferability of Convolutional Neural Networks · CVPR 2021
MLP-Mixer: An all-MLP Architecture for Vision · NIPS 2021
Revisiting the Calibration of Modern Neural Networks · NIPS 2021
Big Transfer (BiT): General Visual Representation Learning · ECCV 2020
A Large-Scale Study on Regularization and Normalization in GANs · ICML 2019
Self-Supervised GANs via Auxiliary Rotation Loss · CVPR 2019
S4L: Self-Supervised Semi-Supervised Learning · ICCV 2019
High-Fidelity Image Generation With Fewer Labels · ICML 2019
Revisiting Self-Supervised Visual Representation Learning · CVPR 2019