Cihang Xie
58 papers · 2017–2026 · 11 top CS/AI conferences
Achievements
Renaissance Researcher (5)
Interdisciplinary Bridge
Conference Polyglot (11)
Academic Marathon (9)
Taxonomy Completionist (79)
Keyword Pioneer
Hot Topic Early Bird
Topic Evolution
Keyword Champion (2)
Dynamic Duo (23)
Triple Crown
Grand Slam
The Questioner (5)
Keyword Collector (185)
Prolific Year (8)
Unstoppable (10)
Conference Pioneer
Century Club (57)
Trend Setter
Conferences
CVPR (17)
ICLR (11)
ECCV (7)
ICCV (7)
NIPS (6)
AAAI (3)
ICML (3)
EACL (1)
EMNLP (1)
NAACL (1)
WACV (1)
Top co-authors
Keywords
adversarial attack (6)
vision transformer (6)
image classification (5)
adversarial example (5)
multimodal learning (5)
convolutional neural network (4)
adversarial robustness (4)
representation learning (4)
black-box attack (3)
adversarial training (3)
foundation model (3)
object detection (3)
visual representation (3)
masked autoencoder (3)
state space model (2)
efficient computing (2)
diffusion model (2)
contrastive learning (2)
feature extraction (2)
model compression (2)
Papers
LVM-Lite: Training Large Vision Models with Efficient Sequential Modeling
WACV 2026
STAR-1: Safer Alignment of Reasoning LLMs with 1K Data
AAAI 2026
Adventurer: Optimizing Vision Mamba Architecture Designs for Efficiency
CVPR 2025
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
ICML 2025
What If We Recaption Billions of Web Images with LLaMA-3?
ICML 2025
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine
ICLR 2025
Autoregressive Pretraining with Mamba in Vision
ICLR 2025
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
ICLR 2025
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning
ICCV 2025
VideoLLaMB: Long Streaming Video Understanding with Recurrent Memory Bridges
ICCV 2025
ViLBench: A Suite for Vision-Language Process Reward Modeling
EMNLP 2025
Mamba-Reg: Vision Mamba Also Needs Registers
CVPR 2025
Generative Image Layer Decomposition with Visual Effects
CVPR 2025
Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training
CVPR 2024
Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning
ICLR 2024
A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties
ECCV 2024
Scaling White-Box Transformers for Vision
NIPS 2024
VHELM: A Holistic Evaluation of Vision Language Models
NIPS 2024
Navigation as Attackers Wish? Towards Building Robust Embodied Agents under Federated Learning
NAACL 2024
How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs
ECCV 2024
Revisiting Adversarial Training at Scale
CVPR 2024
L2B: Learning to Bootstrap Robust Models for Combating Label Noise
CVPR 2024
Rejuvenating image-GPT as Strong Visual Representation Learners
ICML 2024
Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models
EACL 2024
From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation
ECCV 2024
Diffusion Models as Masked Autoencoders
ICCV 2023
An Inverse Scaling Law for CLIP Training
NIPS 2023
Practical Disruption of Image Translation Deepfake Networks
AAAI 2023
Masked Autoencoders Enable Efficient Knowledge Distillers
CVPR 2023
SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-Training
ICCV 2023
DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation
ICCV 2023
Can CNNs Be More Robust Than Transformers?
ICLR 2023
One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks
ICLR 2023
Simulated Adversarial Testing of Face Recognition Models
CVPR 2022
A Simple Data Mixing Prior for Improving Self-Supervised Learning
CVPR 2022
Fast AdvProp
ICLR 2022
Image BERT Pre-training with Online Tokenizer
ICLR 2022
Adversarial Attack on Attackers: Post-Process to Mitigate Black-Box Score-Based Query Attacks
NIPS 2022
Finding Differences Between Transformers and ConvNets Using Counterfactual Simulation Testing
NIPS 2022
In Defense of Image Pre-training for Spatiotemporal Recognition
ECCV 2022
VIP: Unified Certified Detection and Recovery for Patch Attack with Vision Transformers
ECCV 2022
Robust and Accurate Object Detection via Adversarial Learning
CVPR 2021
Are Transformers more robust than CNNs?
NIPS 2021
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images
ICCV 2021
Shape-Texture Debiased Neural Network Training
ICLR 2021
Neural Architecture Search for Lightweight Non-Local Networks
CVPR 2020
Adversarial Examples Improve Image Recognition
CVPR 2020
Universal Physical Camouflage Attacks on Object Detectors
CVPR 2020
Intriguing Properties of Adversarial Training at Scale
ICLR 2020
PatchAttack: A Black-box Texture-based Attack with Reinforcement Learning
ECCV 2020
Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against Defenses
ECCV 2020
Learning Transferable Adversarial Examples via Ghost Networks
AAAI 2020
Feature Denoising for Improving Adversarial Robustness
CVPR 2019
Improving Transferability of Adversarial Examples With Input Diversity
CVPR 2019
Mitigating Adversarial Effects Through Randomization
ICLR 2018
Single-Shot Object Detection With Enriched Semantics
CVPR 2018
DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection Under Partial Occlusion
CVPR 2018
Adversarial Examples for Semantic Segmentation and Object Detection
ICCV 2017