Xinlong Wang

40 papers · 2018–2025 · 8 top CS/AI conferences

Achievements

🌍 Conference Polyglot (8) 🏃 Academic Marathon (7) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (10) 🌈 Renaissance Researcher (5) 🗺️ Taxonomy Completionist (61) 🧬 Topic Evolution 👑 Triple Crown 🤝 Dynamic Duo (20) 🏆 Grand Slam 📈 Trend Setter 🗃️ Keyword Collector (137) 🔥 Unstoppable (8) 💎 Century Club (40) ⚡ Prolific Year (5)

Conferences

CVPR (13) ICLR (7) NIPS (6) ECCV (5) AAAI (3) ICCV (3) ACL (2) ICML (1)

Top co-authors

Chunhua Shen (20) Tiejun Huang (8) Yueze Wang (6) Yufeng Cui (6) Fan Zhang (5) Quan Sun (5) Lei Li (4) Tao Kong (4) Haiwen Diao (4) Zhi Tian (4)

Research topics

Core AI (1)

Keywords

object detection (7) semantic segmentation (7) instance segmentation (7) in-context learning (5) knowledge distillation (4) self-supervised learning (4) few-shot learning (4) image segmentation (3) vision-language model (3) zero-shot learning (3) mask prediction (3) multimodal learning (3) multi-modal learning (3) image generation (2) foundation model (2) visual prompting (2) weakly supervised learning (2) video understanding (2) object localization (2) convolutional neural network (2)

Papers

Autoregressive Video Generation without Vector Quantization · ICLR 2025
JudgeLM: Fine-tuned Large Language Models are Scalable Judges · ICLR 2025
EVEv2: Improved Baselines for Encoder-Free Vision-Language Models · ICCV 2025
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale · CVPR 2025
Diffusion Feedback Helps CLIP See Better · ICLR 2025
Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions · ACL 2024
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model · ICML 2024
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception · NIPS 2024
A Simple Image Segmentation Framework via In-Context Examples · NIPS 2024
Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation · NIPS 2024
Unveiling Encoder-Free Vision-Language Models · NIPS 2024
Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation · CVPR 2024
Generative Multimodal Models are In-Context Learners · CVPR 2024
CapsFusion: Rethinking Image-Text Data at Scale · CVPR 2024
Tokenize Anything via Prompting · ECCV 2024
Region-Native Visual Tokenization · ECCV 2024
Emu: Generative Pretraining in Multimodality · ICLR 2024
Uni3D: Exploring Unified 3D Representation at Scale · ICLR 2024
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching · ICLR 2024
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale · CVPR 2023
Conditional Positional Encodings for Vision Transformers · ICLR 2023
Fine-Grained Visual Prompting · NIPS 2023
Towards Better Entity Linking with Multi-View Enhanced Distillation · ACL 2023
SegGPT: Towards Segmenting Everything in Context · ICCV 2023
Affective Image Filter: Reflecting Emotions from Text to Images · ICCV 2023
Point-Teaching: Weakly Semi-supervised Object Detection with Point Annotations · AAAI 2023
Images Speak in Images: A Generalist Painter for In-Context Visual Learning · CVPR 2023
Poseur: Direct Human Pose Regression with Transformers · ECCV 2022
FreeSOLO: Learning To Segment Objects Without Annotations · CVPR 2022
FCPose: Fully Convolutional Multi-Person Pose Estimation With Dynamic Instance-Aware Convolutions · CVPR 2021
Diverse Knowledge Distillation for End-to-End Person Search · AAAI 2021
Dense Contrastive Learning for Self-Supervised Visual Pre-Training · CVPR 2021
BoxInst: High-Performance Instance Segmentation With Box Annotations · CVPR 2021
End-to-End Video Instance Segmentation With Transformers · CVPR 2021
SOLOv2: Dynamic and Fast Instance Segmentation · NIPS 2020
Task-Aware Monocular Depth Estimation for 3D Object Detection · AAAI 2020
SOLO: Segmenting Objects by Locations · ECCV 2020
Instance-Aware Embedding for Point Cloud Instance Segmentation · ECCV 2020
Associatively Segmenting Instances and Semantics in Point Clouds · CVPR 2019
Repulsion Loss: Detecting Pedestrians in a Crowd · CVPR 2018