Shilong Liu
30 papers · 2021–2026 · 8 conferences · across top CS/AI conferences
Achievements
Cross-Pollinator (14)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (8)
Academic Marathon (5)
Renaissance Researcher (6)
Taxonomy Completionist (36)
Topic Evolution
Dynamic Duo (22)
Prolific Year (12)
Century Club (28)
Keyword Collector (99)
Unstoppable (5)
Conferences
CVPR (8)
ECCV (6)
ICLR (5)
ICCV (4)
AAAI (3)
NeurIPS (2)
ACL (1)
EMNLP (1)
Top co-authors
Keywords
object detection (6)
semantic segmentation (4)
instance segmentation (3)
transformer architecture (3)
deformable attention (3)
image segmentation (3)
multimodal learning (2)
visual grounding (2)
panoptic segmentation (2)
detection transformer (2)
open-vocabulary segmentation (2)
agent system (2)
information retrieval (1)
pose estimation (1)
in-context learning (1)
attention mechanism (1)
transfer learning (1)
self-supervised learning (1)
multi-view fusion (1)
visual reasoning (1)
Papers
AMS-IO-Bench and AMS-IO-Agent: Benchmarking and Structured Reasoning for Analog and Mixed-Signal Integrated Circuit Input/Output Design
AAAI 2026
SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D Features
AAAI 2026
Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought
CVPR 2025
CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents
ACL 2025
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
ECCV 2024
TAPTRv2: Attention-based Position Update Improves Tracking Any Point
NeurIPS 2024
Visual In-Context Prompting
CVPR 2024
TAPTR: Tracking Any Point with Transformers as Detection
ECCV 2024
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
ECCV 2024
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
ECCV 2024
Interfacing Foundation Models' Embeddings
NeurIPS 2024
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
ECCV 2024
Segment and Recognize Anything at Any Granularity
ECCV 2024
MMedAgent: Learning to Use Medical Tools with Multi-modal Agent
EMNLP 2024
TOSS: High-quality Text-guided Novel View Synthesis from a Single Image
ICLR 2024
InstructPix2NeRF: Instructed 3D Portrait Editing from a Single Image
ICLR 2024
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
ICLR 2023
Detection Transformer with Stable Matching
ICCV 2023
Neural Interactive Keypoint Detection
ICCV 2023
Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation
ICLR 2023
DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding
AAAI 2023
A Simple Framework for Open-Vocabulary Segmentation and Detection
ICCV 2023
Mask DINO: Towards a Unified Transformer-Based Framework for Object Detection and Segmentation
CVPR 2023
PREIM3D: 3D Consistent Precise Image Attribute Editing From a Single Image
CVPR 2023
MP-Former: Mask-Piloted Transformer for Image Segmentation
CVPR 2023
Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR
CVPR 2023
DFA3D: 3D Deformable Attention For 2D-to-3D Feature Lifting
ICCV 2023
DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
ICLR 2022
DN-DETR: Accelerate DETR Training by Introducing Query DeNoising
CVPR 2022
Unsupervised Part Segmentation Through Disentangling Appearance and Shape
CVPR 2021