Zequn Jie
37 papers · 2016–2026 · 8 conferences · across top CS/AI conferences
Achievements
🌍 Conference Polyglot (8)
🐣 Hot Topic Early Bird
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🏃 Academic Marathon (9)
🐝 Cross-Pollinator (9)
🤝 Dynamic Duo (16)
🧬 Topic Evolution
💎 Century Club (36)
🚀 Conference Pioneer
📈 Trend Setter
🗃️ Keyword Collector (173)
⚡ Prolific Year (8)
🔥 Unstoppable (5)
Conferences
CVPR (14)
ECCV (7)
ICCV (5)
AAAI (4)
NIPS (4)
EMNLP (1)
ICML (1)
IJCAI (1)
Keywords
semantic segmentation (5)
convolutional neural network (5)
depth estimation (5)
object detection (4)
autonomous driving (4)
object localization (4)
reinforcement learning (3)
weakly supervised learning (3)
bird eye view (2)
large multimodal model (2)
bounding box (2)
vision-language model (2)
temporal localization (2)
segment anything model (2)
instance segmentation (2)
visual grounding (2)
3d object detection (2)
image segmentation (2)
video understanding (2)
video localization (2)
Papers
X-SAM: From Segment Anything to Any Segmentation
AAAI 2026
CLIP-GS: Unifying Vision-Language Representation with 3D Gaussian Splatting
ICCV 2025
RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving
ICCV 2025
AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning
CVPR 2024
Lumen: Unleashing Versatile Vision-Centric Capabilities of Large Multimodal Models
NIPS 2024
Instance-Aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning
AAAI 2024
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset
CVPR 2024
Investigating Compositional Challenges in Vision-Language Models for Visual Grounding
CVPR 2024
Making Large Language Models Better Planners with Reasoning-Decision Alignment
ECCV 2024
3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance
ECCV 2024
AeDet: Azimuth-Invariant Multi-View 3D Object Detection
CVPR 2023
MSMDFusion: Fusing LiDAR and Camera at Multiple Scales With Multi-Depth Seeds for 3D Object Detection
CVPR 2023
Curriculum Multi-Negative Augmentation for Debiased Video Grounding
AAAI 2023
Expansion and Shrinkage of Localization for Weakly-Supervised Semantic Segmentation
NIPS 2022
MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes
ECCV 2022
PromptDet: Towards Open-Vocabulary Detection Using Uncurated Images
ECCV 2022
Central Similarity Quantization for Efficient Image and Video Retrieval
CVPR 2020
MTL-NAS: Task-Agnostic Neural Architecture Search Towards General-Purpose Multi-Task Learning
CVPR 2020
NMS by Representative Region: Towards Crowded Pedestrian Detection by Proposal Pairing
CVPR 2020
Localizing Natural Language in Videos
AAAI 2019
Geometry-Aware Distillation for Indoor Semantic Segmentation
CVPR 2019
A Sufficient Condition for Convergences of Adam and RMSProp
CVPR 2019
Left-Right Comparative Recurrent Model for Stereo Matching
CVPR 2018
Policy Optimization with Demonstrations
ICML 2018
Modeling Varying Camera-IMU Time Offset in Optimization-Based Visual-Inertial Odometry
ECCV 2018
Joint Task-Recursive Learning for Semantic Segmentation and Depth Estimation
ECCV 2018
Modular Generative Adversarial Networks
ECCV 2018
Image-level to Pixel-wise Labeling: From Theory to Practice
IJCAI 2018
Temporally Grounding Natural Sentence in Video
EMNLP 2018
Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi-Supervised Semantic Segmentation
CVPR 2018
Deep Self-Taught Learning for Weakly Supervised Object Localization
CVPR 2017
Neural Person Search Machines
ICCV 2017
Predicting Scene Parsing and Motion Dynamics in the Future
NIPS 2017
FoveaNet: Perspective-Aware Urban Scene Parsing
ICCV 2017
Video Scene Parsing With Predictive Feature Learning
ICCV 2017
Tree-Structured Reinforcement Learning for Sequential Object Localization
NIPS 2016
Reversible Recursive Instance-Level Object Segmentation
CVPR 2016