Si Liu
93 papers · 2013–2026 · 14 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (13)
Academic Marathon (12)
Interdisciplinary Bridge
Keyword Pioneer
Hot Topic Early Bird
Renaissance Researcher (10)
Taxonomy Completionist (115)
Conference Loyalist (40)
Topic Pioneer
Deep Specialist (17)
Topic Evolution
Dynamic Duo (15)
Grand Slam
Mega-Team (25)
Keyword Collector (394)
Prolific Year (17)
Conference Pioneer
Century Club (90)
Unstoppable (11)
Trend Setter
Conferences
CVPR (40)
ICCV (15)
ECCV (8)
AAAI (7)
ICLR (6)
NIPS (6)
IJCAI (4)
ACL (1)
EMNLP (1)
ICML (1)
JMLR (1)
MICCAI (1)
NSDI (1)
OSDI (1)
Keywords
semantic segmentation (9)
object detection (8)
convolutional neural network (7)
multimodal learning (6)
video understanding (5)
knowledge distillation (5)
attention mechanism (4)
weakly supervised learning (4)
feature extraction (4)
autonomous driving (4)
image generation (4)
generative adversarial network (3)
reinforcement learning (3)
multimodal large language model (3)
representation learning (3)
object tracking (3)
human-object interaction (3)
pose estimation (3)
object localization (3)
human-object interaction detection (3)
Papers
AerialVLA: A Vision-Language-Action Model for Aerial Navigation with Online Dialogue
AAAI 2026
VaccineRAG: Boosting Multimodal Large Language Models' Immunity to Harmful RAG Samples
AAAI 2026
MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning
ACL 2026
ViPE: Visual Perception in Parameter Space for Efficient Video-Language Understanding
EMNLP 2025
Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation
AAAI 2025
GaussianPainter: Painting Point Cloud into 3D Gaussians with Normal Guidance
AAAI 2025
Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology
ICLR 2025
Mixture Compressor for Mixture-of-Experts LLMs Gains More
ICLR 2025
LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
CVPR 2025
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
CVPR 2025
Revisiting Audio-Visual Segmentation with Vision-Centric Transformer
CVPR 2025
Generative Map Priors for Collaborative BEV Semantic Segmentation
CVPR 2025
FlexDrive: Toward Trajectory Flexibility in Driving Scene Gaussian Splatting Reconstruction and Rendering
CVPR 2025
LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
ICLR 2025
Point Cluster: A Compact Message Unit for Communication-Efficient Collaborative Perception
ICLR 2025
Video2BEV: Transforming Drone Videos to BEVs for Video-based Geo-localization
ICCV 2025
CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective
ICCV 2025
CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation
ICCV 2025
Instruction-Oriented Preference Alignment for Enhancing Multi-Modal Comprehension Capability of MLLMs
ICCV 2025
Image Understanding Makes for A Good Tokenizer for Image Generation
NIPS 2024
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
NIPS 2024
Communication-Efficient Collaborative Perception via Information Filling with Codebook
CVPR 2024
SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection
CVPR 2024
EASE-DETR: Easing the Competition among Object Queries
CVPR 2024
Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training
CVPR 2024
Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation
MICCAI 2024
Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE
ICLR 2024
ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation
ICLR 2024
Asynchronous Large Language Model Enhanced Planner for Autonomous Driving
ECCV 2024
Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection
ECCV 2024
Controllable Navigation Instruction Generation with Chain of Thought Prompting
ECCV 2024
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction
ECCV 2024
FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis
ECCV 2024
Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection
CVPR 2024
CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics
NIPS 2024
Optimizing the Placement of Roadside LiDARs for Autonomous Driving
ICCV 2023
Boosting Verification of Deep Reinforcement Learning via Piece-Wise Linear Decision Neural Networks
NIPS 2023
MARBLE: Music Audio Representation Benchmark for Universal Evaluation
NIPS 2023
Boosting Verified Training for Robust Image Classifications via Abstraction
CVPR 2023
Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection
CVPR 2023
Bridging Search Region Interaction With Template for RGB-T Tracking
CVPR 2023
Adaptive Zone-Aware Hierarchical Planner for Vision-Language Navigation
CVPR 2023
Improving Weakly Supervised Temporal Action Localization by Bridging Train-Test Gap in Pseudo Labels
CVPR 2023
DETR With Additional Global Aggregation for Cross-Domain Weakly Supervised Object Detection
CVPR 2023
Anchor3DLane: Learning To Regress 3D Anchors for Monocular 3D Lane Detection
CVPR 2023
Omnidirectional Information Gathering for Knowledge Transfer-Based Audio-Visual Navigation
ICCV 2023
Video Background Music Generation: Dataset, Method and Evaluation
ICCV 2023
Object as Query: Lifting Any 2D Object Detector to 3D Detection
ICCV 2023
Discovering Sounding Objects by Audio Queries for Audio Visual Segmentation
IJCAI 2023
Enriching Phrases with Coupled Pixel and Object Contexts for Panoptic Narrative Grounding
IJCAI 2023
RHINE: Robust and High-performance Internet Naming with E2E Authenticity
NSDI 2023
Detecting Transactional Bugs in Database Engines via Graph-Based Oracle Construction
OSDI 2023
PAC Guarantees and Effective Algorithms for Detecting Novel Categories
JMLR 2022
HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors
ECCV 2022
PoseTrans: A Simple yet Effective Pose Transformation Augmentation for Human Pose Estimation
ECCV 2022
Distribution-Aware Single-Stage Models for Multi-Person 3D Pose Estimation
CVPR 2022
Reinforced Structured State-Evolution for Vision-Language Navigation
CVPR 2022
GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection
CVPR 2022
Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation
CVPR 2022
3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection
CVPR 2022
Language-Guided Global Image Editing via Cross-Modal Cyclic Mechanism
ICCV 2021
General Instance Distillation for Object Detection
CVPR 2021
Mining the Benefits of Two-stage and One-stage HOI Detection
NIPS 2021
Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression
CVPR 2021
Reformulating HOI Detection As Adaptive Set Prediction
CVPR 2021
Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
CVPR 2021
Confidence-aware Non-repetitive Multimodal Transformers for TextCaps
AAAI 2021
Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation
CVPR 2021
Linguistic Structure Guided Context Modeling for Referring Image Segmentation
ECCV 2020
A Real-Time Cross-Modality Correlation Filtering Method for Referring Expression Comprehension
CVPR 2020
Referring Image Segmentation via Cross-Modal Progressive Comprehension
CVPR 2020
Tree-Structured Policy Based Progressive Reinforcement Learning for Temporally Language Grounding in Video
AAAI 2020
PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer
CVPR 2020
AdversarialNAS: Adversarial Neural Architecture Search for GANs
CVPR 2020
PPDM: Parallel Point Detection and Matching for Real-Time Human-Object Interaction Detection
CVPR 2020
Rule-Guided Compositional Representation Learning on Knowledge Graphs
AAAI 2020
RGB-Infrared Cross-Modality Person Re-Identification via Joint Pixel and Feature Alignment
ICCV 2019
Building Detail-Sensitive Semantic Segmentation Networks With Polynomial Pooling
CVPR 2019
Open Category Detection with PAC Guarantees
ICML 2018
Ensemble Soft-Margin Softmax Loss for Image Classification
IJCAI 2018
Surveillance Video Parsing With Single Frame Supervision
CVPR 2017
Learning Adaptive Receptive Fields for Deep Image Parsing Network
CVPR 2017
Makeup Like a Superstar: Deep Localized Makeup Transfer Network
IJCAI 2016
SketchNet: Sketch Classification With Web Images
CVPR 2016
Structural Correlation Filter for Robust Visual Tracking
CVPR 2016
Matching-CNN Meets KNN: Quasi-Parametric Human Parsing
CVPR 2015
Structural Sparse Tracking
CVPR 2015
Diversity-Induced Multi-View Subspace Clustering
CVPR 2015
Low-Rank Tensor Constrained Multiview Subspace Clustering
ICCV 2015
Human Parsing With Contextualized Convolutional Neural Network
ICCV 2015
Towards Computational Baby Learning: A Weakly-Supervised Approach for Object Detection
ICCV 2015
Low-Rank Sparse Coding for Image Classification
ICCV 2013
SYM-FISH: A Symmetry-Aware Flip Invariant Sketch Histogram Shape Descriptor
ICCV 2013