Rui Qian
33 papers · 2018–2026 · 8 conferences · across top CS/AI conferences
Achievements
Academic Marathon (7)
Conference Polyglot (8)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (5)
Renaissance Researcher (5)
Taxonomy Completionist (63)
Dynamic Duo (12)
Keyword Champion (4)
Topic Evolution
Prolific Year (7)
Conference Pioneer
Century Club (30)
Keyword Collector (151)
Unstoppable (8)
The Questioner
Conferences
CVPR (11)
AAAI (6)
ECCV (5)
ICCV (4)
NIPS (3)
ACL (2)
EMNLP (1)
ICML (1)
Keywords
video understanding (7)
self-supervised learning (7)
contrastive learning (6)
video representation learning (4)
action recognition (3)
semantic segmentation (3)
large language model (3)
video large language model (2)
audio representation (2)
cross-modal learning (2)
benchmark evaluation (2)
depth estimation (2)
attention mechanism (2)
vision transformer (2)
data augmentation (2)
multimodal learning (2)
video language model (2)
sound source localization (1)
source separation (1)
semi-supervised learning (1)
Papers
SplatSSC: Decoupled Depth-Guided Gaussian Splatting for Semantic Scene Completion
AAAI 2026
AnchorSeg: Language Grounded Query Banks for Reasoning Segmentation
ACL 2026
CogStream: Context-guided Streaming Video Question Answering
AAAI 2026
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
CVPR 2025
SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition
ACL 2025
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
CVPR 2025
Reasoning to Attend: Try to Understand How <SEG> Token Works
CVPR 2025
SolEval: Benchmarking Large Language Models for Repository-level Solidity Smart Contract Generation
EMNLP 2025
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree
ICCV 2025
VideoPrism: A Foundational Visual Encoder for Video Understanding
ICML 2024
Streaming Long Video Understanding with Large Language Models
NIPS 2024
Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation
ECCV 2024
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
ECCV 2024
Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos
ICCV 2023
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
CVPR 2023
Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation
ICCV 2023
TA2N: Two-Stage Action Alignment Network for Few-Shot Action Recognition
AAAI 2022
Motion-Aware Contrastive Video Representation Learning via Foreground-Background Merging
CVPR 2022
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
CVPR 2022
Contextualized Spatio-Temporal Contrastive Learning With Self-Supervision
CVPR 2022
Exploring Fine-Grained Audiovisual Categorization with the SSW60 Dataset
ECCV 2022
Static and Dynamic Concepts for Self-Supervised Video Representation Learning
ECCV 2022
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
AAAI 2022
Enhancing Self-Supervised Video Representation Learning via Multi-Level Feature Optimization
ICCV 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
NIPS 2021
Spatiotemporal Contrastive Video Representation Learning
CVPR 2021
Simple Copy-Paste Is a Strong Data Augmentation Method for Instance Segmentation
CVPR 2021
End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection
CVPR 2020
Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching
NIPS 2020
Multiple Sound Sources Localization from Coarse to Fine
ECCV 2020
Finding Action Tubes with a Sparse-to-Dense Framework
AAAI 2020
Weakly Supervised Scene Parsing with Point-Based Distance Metric Learning
AAAI 2019
Attentive Generative Adversarial Network for Raindrop Removal From a Single Image
CVPR 2018