Rui Zhao

134 papers · 2013–2026 · 18 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (21) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🌍 Conference Polyglot (17)

🐝 Cross-Pollinator (10) 🏠 Conference Loyalist (37) 🌟 Keyword Trendsetter Combo (3) 🤝 Dynamic Duo (25) 👑 Triple Crown 🧬 Topic Evolution 🏆 Grand Slam 🔬 Deep Specialist (14) 🏆 Keyword Champion (3) 🔥 Unstoppable (13) 🚀 Conference Pioneer ⚡ Prolific Year (18) ❓ The Questioner (3) 🗃️ Keyword Collector (507) 💎 Century Club (126) 📈 Trend Setter

Conferences

CVPR (37) AAAI (16) NIPS (13) ECCV (11) ICCV (10) ACL (8) ICLR (8) INTERSPEECH (7) EMNLP (5) ICML (5) COLING (3) IJCAI (3) AISTATS (2) NAACL (2) CORL (1) IJCNLP (1) AACL (1) WACV (1)

Top co-authors

Feng Zhu (25) Wanli Ouyang (14) Ziyue Li (11) Shixiang Tang (11) Hongsheng Li (11) Tiejun Huang (10) Lei Bai (10) Ruiqin Xiong (10) Liwei Wu (9) Mike Zheng Shou (9)

Research topics

Education (1)

Keywords

person re-identification (10) large language model (9) domain adaptation (8) contrastive learning (8) representation learning (8) object detection (7) diffusion model (7) self-supervised learning (7) transfer learning (6) spike camera (6) semantic segmentation (6) neural network (5) attention mechanism (5) motion estimation (5) optical flow (5) pose estimation (5) deep learning (5) zero-shot learning (4) automatic speech recognition (4) text-to-image generation (4)

Papers

Detecting What Queries Seek: Steering LLM Safety with FFN Output Activation Monitoring ACL 2026
Adaptive-Smooth LiDAR-Camera Knowledge Distillation with Heterogeneous Fusion for Multi-View 3D Object Detection AAAI 2026
Spike Stream Memory Transfer for Dynamic Scene Reconstruction AAAI 2026
CNSL-bench: Benchmarking the Sign Language Understanding Capabilities of MLLMs on Chinese National Sign Language ACL 2026
Hyperbolic Hierarchical Alignment Reasoning Network for Text-3D Retrieval AAAI 2026
PLaST: Towards Paralinguistic-aware Speech Translation AAAI 2026
Selective Contrastive Learning For Gloss Free Sign Language Translation ACL 2026
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles CVPR 2025
Dynamic Feature Fusion for Sign Language Translation Using HyperNetworks NAACL 2025
RemDet: Rethinking Efficient Model Design for UAV Object Detection AAAI 2025
Re-Aligning Language to Visual Objects with an Agentic Workflow ICLR 2025
KITS: Inductive Spatio-Temporal Kriging with Increment Training Strategy AAAI 2025
ISP2HRNet: Learning to Reconstruct High Resolution Image from Irregularly Sampled Pixels via Hierarchical Gradient Learning ICCV 2025
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation ICCV 2025
SAMPLE: Semantic Alignment through Temporal-Adaptive Multimodal Prompt Learning for Event-Based Open-Vocabulary Action Recognition ICCV 2025
Can an Individual Manipulate the Collective Decisions of Multi-Agents? EMNLP 2025
TimeCMA: Towards LLM-Empowered Multivariate Time Series Forecasting via Cross-Modality Alignment AAAI 2025
Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges AACL 2025
Unlocking the Power of SAM 2 for Few-Shot Segmentation ICML 2025
Tree-KG: An Expandable Knowledge Graph Construction Framework for Knowledge-intensive Domains ACL 2025
Representation Purification for End-to-End Speech Translation COLING 2025
Enhancing Extractive Question Answering in Multiparty Dialogues with Logical Inference Memory Network COLING 2025
Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges IJCNLP 2025
Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era IJCAI 2025
Position: Current Model Licensing Practices are Dragging Us into a Quagmire of Legal Noncompliance ICML 2025
CLEAR: Can Language Models Really Understand Causal Graphs? EMNLP 2024
Reward Difference Optimization For Sample Reweighting In Offline RLHF EMNLP 2024
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Industry Systems EMNLP 2024
CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models EMNLP 2024
SDA: Simple Discrete Augmentation for Contrastive Sentence Representation Learning COLING 2024
Revisiting Domain-Adaptive Object Detection in Adverse Weather by the Generation and Composition of High-Quality Pseudo-Labels ECCV 2024
Drag Anything: Motion Control for Anything using Entity Representation ECCV 2024
Eliminating Feature Ambiguity for Few-Shot Segmentation ECCV 2024
Boosting Spike Camera Image Reconstruction from a Perspective of Dealing with Spike Fluctuations CVPR 2024
Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions CVPR 2024
X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model CVPR 2024
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence CVPR 2024
Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning CVPR 2024
Sparse Global Matching for Video Frame Interpolation with Large Motion CVPR 2024
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing CVPR 2024
Self-Supervised Representation Learning from Arbitrary Scenarios CVPR 2024
Optical Flow for Spike Camera with Hierarchical Spatial-Temporal Spike Fusion AAAI 2024
Focus-Then-Decide: Segmentation-Assisted Reinforcement Learning AAAI 2024
Conditional Variational Autoencoder for Sign Language Translation with Cross-Modal Alignment AAAI 2024
Hybrid Mamba for Few-Shot Segmentation NIPS 2024
Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model ACL 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models NIPS 2024
Non-Neighbors Also Matter to Kriging: A New Contrastive-Prototypical Learning AISTATS 2024
MotionDirector: Motion Customization of Text-to-Video Diffusion Models ECCV 2024
Signer Diversity-driven Data Augmentation for Signer-Independent Sign Language Translation NAACL 2024
X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer as Meta Multi-Agent Reinforcement Learner IJCAI 2024
Gradient-based Visual Explanation for Transformer-based CLIP ICML 2024
Sequential Asynchronous Action Coordination in Multi-Agent Systems: A Stackelberg Decision Transformer Approach ICML 2024
InstructDET: Diversifying Referring Object Detection with Generalized Instructions ICLR 2024
Exploring Stochastic Autoregressive Image Modeling for Visual Representation AAAI 2023
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models NIPS 2023
Unsupervised Optical Flow Estimation with Dynamic Timing Representation for Spike Camera NIPS 2023
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models NIPS 2023
MeGraph: Capturing Long-Range Interactions by Alternating Local and Hierarchical Aggregation on Multi-Scaled Graph Hierarchy NIPS 2023
Described Object Detection: Liberating Object Detection with Flexible Expressions NIPS 2023
Learning to Super-resolve Dynamic Scenes for Neuromorphic Spike Camera AAAI 2023
Maximum Entropy Population-Based Training for Zero-Shot Human-AI Coordination AAAI 2023
PUnifiedNER: A Prompting-Based Unified NER System for Diverse Datasets AAAI 2023
What Makes Pre-trained Language Models Better Zero-shot Learners? ACL 2023
CWSeg: An Efficient and General Approach to Chinese Word Segmentation ACL 2023
Deeply Coupled Cross-Modal Prompt Learning ACL 2023
Uni6Dv2: Noise Elimination for 6D Pose Estimation AISTATS 2023
CORA: Adapting CLIP for Open-Vocabulary Detection With Region Prompting and Anchor Pre-Matching CVPR 2023
Balancing Logit Variation for Long-Tailed Semantic Segmentation CVPR 2023
UniHCP: A Unified Model for Human-Centric Perceptions CVPR 2023
Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation CVPR 2023
HumanBench: Towards General Human-Centric Perception With Projector Assisted Pretraining CVPR 2023
Human Preference Score: Better Aligning Text-to-Image Models with Human Preference ICCV 2023
Advancing Referring Expression Segmentation Beyond Single Image ICCV 2023
SparseMAE: Sparse Training Meets Masked Autoencoders ICCV 2023
Cycle-consistent Masked AutoEncoder for Unsupervised Domain Generalization ICLR 2023
Contextual Image Masking Modeling via Synergized Contrasting without View Augmentation for Faster and Better Visual Pretraining ICLR 2023
Patch-Level Contrasting without Patch Correspondence for Accurate and Dense Contrastive Representation Learning ICLR 2023
Relative Contrastive Loss for Unsupervised Representation Learning ECCV 2022
Domain Invariant Masked Autoencoders for Self-Supervised Learning from Multi-Domains ECCV 2022
UniVIP: A Unified Framework for Self-Supervised Visual Pre-Training CVPR 2022
Revisiting the Transferability of Supervised Pretraining: An MLP Perspective CVPR 2022
Align Representations With Base: A New Approach to Self-Supervised Learning CVPR 2022
Feature Erasing and Diffusion Network for Occluded Person Re-Identification CVPR 2022
Optical Flow Estimation for Spiking Camera CVPR 2022
Uni6D: A Unified CNN Framework Without Projection Breakdown for 6D Pose Estimation CVPR 2022
Unsupervised Object Detection Pretraining with Joint Object Priors Generation and Detector Learning NIPS 2022
Learning Optical Flow from Continuous Spike Streams NIPS 2022
Zero-CL: Instance and Feature decorrelation for negative-free symmetric contrastive learning ICLR 2022
Learning from Future: A Novel Self-Training Framework for Semantic Segmentation NIPS 2022
Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks NIPS 2022
Spatio-Temporal Recurrent Networks for Event-Based Optical Flow Estimation AAAI 2022
Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels CVPR 2022
Scale-Aware Spatio-Temporal Relation Learning for Video Anomaly Detection ECCV 2022
Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification ECCV 2022
Unifying Visual Contrastive Learning for Object Recognition from a Graph Perspective ECCV 2022
An Automated Framework for Supporting Data-Governance Rule Compliance in Decentralized MIMO Contexts IJCAI 2021
Progressive Correspondence Pruning by Consensus Learning ICCV 2021
MST: Masked Self-Supervised Transformer for Visual Representation NIPS 2021
Improving RNN-T for Domain Scaling Using Semi-Supervised Training with Neural TTS INTERSPEECH 2021
Mutual Information State Intrinsic Control ICLR 2021
Continual Representation Learning for Biometric Identification WACV 2021
COCAS: A Large-Scale Clothes Changing Person Dataset for Re-Identification CVPR 2020
Self-supervising Fine-grained Region Similarities for Large-scale Image Localization ECCV 2020
RBF-Softmax: Learning Deep Representative Prototypes with Radial Basis Function Softmax ECCV 2020
On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition INTERSPEECH 2020
Combination of End-to-End and Hybrid Models for Speech Recognition INTERSPEECH 2020
Transfer Learning Approaches for Streaming End-to-End Speech Recognition System INTERSPEECH 2020
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability INTERSPEECH 2020
Learning to Cluster Faces via Confidence and Connectivity Estimation CVPR 2020
Bayesian Adversarial Human Motion Synthesis CVPR 2020
Density-Aware Feature Embedding for Face Clustering CVPR 2020
Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID NIPS 2020
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning ICML 2019
The Rensselaer Mandarin Project — A Cognitive and Immersive Language Learning Environment AAAI 2019
Generalizing Eye Tracking With Bayesian Adversarial Learning CVPR 2019
Memory-Based Neighbourhood Embedding for Visual Recognition ICCV 2019
Bayesian Graph Convolution LSTM for Skeleton Based Action Recognition ICCV 2019
AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations CVPR 2019
P2SGrad: Refined Gradients for Optimizing Deep Face Models CVPR 2019
Bayesian Hierarchical Dynamic Model for Human Action Recognition CVPR 2019
Bilateral Ordinal Relevance Multi-Instance Regression for Facial Action Unit Intensity Estimation CVPR 2018
QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension ICLR 2018
A Hierarchical Generative Model for Eye Image Synthesis and Eye Gaze Estimation CVPR 2018
Energy-Based Hindsight Experience Prioritization CORL 2018
Improved Training for Online End-to-end Speech Recognition Systems INTERSPEECH 2018
Attention-Aware Compositional Network for Person Re-Identification CVPR 2018
Large-Scale Domain Adaptation via Teacher-Student Learning INTERSPEECH 2017
Facial Expression Intensity Estimation Using Ordinal Information CVPR 2016
Saliency Detection by Multi-Context Deep Learning CVPR 2015
Learning Mid-level Filters for Person Re-identification CVPR 2014
DeepReID: Deep Filter Pairing Neural Network for Person Re-Identification CVPR 2014
Unsupervised Salience Learning for Person Re-identification CVPR 2013
Person Re-identification by Salience Matching ICCV 2013