Ran Xu
46 papers · 2014–2026 · 13 top CS/AI conferences
Achievements
Conference Polyglot (13)
Academic Marathon (11)
Interdisciplinary Bridge
Keyword Pioneer
Cross-Pollinator (5)
Renaissance Researcher (11)
Taxonomy Completionist (96)
Topic Evolution
Triple Crown
Mega-Team (71)
Dynamic Duo (21)
Grand Slam
Unstoppable (5)
Keyword Collector (198)
Prolific Year (9)
Century Club (44)
Conferences
CVPR (10)
ACL (8)
ECCV (5)
EMNLP (5)
ICCV (5)
AAAI (4)
NIPS (3)
COLING (1)
ICLR (1)
ICML (1)
IJCAI (1)
NAACL (1)
WACV (1)
Keywords
large language model (8)
domain adaptation (5)
vision-language model (5)
video understanding (4)
contrastive learning (4)
multimodal learning (4)
diffusion model (3)
representation learning (3)
few-shot learning (3)
point cloud (3)
preference optimization (3)
retrieval-augmented generation (3)
catastrophic forgetting (2)
action recognition (2)
human motion (2)
controllable generation (2)
question answering (2)
benchmark evaluation (2)
visual question answering (2)
weakly supervised learning (2)
Papers
OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment
ACL 2026
Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation
AAAI 2026
RoseRAG: Robust Retrieval-augmented Generation with Small-scale LLMs via Margin-aware Preference Optimization
ACL 2025
Trust but Verify: Programmatic VLM Evaluation in the Wild
ICCV 2025
Contra4: Evaluating Contrastive Cross-Modal Reasoning in Audio, Video, Image, and 3D
EMNLP 2025
SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains
NAACL 2025
Structured Policy Optimization: Enhance Large Vision-Language Model via Self-referenced Dialogue
ICCV 2025
Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting
ICCV 2025
Retrieval-augmented GUI Agents with Generative Guidelines
EMNLP 2025
Text2Data: Low-Resource Data Generation with Textual Control
AAAI 2025
Position: TrustLLM: Trustworthiness in Large Language Models
ICML 2024
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
NIPS 2024
FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability
ACL 2024
MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation
ACL 2024
RAM-EHR: Retrieval Augmentation Meets Clinical Predictions on Electronic Health Records
ACL 2024
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
ACL 2024
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding
CVPR 2024
Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation
CVPR 2024
HIVE: Harnessing Human Feedback for Instructional Visual Editing
CVPR 2024
SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant
ECCV 2024
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer
ECCV 2024
X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-modal Reasoning
ECCV 2024
BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers
EMNLP 2024
MedAdapter: Efficient Test-Time Adaptation of Large Language Models Towards Medical Reasoning
EMNLP 2024
EHRAgent: Code Empowers Large Language Models for Few-shot Complex Tabular Reasoning on Electronic Health Records
EMNLP 2024
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization
ICLR 2024
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
ICCV 2023
ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding
CVPR 2023
Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting
NIPS 2023
Neighborhood-Regularized Self-Training for Learning with Few Labels
AAAI 2023
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
NIPS 2023
Tackling Data Heterogeneity in Federated Learning with Class Prototypes
AAAI 2023
Cold-Start Data Selection for Better Few-shot Language Model Fine-tuning: A Prompt-based Uncertainty Propagation Approach
ACL 2023
Mask-Free OVIS: Open-Vocabulary Instance Segmentation Without Manual Mask Annotations
CVPR 2023
Deformer: Dynamic Fusion Transformer for Robust Hand Pose Estimation
ICCV 2023
Burn after Reading: Online Adaptation for Cross-Domain Streaming Data
ECCV 2022
Visual Emotion Representation Learning via Emotion-Aware Pre-training
IJCAI 2022
SmartAdapt: Multi-Branch Object Detection Framework for Videos on Mobiles
CVPR 2022
Use All the Labels: A Hierarchical Multi-Label Contrastive Learning Framework
CVPR 2022
DocQueryNet: Value Retrieval with Arbitrary Queries for Form-like Documents
COLING 2022
Field Extraction from Forms with Unlabeled Data
ACL 2022
Open Vocabulary Object Detection with Pseudo Bounding-Box Labels
ECCV 2022
WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos
CVPR 2021
Proposal Learning for Semi-Supervised Object Detection
WACV 2021
Human Action Segmentation With Hierarchical Supervoxel Consistency
CVPR 2015
Actionness Ranking with Lattice Conditional Ordinal Random Fields
CVPR 2014