Xiangyu Yue
45 papers · 2018–2026 · 11 top CS/AI conferences
Achievements
Cross-Pollinator (14)
Academic Marathon (7)
Keyword Pioneer
Conference Polyglot (11)
Renaissance Researcher (6)
Interdisciplinary Bridge
Taxonomy Completionist (76)
Mega-Team (20)
Deep Specialist (11)
Grand Slam
Topic Evolution
Unstoppable (8)
Keyword Collector (191)
Prolific Year (9)
Trend Setter
Century Club (42)
Conferences
ICCV (15)
CVPR (10)
ACL (4)
ECCV (4)
NIPS (4)
AAAI (3)
CORL (1)
ICLR (1)
ICML (1)
IJCAI (1)
WACV (1)
Research topics
Keywords
multimodal learning (8)
semantic segmentation (7)
multimodal large language model (5)
large language model (5)
diffusion model (5)
autonomous driving (4)
domain adaptation (4)
semi-supervised learning (3)
representation learning (3)
vision-language model (3)
neural network (3)
knowledge distillation (3)
object detection (3)
foundation model (3)
multi-task learning (2)
image recognition (2)
video understanding (2)
text-to-image generation (2)
point cloud (2)
image generation (2)
Papers
SpatialLogic-Bench: A Diagnostic Benchmark for Task-Oriented Spatiotemporal Reasoning
AAAI 2026
Learning While Staying Curious: Entropy-Preserving Supervised Fine-Tuning via Adaptive Self-Distillation for Large Reasoning Models
ACL 2026
Probing Audio-Visual Reasoning in Multimodal Language Models through the Lens of Audio
ACL 2026
HypDAE: Hyperbolic Diffusion Autoencoders for Hierarchical Few-shot Image Generation
ICCV 2025
Unleashing Vecset Diffusion Model for Fast Shape Generation
ICCV 2025
FairGen: Enhancing Fairness in Text-to-Image Diffusion Models via Self-Discovering Latent Directions
ICCV 2025
From Easy to Hard: Progressive Active Learning Framework for Infrared Small Target Detection with Single Point Supervision
ICCV 2025
Chimera: Improving Generalist Model with Domain-Specific Experts
ICCV 2025
CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation
ICCV 2025
Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning
ICML 2025
Training Matting Models Without Alpha Labels
AAAI 2025
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
CORL 2025
UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines
CVPR 2025
SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance
CVPR 2025
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation
CVPR 2025
RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models
CVPR 2025
Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model
ICLR 2025
Learning Beyond Still Frames: Scaling Vision-Language Models with Video
ICCV 2025
Breaking the Encoder Barrier for Seamless Video-Language Understanding
ICCV 2025
HiddenDetect: Detecting Jailbreak Attacks against Multimodal Large Language Models via Monitoring Hidden States
ACL 2025
Scaling Omni-modal Pretraining with Multimodal Context: Advancing Universal Representation Learning Across Modalities
ICCV 2025
SynFER: Towards Boosting Facial Expression Recognition with Synthetic Data
ICCV 2025
Online Vectorized HD Map Construction using Geometry
ECCV 2024
EMR-Merging: Tuning-Free High-Performance Model Merging
NIPS 2024
Bifröst: 3D-Aware Image Compositing with Language Instructions
NIPS 2024
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
NIPS 2024
Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
ACL 2024
OneLLM: One Framework to Align All Modalities with Language
CVPR 2024
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
CVPR 2024
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
CVPR 2024
Better Regression Makes Better Test-time Adaptive 3D Object Detection
ECCV 2024
Beating Backdoor Attack at Its Own Game
ICCV 2023
Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models
ICCV 2023
Space Engage: Collaborative Space Supervision for Contrastive-Based Semi-Supervised Semantic Segmentation
ICCV 2023
Conditional Synthetic Data Generation for Robust Machine Learning Applications with Limited Pandemic Data
AAAI 2022
Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models
ECCV 2022
RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation
ECCV 2022
Self-Supervised Pretraining Improves Self-Supervised Pretraining
WACV 2022
Unsupervised Point Cloud Pre-Training via Occlusion Completion
ICCV 2021
Prototypical Cross-Domain Self-Supervised Learning for Few-Shot Unsupervised Domain Adaptation
CVPR 2021
PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation
CVPR 2020
Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization Without Accessing Target Domain Data
ICCV 2019
Multi-source Domain Adaptation for Semantic Segmentation
NIPS 2019
Counterexample-Guided Data Augmentation
IJCAI 2018
Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions
CVPR 2018