Xiangyu Yue

45 papers · 2018–2026 · 11 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (14) 🏃 Academic Marathon (7) 🧭 Keyword Pioneer 🌍 Conference Polyglot (11) 🌈 Renaissance Researcher (6)

🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (76) 👥 Mega-Team (20) 🔬 Deep Specialist (11) 🏆 Grand Slam 🧬 Topic Evolution 🔥 Unstoppable (8) 🗃️ Keyword Collector (191) ⚡ Prolific Year (9) 📈 Trend Setter 💎 Century Club (42)

Conferences

ICCV (15) CVPR (10) ACL (4) ECCV (4) NIPS (4) AAAI (3) CORL (1) ICLR (1) ICML (1) IJCAI (1) WACV (1)

Top co-authors

Yiyuan Zhang (8) Kurt Keutzer (7) Kaixiong Gong (6) Wanli Ouyang (5) Alberto Sangiovanni-Vincentelli (4) Tao Chen (4) Sicheng Zhao (4) Jiaming Han (4) Yu Qiao (4) Jing Liu (3)

Research topics

Applications (1)

Keywords

multimodal learning (8) semantic segmentation (7) multimodal large language model (5) large language model (5) diffusion model (5) autonomous driving (4) domain adaptation (4) semi-supervised learning (3) representation learning (3) vision-language model (3) neural network (3) knowledge distillation (3) object detection (3) foundation model (3) multi-task learning (2) image recognition (2) video understanding (2) text-to-image generation (2) point cloud (2) image generation (2)

Papers

SpatialLogic-Bench: A Diagnostic Benchmark for Task-Oriented Spatiotemporal Reasoning AAAI 2026
Learning While Staying Curious: Entropy-Preserving Supervised Fine-Tuning via Adaptive Self-Distillation for Large Reasoning Models ACL 2026
Probing Audio-Visual Reasoning in Multimodal Language Models through the Lens of Audio ACL 2026
HypDAE: Hyperbolic Diffusion Autoencoders for Hierarchical Few-shot Image Generation ICCV 2025
Unleashing Vecset Diffusion Model for Fast Shape Generation ICCV 2025
FairGen: Enhancing Fairness in Text-to-Image Diffusion Models via Self-Discovering Latent Directions ICCV 2025
From Easy to Hard: Progressive Active Learning Framework for Infrared Small Target Detection with Single Point Supervision ICCV 2025
Chimera: Improving Generalist Model with Domain-Specific Experts ICCV 2025
CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation ICCV 2025
Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning ICML 2025
Training Matting Models Without Alpha Labels AAAI 2025
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation CORL 2025
UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines CVPR 2025
SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance CVPR 2025
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation CVPR 2025
RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models CVPR 2025
Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model ICLR 2025
Learning Beyond Still Frames: Scaling Vision-Language Models with Video ICCV 2025
Breaking the Encoder Barrier for Seamless Video-Language Understanding ICCV 2025
HiddenDetect: Detecting Jailbreak Attacks against Multimodal Large Language Models via Monitoring Hidden States ACL 2025
Scaling Omni-modal Pretraining with Multimodal Context: Advancing Universal Representation Learning Across Modalities ICCV 2025
SynFER: Towards Boosting Facial Expression Recognition with Synthetic Data ICCV 2025
Online Vectorized HD Map Construction using Geometry ECCV 2024
EMR-Merging: Tuning-Free High-Performance Model Merging NIPS 2024
Bifröst: 3D-Aware Image Compositing with Language Instructions NIPS 2024
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT NIPS 2024
Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ACL 2024
OneLLM: One Framework to Align All Modalities with Language CVPR 2024
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities CVPR 2024
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio Video Point Cloud Time-Series and Image Recognition CVPR 2024
Better Regression Makes Better Test-time Adaptive 3D Object Detection ECCV 2024
Beating Backdoor Attack at Its Own Game ICCV 2023
Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models ICCV 2023
Space Engage: Collaborative Space Supervision for Contrastive-Based Semi-Supervised Semantic Segmentation ICCV 2023
Conditional Synthetic Data Generation for Robust Machine Learning Applications with Limited Pandemic Data AAAI 2022
Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models ECCV 2022
RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation ECCV 2022
Self-Supervised Pretraining Improves Self-Supervised Pretraining WACV 2022
Unsupervised Point Cloud Pre-Training via Occlusion Completion ICCV 2021
Prototypical Cross-Domain Self-Supervised Learning for Few-Shot Unsupervised Domain Adaptation CVPR 2021
PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation CVPR 2020
Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization Without Accessing Target Domain Data ICCV 2019
Multi-source Domain Adaptation for Semantic Segmentation NIPS 2019
Counterexample-Guided Data Augmentation IJCAI 2018
Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions CVPR 2018