Xi Yin

23 papers · 2017–2026 · 7 top CS/AI conferences

Achievements

🏃 Academic Marathon (8) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (7) 🐝 Cross-Pollinator (5)

🌈 Renaissance Researcher (6) 🗺️ Taxonomy Completionist (45) 🧬 Topic Evolution 🗃️ Keyword Collector (104) 💎 Century Club (22) 📈 Trend Setter 🔥 Unstoppable (7) ⚡ Prolific Year (5)

Conferences

CVPR (11) ECCV (4) ICCV (3) AAAI (2) EMNLP (1) ICLR (1) INTERSPEECH (1)

Top co-authors

Xiaoming Liu (7) Devi Parikh (5) Guan Pang (4) Tal Hassner (4) Thomas Hayes (4) Samaneh Azadi (3) Harry Yang (3) Lijuan Wang (3) Lei Zhang (3) Luan Tran (2)

Keywords

image captioning (3) optical character recognition (2) generative model (2) representation learning (2) text-to-image generation (2) disentangled representation (2) pose estimation (2) diffusion model (2) vision-language pretraining (2) anomaly detection (1) attention mechanism (1) visual question answering (1) machine translation (1) multilingual nlp (1) transfer learning (1) semantic segmentation (1) object detection (1) face recognition (1) face detection (1) domain generalization (1)

Papers

UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning · AAAI 2026
MotiF: Making Text Count in Image Animation with Motion Focal Loss · CVPR 2025
Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution · CVPR 2025
Generating Multi-Image Synthetic Data for Text-to-Image Customization · ICCV 2025
Factorizing Text-to-Video Generation by Explicit Image Conditioning · ECCV 2024
Make-A-Video: Text-to-Video Generation without Text-Video Data · ICLR 2023
MaLP: Manipulation Localization Using a Proactive Scheme · CVPR 2023
SpaText: Spatio-Textual Representation for Controllable Image Generation · CVPR 2023
Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems · INTERSPEECH 2023
CCEval: A Representative Evaluation Benchmark for the Chinese-centric Multilingual Machine Translation · EMNLP 2023
Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer · ECCV 2022
Proactive Image Manipulation Detection · CVPR 2022
MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration · ECCV 2022
A Multiplexed Network for End-to-End, Multilingual OCR · CVPR 2021
img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation · CVPR 2021
TAP: Text-Aware Pre-Training for Text-VQA and Text-Caption · CVPR 2021
VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning · AAAI 2021
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks · ECCV 2020
Gait Recognition via Disentangled Representation Learning · CVPR 2019
Feature Transfer Learning for Face Recognition With Under-Represented Data · CVPR 2019
Towards Large-Pose Face Frontalization in the Wild · ICCV 2017
Illuminating Pedestrians via Simultaneous Detection & Segmentation · ICCV 2017
Disentangled Representation Learning GAN for Pose-Invariant Face Recognition · CVPR 2017