Zhaoxin Fan

24 papers · 2022–2026 · 9 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (8) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (9) 🌈 Renaissance Researcher (7) 🗺️ Taxonomy Completionist (48) 💎 Century Club (20) 🗃️ Keyword Collector (108) ⚡ Prolific Year (11) 🔥 Unstoppable (5)

Conferences

CVPR (5) AAAI (4) ACL (4) ICCV (4) ECCV (2) IJCAI (2) COLING (1) CORL (1) ICML (1)

Top co-authors

Hongyan Liu (6) Jun He (6) Zhenbo Song (4) Jihao Zhao (3) Yongcai Wang (3) Shuo Wang (3) Bo Tang (3) Feiyu Xiong (3) Zhiyu Li (3) Ziqiao Peng (3)

Keywords

3d reconstruction (2) speech-driven animation (2) large language model (2) facial animation (2) adversarial attack (2) diffusion model (2) retrieval-augmented generation (2) vision-language navigation (1) multimodal learning (1) adversarial robustness (1) conversational ai (1) emotion recognition (1) point cloud (1) visual question answering (1) semi-supervised learning (1) machine unlearning (1) object tracking (1) pose estimation (1) deep learning (1) scene reconstruction (1)

Papers

Mem4D: Decoupling Static and Dynamic Memory for Dynamic Scene Reconstruction · AAAI 2026
Inside Out: Evolving User-Centric Core Memory Trees for Long-Term Personalized Dialogue Systems · ACL 2026
PEAP: Proactive Embodied Action Sequence Planning with Joint Understanding of Vision and Audio Perception · ACL 2026
MonoDream: Monocular Vision-Language Navigation with Panoramic Dreaming · AAAI 2026
JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems · CVPR 2025
MambaVO: Deep Visual Odometry Based on Sequential Matching Refinement and Training Smoothing · CVPR 2025
DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations · CVPR 2025
Long-VLA: Unleashing Long-Horizon Capability of Vision Language Action Model for Robot Manipulation · CORL 2025
SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model · ACL 2025
MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System · ACL 2025
Idea23D: Collaborative LMM Agents Enable 3D Model Generation from Interleaved Multimodal Inputs · COLING 2025
Moderating the Generalization of Score-based Generative Model · ICCV 2025
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction · ICCV 2025
EraseAnything: Enabling Concept Erasure in Rectified Flow Transformers · ICML 2025
GLDiTalker: Speech-Driven 3D Facial Animation with Graph Latent Diffusion Transformer · IJCAI 2025
Everything2Motion: Synchronizing Diverse Inputs via a Unified Framework for Human Motion Synthesis · AAAI 2024
SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis · CVPR 2024
MLPHand: Real Time Multi-View 3D Hand Reconstruction via MLP Modeling · ECCV 2024
D-IF: Uncertainty-aware Human Digitization via Implicit Distribution Field · ICCV 2023
Robust Single Image Reflection Removal Against Adversarial Attacks · CVPR 2023
Reconstruction-Aware Prior Distillation for Semi-supervised Point Cloud Completion · IJCAI 2023
EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation · ICCV 2023
SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition · AAAI 2022
Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation from Monocular RGB Image · ECCV 2022