Zitong Yu

42 papers · 2019–2026 · 9 top CS/AI conferences

Achievements

🏃 Academic Marathon (6) · 🧭 Keyword Pioneer · 🌉 Interdisciplinary Bridge · 🌍 Conference Polyglot (8) · 🐝 Cross-Pollinator (12) · 🧬 Topic Evolution · 🏆 Keyword Champion (5) · 🗃️ Keyword Collector (155) · 💎 Century Club (36) · 🔥 Unstoppable (7) · ❓ The Questioner · ⚡ Prolific Year (5)

Conferences

AAAI (11) · CVPR (9) · ECCV (6) · ICCV (6) · IJCAI (4) · MICCAI (3) · ACL (1) · EMNLP (1) · ICLR (1)

Top co-authors

Guoying Zhao (9) · Weicheng Xie (7) · Xiaobai Li (6) · Xin Liu (6) · Jingang Shi (5) · Zezheng Wang (5) · Yue Sun (5) · Tao Tan (5) · Yunxiao Qin (4) · Siyang Song (4)

Keywords

face anti-spoofing (9) · remote photoplethysmography (5) · domain generalization (5) · multimodal learning (4) · facial video (3) · attention mechanism (3) · video understanding (3) · zero-shot learning (2) · facial video analysis (2) · presentation attack detection (2) · contrastive learning (2) · physiological measurement (2) · action recognition (2) · representation learning (2) · multimodal large language model (2) · audio-visual learning (2) · reinforcement learning (2) · vision transformer (2) · few-shot learning (2) · heart rate estimation (2)

Papers

SUGAR: Learning Skeleton Representation with Visual-Motion Knowledge for Action Recognition (AAAI 2026)
When Eyes and Ears Disagree: Can MLLMs Discern Audio-Visual Confusion? (AAAI 2026)
FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models (AAAI 2026)
PA-FAS: Towards Interpretable and Generalizable Multimodal Face Anti-Spoofing via Path-Augmented Reinforcement Learning (AAAI 2026)
H-GAR: A Hierarchical Interaction Framework via Goal-Driven Observation-Action Refinement for Robotic Manipulation (AAAI 2026)
Retrieving to Recover: Towards Incomplete Audio-Visual Question Answering via Semantic-consistent Purification (ACL 2026)
DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing (ICCV 2025)
MSAmba: Exploring Multimodal Sentiment Analysis with State Space Models (AAAI 2025)
CA-Edit: Causality-Aware Condition Adapter for High-Fidelity Local Facial Attribute Editing (AAAI 2025)
Few-Shot Audio-Visual Class-Incremental Learning with Temporal Prompting and Regularization (AAAI 2025)
FSBench: A Figure Skating Benchmark for Advancing Artistic Sports Understanding (CVPR 2025)
MoEdit: On Learning Quantity Perception for Multi-object Image Editing (CVPR 2025)
Dynamic Collaboration of Multi-Language Models based on Minimal Complete Semantic Units (EMNLP 2025)
Enhancing Adversarial Transferability by Balancing Exploration and Exploitation with Gradient-Guided Sampling (ICCV 2025)
Kronecker Mask and Interpretive Prompts are Language-Action Video Learners (ICLR 2025)
Multimodal Fake News Detection: MFND Dataset and Shallow-Deep Multitask Learning (IJCAI 2025)
BiMSRec: A Progressive Image Reconstruction Framework for Medical Image Fusion Guided by Multi-Scale Deformation Fields (MICCAI 2025)
MedIQA: A Scalable Foundation Model for Prompt-Driven Medical Image Quality Assessment (MICCAI 2025)
TRRG: Towards Truthful Radiology Report Generation With Cross-modal Disease Clue Enhanced Large Language Models (MICCAI 2025)
DiffFAS: Face Anti-Spoofing via Generative Diffusion Models (ECCV 2024)
MTaDCS: Moving Trace and Feature Density-based Confidence Sample Selection under Label Noise (ECCV 2024)
Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing (CVPR 2024)
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios (ECCV 2024)
AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors (ECCV 2024)
Neuron Structure Modeling for Generalizable Remote Physiological Measurement (CVPR 2023)
Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning (ICCV 2023)
Learning Motion-Robust Remote Photoplethysmography through Arbitrary Resolution Videos (AAAI 2023)
Rehearsal-Free Domain Continual Face Anti-Spoofing: Generalize More and Forget Less (ICCV 2023)
Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing (CVPR 2022)
PhysFormer: Facial Video-Based Physiological Measurement With Temporal Difference Transformer (CVPR 2022)
IDPT: Interconnected Dual Pyramid Transformer for Face Super-Resolution (IJCAI 2022)
Geometry-Contrastive Transformer for Generalized 3D Pose Transfer (AAAI 2022)
Non-contact Pain Recognition from Video Sequences with Remote Physiological Measurements Prediction (IJCAI 2021)
iMiGUE: An Identity-Free Video Dataset for Micro-Gesture Understanding and Emotion Analysis (CVPR 2021)
Pixel Difference Networks for Efficient Edge Detection (ICCV 2021)
Dual-Cross Central Difference Network for Face Anti-Spoofing (IJCAI 2021)
Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling (ECCV 2020)
Searching Central Difference Convolutional Networks for Face Anti-Spoofing (CVPR 2020)
Face Anti-Spoofing with Human Material Perception (ECCV 2020)
Deep Spatial Gradient and Temporal Depth Learning for Face Anti-Spoofing (CVPR 2020)
Learning Meta Model for Zero- and Few-Shot Face Anti-Spoofing (AAAI 2020)
Remote Heart Rate Measurement From Highly Compressed Facial Videos: An End-to-End Deep Learning Solution With Video Enhancement (ICCV 2019)