Zitong Yu
42 papers · 2019–2026 · 9 top CS/AI conferences
Achievements
🏃 Academic Marathon (6)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🌍 Conference Polyglot (8)
🐝 Cross-Pollinator (12)
🧬 Topic Evolution
🏆 Keyword Champion (5)
🗃️ Keyword Collector (155)
💎 Century Club (36)
🔥 Unstoppable (7)
❓ The Questioner
⚡ Prolific Year (5)
Conferences
AAAI (11)
CVPR (9)
ECCV (6)
ICCV (6)
IJCAI (4)
MICCAI (3)
ACL (1)
EMNLP (1)
ICLR (1)
Top co-authors
Keywords
face anti-spoofing (9)
remote photoplethysmography (5)
domain generalization (5)
multimodal learning (4)
facial video (3)
attention mechanism (3)
video understanding (3)
zero-shot learning (2)
facial video analysis (2)
presentation attack detection (2)
contrastive learning (2)
physiological measurement (2)
action recognition (2)
representation learning (2)
multimodal large language model (2)
audio-visual learning (2)
reinforcement learning (2)
vision transformer (2)
few-shot learning (2)
heart rate estimation (2)
Papers
SUGAR: Learning Skeleton Representation with Visual-Motion Knowledge for Action Recognition
AAAI 2026
When Eyes and Ears Disagree: Can MLLMs Discern Audio-Visual Confusion?
AAAI 2026
FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models
AAAI 2026
PA-FAS: Towards Interpretable and Generalizable Multimodal Face Anti-Spoofing via Path-Augmented Reinforcement Learning
AAAI 2026
H-GAR: A Hierarchical Interaction Framework via Goal-Driven Observation-Action Refinement for Robotic Manipulation
AAAI 2026
Retrieving to Recover: Towards Incomplete Audio-Visual Question Answering via Semantic-consistent Purification
ACL 2026
DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing
ICCV 2025
MSAmba: Exploring Multimodal Sentiment Analysis with State Space Models
AAAI 2025
CA-Edit: Causality-Aware Condition Adapter for High-Fidelity Local Facial Attribute Editing
AAAI 2025
Few-Shot Audio-Visual Class-Incremental Learning with Temporal Prompting and Regularization
AAAI 2025
FSBench: A Figure Skating Benchmark for Advancing Artistic Sports Understanding
CVPR 2025
MoEdit: On Learning Quantity Perception for Multi-object Image Editing
CVPR 2025
Dynamic Collaboration of Multi-Language Models based on Minimal Complete Semantic Units
EMNLP 2025
Enhancing Adversarial Transferability by Balancing Exploration and Exploitation with Gradient-Guided Sampling
ICCV 2025
Kronecker Mask and Interpretive Prompts are Language-Action Video Learners
ICLR 2025
Multimodal Fake News Detection: MFND Dataset and Shallow-Deep Multitask Learning
IJCAI 2025
BiMSRec: A Progressive Image Reconstruction Framework for Medical Image Fusion Guided by Multi-Scale Deformation Fields
MICCAI 2025
MedIQA: A Scalable Foundation Model for Prompt-Driven Medical Image Quality Assessment
MICCAI 2025
TRRG: Towards Truthful Radiology Report Generation With Cross-modal Disease Clue Enhanced Large Language Models
MICCAI 2025
DiffFAS: Face Anti-Spoofing via Generative Diffusion Models
ECCV 2024
MTaDCS: Moving Trace and Feature Density-based Confidence Sample Selection under Label Noise
ECCV 2024
Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing
CVPR 2024
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios
ECCV 2024
AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors
ECCV 2024
Neuron Structure Modeling for Generalizable Remote Physiological Measurement
CVPR 2023
Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning
ICCV 2023
Learning Motion-Robust Remote Photoplethysmography through Arbitrary Resolution Videos
AAAI 2023
Rehearsal-Free Domain Continual Face Anti-Spoofing: Generalize More and Forget Less
ICCV 2023
Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing
CVPR 2022
PhysFormer: Facial Video-Based Physiological Measurement With Temporal Difference Transformer
CVPR 2022
IDPT: Interconnected Dual Pyramid Transformer for Face Super-Resolution
IJCAI 2022
Geometry-Contrastive Transformer for Generalized 3D Pose Transfer
AAAI 2022
Non-contact Pain Recognition from Video Sequences with Remote Physiological Measurements Prediction
IJCAI 2021
iMiGUE: An Identity-Free Video Dataset for Micro-Gesture Understanding and Emotion Analysis
CVPR 2021
Pixel Difference Networks for Efficient Edge Detection
ICCV 2021
Dual-Cross Central Difference Network for Face Anti-Spoofing
IJCAI 2021
Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling
ECCV 2020
Searching Central Difference Convolutional Networks for Face Anti-Spoofing
CVPR 2020
Face Anti-Spoofing with Human Material Perception
ECCV 2020
Deep Spatial Gradient and Temporal Depth Learning for Face Anti-Spoofing
CVPR 2020
Learning Meta Model for Zero- and Few-Shot Face Anti-Spoofing
AAAI 2020
Remote Heart Rate Measurement From Highly Compressed Facial Videos: An End-to-End Deep Learning Solution With Video Enhancement
ICCV 2019