Dan Guo
38 papers · 2019–2026 · 7 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge
Keyword Pioneer
Hot Topic Early Bird
Cross-Pollinator (14)
Conference Polyglot (6)
Academic Marathon (6)
Conference Loyalist (20)
Dynamic Duo (21)
Deep Specialist (11)
Keyword Champion (3)
Keyword Collector (201)
Prolific Year (9)
Century Club (31)
Trend Setter
Conferences
AAAI (20)
CVPR (7)
IJCAI (4)
ECCV (3)
ICCV (2)
ACL (1)
NAACL (1)
Keywords
video understanding (6)
multimodal learning (4)
contrastive learning (4)
diffusion model (4)
audio-visual event localization (3)
temporal localization (3)
event localization (2)
temporal convolution (2)
visual dialog (2)
video processing (2)
cross-modal learning (2)
knowledge distillation (2)
pose estimation (2)
graph neural network (2)
attention mechanism (2)
object tracking (2)
articulated object (2)
multi-modal learning (2)
visual question answering (2)
audio-visual question answering (2)
Papers
Bidirectional Counterfactual Distillation for Review-Based Recommendation
AAAI 2026
CLASP: Cross-modal Salient Anchor-based Semantic Propagation for Weakly-supervised Dense Audio-Visual Event Localization
AAAI 2026
Psyche-R1: Towards Reliable Psychological LLMs through Unified Empathy, Expertise, and Reasoning
ACL 2026
AgentMental: An Interactive Multi-Agent Framework for Explainable and Adaptive Mental Health Assessment
AAAI 2026
SIAM: Towards Generalizable Articulated Object Modeling via Single Robot-Object Interaction
AAAI 2026
LinProVSR: Linguistics-Knowledge Guided Progressive Disambiguation Network for Visual Speech Recognition
AAAI 2026
A Closer Look at Knowledge Distillation in Spiking Neural Network Training
AAAI 2026
Multimodal Class-aware Semantic Enhancement Network for Audio-Visual Video Parsing
AAAI 2025
Dense Audio-Visual Event Localization Under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration
AAAI 2025
Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observations
CVPR 2025
MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights
AAAI 2025
Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition
AAAI 2025
Patch-level Sounding Object Tracking for Audio-Visual Question Answering
AAAI 2025
PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement
AAAI 2025
Sign-IDD: Iconicity Disentangled Diffusion for Sign Language Production
AAAI 2025
AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring
AAAI 2025
ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding
CVPR 2025
EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering
CVPR 2025
Towards Open-Vocabulary Audio-Visual Event Localization
CVPR 2025
MMAD: Multi-label Micro-Action Detection in Videos
ICCV 2025
Moderating the Generalization of Score-based Generative Model
ICCV 2025
Label-anticipated Event Disentanglement for Audio-Visual Video Parsing
ECCV 2024
Data-Free Quantization via Pseudo-label Filtering
CVPR 2024
Training A Small Emotional Vision Language Model for Visual Art Comprehension
ECCV 2024
KPA-Tracker: Towards Robust and Real-Time Category-Level Articulated Object 6D Pose Tracking
AAAI 2024
Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering
AAAI 2024
EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer
AAAI 2024
Text-Based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning
AAAI 2024
Towards Understanding Future: Consistency Guided Probabilistic Modeling for Action Anticipation
AAAI 2024
Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture
CVPR 2024
Audio-Visual Segmentation
ECCV 2022
A Label-Aware Autoregressive Framework for Cross-Domain NER
NAACL 2022
Proposal-Free Video Grounding with Contextual Pyramid Network
AAAI 2021
Iterative Context-Aware Graph Inference for Visual Dialog
CVPR 2020
Recurrent Relational Memory Network for Unsupervised Image Captioning
IJCAI 2020
Dual Visual Attention Network for Visual Dialog
IJCAI 2019
Connectionist Temporal Modeling of Video and Language: a Joint Model for Translation and Sign Labeling
IJCAI 2019
Dense Temporal Convolution Network for Sign Language Translation
IJCAI 2019