Shentong Mo
25 papers · 2022–2025 · 8 top CS/AI conferences
Achievements
Cross-Pollinator (13)
Keyword Pioneer
Interdisciplinary Bridge
Conference Polyglot (8)
Renaissance Researcher (6)
Taxonomy Completionist (35)
Grand Slam
Century Club (25)
Keyword Collector (94)
Prolific Year (9)
Conferences
NIPS (8)
ECCV (5)
CVPR (3)
AAAI (2)
ICCV (2)
ICML (2)
WACV (2)
ICLR (1)
Research topics
Keywords
audio-visual learning (5)
contrastive learning (5)
multimodal learning (3)
self-supervised learning (3)
representation learning (2)
vision transformer (2)
point cloud (2)
generative model (2)
masked modeling (2)
diffusion model (2)
image generation (2)
3d shape generation (2)
multi-modal learning (2)
weakly supervised learning (2)
class-incremental learning (2)
continual learning (2)
point cloud generation (1)
catastrophic forgetting (1)
source separation (1)
cross-modal learning (1)
Papers
pMoE: Prompting Diverse Experts Together Wins More in Visual Adaptation
ICLR 2025
Scaling Diffusion Mamba with Bidirectional SSMs for Efficient 3D Shape Generation
AAAI 2025
The Dynamic Duo of Collaborative Masking and Target for Advanced Masked Autoencoder Learning
AAAI 2025
GMAIL: Generative Modality Alignment for generated Image Learning
ICML 2025
Foley-Flow: Coordinated Video-to-Audio Generation with Masked Audio-Visual Alignment and Dynamic Conditional Flows
CVPR 2025
Aligning Audio-Visual Joint Representations with an Agentic Workflow
NIPS 2024
Continual Audio-Visual Sound Separation
NIPS 2024
Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling
CVPR 2024
Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation
ECCV 2024
Audio-Synchronized Visual Animation
ECCV 2024
Audio-visual Generalized Zero-shot Learning the Easy Way
ECCV 2024
Connecting Joint-Embedding Predictive Architecture with Contrastive Self-supervised Learning
NIPS 2024
Representation Disentanglement in Generative Models With Contrastive Learning
WACV 2023
Weakly-Supervised Audio-Visual Segmentation
NIPS 2023
DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation
NIPS 2023
DiffComplete: Diffusion-based Generative 3D Shape Completion
NIPS 2023
Audio-Visual Grouping Network for Sound Localization From Mixtures
CVPR 2023
Class-Incremental Grouping Network for Continual Audio-Visual Learning
ICCV 2023
Audio-Visual Class-Incremental Learning
ICCV 2023
A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition
ICML 2023
Multi-Level Contrastive Learning for Self-Supervised Vision Transformers
WACV 2023
Localizing Visual Sounds the Easy Way
ECCV 2022
Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing
NIPS 2022
A Closer Look at Weakly-Supervised Audio-Visual Source Localization
NIPS 2022
Unitail: Detecting, Reading, and Matching in Retail Scene
ECCV 2022