Research Explorer
Papers
Trends
Conferences
Explore
Authors
Topics
Keywords
Achievements
About
Methodology
cross-modal learning
521 papers
Also known as
CMP
C3HOST
Co-occurring keywords
multimodal learning (4622)
contrastive learning (3979)
knowledge distillation (3680)
representation learning (6174)
multi-modal learning (1276)
vision-language model (2235)
self-supervised learning (3751)
domain adaptation (4578)
video understanding (1647)
zero-shot learning (3637)
Papers
iMoT: Inertial Motion Transformer for Inertial Navigation
AAAI 2025
UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation
ICCV 2025
CmEAA: Cross-modal Enhancement and Alignment Adapter for Radiology Report Generation
COLING 2025
Towards Multilingual spoken Visual Question Answering system using Cross-Attention
COLING 2025
Less for More: Enhanced Feedback-aligned Mixed LLMs for Molecule Caption Generation and Fine-Grained NLI Evaluation
ACL 2025
WildSAT: Learning Satellite Image Representations from Wildlife Observations
ICCV 2025
Electron Density-enhanced Molecular Geometry Learning
IJCAI 2025
Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era
IJCAI 2025
CLIP-driven View-aware Prompt Learning for Unsupervised Vehicle Re-identification
AAAI 2025
Fine-Grained Spatial and Verbal Losses for 3D Visual Grounding
WACV 2025
Learning Visual Proxy for Compositional Zero-Shot Learning
ICCV 2025
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition
ICCV 2025
Cross-modal Collaborative Representation Learning for Text-to-Image Person Retrieval
IJCAI 2025
Bidirectional Multi-Step Domain Generalization for Visible-Infrared Person Re-Identification
WACV 2025
VILLS: Video-Image Learning to Learn Semantics for Person Re-Identification
WACV 2025
GLEAM: Enhanced Transferable Adversarial Attacks for Vision-Language Pre-training Models via Global-Local Transformations
ICCV 2025
TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm
WACV 2025
Seeking Proxy Point via Stable Feature Space for Noisy Correspondence Learning
IJCAI 2025
GlobalDoc: A Cross-Modal Vision-Language Framework for Real-World Document Image Retrieval and Classification
WACV 2025
Meta-Learning for Color-to-Infrared Cross-Modal Style Transfer
WACV 2025
Cross-Modal Learning for Music-to-Music-Video Description Generation
NAACL 2025
SSN_MMHS@DravidianLangTech 2025: A Dual Transformer Approach for Multimodal Hate Speech Detection in Dravidian Languages
NAACL 2025
PHGC: Procedural Heterogeneous Graph Completion for Natural Language Task Verification in Egocentric Videos
CVPR 2025
HiGarment: Cross-modal Harmony Based Diffusion Model for Flat Sketch to Realistic Garment Image
ICCV 2025
DTW-Align: Bridging the Modality Gap in End-to-End Speech Translation with Dynamic Time Warping Alignment
EMNLP 2025