cross-modal learning

521 papers

Also known as

CMP C3HOST

Co-occurring keywords

multimodal learning (4622), contrastive learning (3979), knowledge distillation (3680), representation learning (6174), multi-modal learning (1276), vision-language model (2235), self-supervised learning (3751), domain adaptation (4578), video understanding (1647), zero-shot learning (3637)

Papers

Steering Representations, Safeguarding Privacy: A Cross-Modal Privacy Protection Method for Generative AI AAAI 2026

AXON: Action Characterization Through Cross-Modal Knowledge Distillation for Neurodiverse Individuals AAAI 2026

Mitigating Endogenous Confirmation Bias in Noisy Label Learning for Vision-Language Models AAAI 2026

Gotta Hear Them All: Towards Sound Source Aware Audio Generation AAAI 2026

DHCM-CACL: Dynamic Hierarchical Cross-modal Mamba with Confidence-Adaptive Contrastive Learning for Multimodal Emotion Recognition AAAI 2026

Distilling Cross-Modal Knowledge into Domain-Specific Retrievers for Enhanced Industrial Document Understanding EMNLP 2025

CLIP-driven View-aware Prompt Learning for Unsupervised Vehicle Re-identification AAAI 2025

PHGC: Procedural Heterogeneous Graph Completion for Natural Language Task Verification in Egocentric Videos CVPR 2025

Learning to See through Sound: From VggCaps to Multi2Cap for Richer Automated Audio Captioning EMNLP 2025

Less for More: Enhanced Feedback-aligned Mixed LLMs for Molecule Caption Generation and Fine-Grained NLI Evaluation ACL 2025

Zero-shot Multimodal Document Retrieval via Cross-modal Question Generation EMNLP 2025

MINIMA: Modality Invariant Image Matching CVPR 2025

CNC: Cross-modal Normality Constraint for Unsupervised Multi-class Anomaly Detection AAAI 2025

iMoT: Inertial Motion Transformer for Inertial Navigation AAAI 2025

NeKo: Cross-Modality Post-Recognition Error Correction with Tasks-Guided Mixture-of-Experts Language Model ACL 2025

Language-Guided Audio-Visual Learning for Long-Term Sports Assessment CVPR 2025

A Cross-Modal Densely Guided Knowledge Distillation Based on Modality Rebalancing Strategy for Enhanced Unimodal Emotion Recognition IJCAI 2025

SSLFusion: Scale and Space Aligned Latent Fusion Model for Multimodal 3D Object Detection AAAI 2025

Bridging Semantic and Modality Gaps in Zero-Shot Captioning via Retrieval from Synthetic Data EMNLP 2025

CTYUN-AI at SemEval-2025 Task 1: Learning to Rank for Idiomatic Expressions SEMEVAL 2025

VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE ICCV 2025

Graph-Based Cross-Domain Knowledge Distillation for Cross-Dataset Text-to-Image Person Retrieval AAAI 2025

Generative Planning with 3D-Vision Language Pre-training for End-to-End Autonomous Driving AAAI 2025

Omni-Query Active Learning for Source-Free Domain Adaptive Cross-Modality 3D Semantic Segmentation AAAI 2025

UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation ICCV 2025