conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
SHuBERT: Self-Supervised Sign Language Representation Learning via Multi-Stream Cluster Prediction
ACL 2025
Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models
ACL 2025
CoachMe: Decoding Sport Elements with a Reference-Based Coaching Instruction Generation Model
ACL 2025
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation
ACL 2025
EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits
ACL 2025
Synergizing Unsupervised Episode Detection with LLMs for Large-Scale News Events
ACL 2025
Beyond True or False: Retrieval-Augmented Hierarchical Analysis of Nuanced Claims
ACL 2025
VISA: Retrieval Augmented Generation with Visual Source Attribution
ACL 2025
Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images
ACL 2025
SpeechIQ: Speech-Agentic Intelligence Quotient Across Cognitive Levels in Voice Understanding by Large Language Models
ACL 2025
ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding
ACL 2025
Benchmarking and Improving Large Vision-Language Models for Fundamental Visual Graph Understanding and Reasoning
ACL 2025
Activating Distributed Visual Region within LLMs for Efficient and Effective Vision-Language Training and Inference
ACL 2025
CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models
ACL 2025
Multi-Modality Expansion and Retention for LLMs through Parameter Merging and Decoupling
ACL 2025
IMOL: Incomplete-Modality-Tolerant Learning for Multi-Domain Fake News Video Detection
ACL 2025
Hidden in Plain Sight: Evaluation of the Deception Detection Capabilities of LLMs in Multimodal Settings
ACL 2025
HintsOfTruth: A Multimodal Checkworthiness Detection Dataset with Real and Synthetic Claims
ACL 2025
It’s Not a Walk in the Park! Challenges of Idiom Translation in Speech-to-text Systems
ACL 2025
A Parameter-Efficient and Fine-Grained Prompt Learning for Vision-Language Models
ACL 2025
Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions
ACL 2025
Multimodal Coreference Resolution for Chinese Social Media Dialogues: Dataset and Benchmark Approach
ACL 2025
REAL-MM-RAG: A Real-World Multi-Modal Retrieval Benchmark
ACL 2025
UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces
ACL 2025
HELIOS: Harmonizing Early Fusion, Late Fusion, and LLM Reasoning for Multi-Granular Table-Text Retrieval
ACL 2025