conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Beyond Verbal Cues: Emotional Contagion Graph Network for Causal Emotion Entailment
ACL 2025
‘No’ Matters: Out-of-Distribution Detection in Multimodality Multi-Turn Interactive Dialogue
ACL 2025
Double Entendre: Robust Audio-Based AI-Generated Lyrics Detection via Multi-View Fusion
ACL 2025
Don’t Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models
ACL 2025
Chain-Talker: Chain Understanding and Rendering for Empathetic Conversational Speech Synthesis
ACL 2025
Harnessing PDF Data for Improving Japanese Large Multimodal Models
ACL 2025
SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training
ACL 2025
Texts or Images? A Fine-grained Analysis on the Effectiveness of Input Representations and Models for Table Question Answering
ACL 2025
Enhancing Multimodal Unified Representations for Cross Modal Generalization
ACL 2025
Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation
ACL 2025
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning
ACL 2025
CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages
ACL 2025
BrainECHO: Semantic Brain Signal Decoding through Vector-Quantized Spectrogram Reconstruction for Whisper-Enhanced Text Generation
ACL 2025
Progressive LoRA for Multimodal Continual Instruction Tuning
ACL 2025
Reasoning is All You Need for Video Generalization: A Counterfactual Benchmark with Sub-question Evaluation
ACL 2025
VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration
ACL 2025
MANBench: Is Your Multimodal Model Smarter than Human?
ACL 2025
mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus
ACL 2025
DALR: Dual-level Alignment Learning for Multimodal Sentence Representation Learning
ACL 2025
ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs’ Capability via Chart Editing
ACL 2025
Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models
ACL 2025
Turbocharging Web Automation: The Impact of Compressed History States
ACL 2025
Making RALM Robust to Irrelevant Contexts via Layer Knowledge Guided Attention
ACL 2025
SignAlignLM: Integrating Multimodal Sign Language Processing into Large Language Models
ACL 2025
NegVQA: Can Vision Language Models Understand Negation?
ACL 2025