conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Sentimental Image Generation for Aspect-based Sentiment Analysis
ACL 2025
Investigating Prosodic Signatures via Speech Pre-Trained Models for Audio Deepfake Source Attribution
ACL 2025
MATCHED: Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data
ACL 2025
Vision Language Model Helps Private Information De-Identification in Vision Data
ACL 2025
Unveiling Privacy Risks in Multi-modal Large Language Models: Task-specific Vulnerabilities and Mitigation Challenges
ACL 2025
MM-R3: On (In-)Consistency of Vision-Language Models (VLMs)
ACL 2025
Shadow-Activated Backdoor Attacks on Multimodal Large Language Models
ACL 2025
Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding
ACL 2025
SynFix: Dependency-Aware Program Repair via RelationGraph Analysis
ACL 2025
AdaV: Adaptive Text-visual Redirection for Vision-Language Models
ACL 2025
ASPO: Adaptive Sentence-Level Preference Optimization for Fine-Grained Multimodal Reasoning
ACL 2025
MIRe: Enhancing Multimodal Queries Representation via Fusion-Free Modality Interaction for Multimodal Retrieval
ACL 2025
AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding
ACL 2025
Multimodal Causal Reasoning Benchmark: Challenging Multimodal Large Language Models to Discern Causal Links Across Modalities
ACL 2025
VCD: A Dataset for Visual Commonsense Discovery in Images
ACL 2025
ProMedTS: A Self-Supervised, Prompt-Guided Multimodal Approach for Integrating Medical Text and Time Series
ACL 2025
Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models
ACL 2025
Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
ACL 2025
Advancing General Multimodal Capability of Vision-language Models with Pyramid-descent Visual Position Encoding
ACL 2025
EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models
ACL 2025
Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks
ACL 2025
IW-Bench: Evaluating Large Multimodal Models for Converting Image-to-Web
ACL 2025
DeepRTL2: A Versatile Model for RTL-Related Tasks
ACL 2025
Cross-lingual Multimodal Sentiment Analysis for Low-Resource Languages via Language Family Disentanglement and Rethinking Transfer
ACL 2025
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
ACL 2025