conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Leveraging Taxonomy and LLMs for Improved Multimodal Hierarchical Classification
COLING 2025
Representation Purification for End-to-End Speech Translation
COLING 2025
Acquired TASTE: Multimodal Stance Detection with Textual and Structural Embeddings
COLING 2025
On the Effects of Fine-tuning Language Models for Text-Based Reinforcement Learning
COLING 2025
RRHF-V: Ranking Responses to Mitigate Hallucinations in Multimodal Large Language Models with Human Feedback
COLING 2025
Fine-Grained Features-based Code Search for Precise Query-Code Matching
COLING 2025
VideoQA-TA: Temporal-Aware Multi-Modal Video Question Answering
COLING 2025
Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection
COLING 2025
Piecing It All Together: Verifying Multi-Hop Multimodal Claims
COLING 2025
Charting the Future: Using Chart Question-Answering for Scalable Evaluation of LLM-Driven Data Visualizations
COLING 2025
Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs
COLING 2025
ProsodyFlow: High-fidelity Text-to-Speech through Conditional Flow Matching and Prosody Modeling with Large Speech Language Models
COLING 2025
SGMEA: Structure-Guided Multimodal Entity Alignment
COLING 2025
Multilingual and Explainable Text Detoxification with Parallel Corpora
COLING 2025
What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning
COLING 2025
TriFine: A Large-Scale Dataset of Vision-Audio-Subtitle for Tri-Modal Machine Translation and Benchmark with Fine-Grained Annotated Tags
COLING 2025
CmEAA: Cross-modal Enhancement and Alignment Adapter for Radiology Report Generation
COLING 2025
Semantic Reshuffling with LLM and Heterogeneous Graph Auto-Encoder for Enhanced Rumor Detection
COLING 2025
Multi-Modal Entities Matter: Benchmarking Multi-Modal Entity Alignment
COLING 2025
From Traits to Empathy: Personality-Aware Multimodal Empathetic Response Generation
COLING 2025
Integrating Visual Modalities with Large Language Models for Mental Health Support
COLING 2025
OVEL: Online Video Entity Linking
COLING 2025
Towards Multilingual spoken Visual Question Answering system using Cross-Attention
COLING 2025
Generation-Based and Emotion-Reflected Memory Update: Creating the KEEM Dataset for Better Long-Term Conversation
COLING 2025
CACA: Context-Aware Cross-Attention Network for Extractive Aspect Sentiment Quad Prediction
COLING 2025