Multi-Modal Learning

1213 directly classified papers

Papers

Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction ACL 2024

How Do Conversational Agents in Healthcare Impact on Patient Agency? EACL 2024

cantnlp@LT-EDI-2024: Automatic Detection of Anti-LGBTQ+ Hate Speech in Under-resourced Languages EACL 2024

Sample-Level Cross-View Similarity Learning for Incomplete Multi-View Clustering AAAI 2024

Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences ACL 2024

MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing ACL 2024

A Mapping on Current Classifying Categories of Emotions Used in Multimodal Models for Emotion Recognition EACL 2024

Extended Multimodal Hate Speech Event Detection During Russia-Ukraine Crisis - Shared Task at CASE 2024 EACL 2024

CUET_Binary_Hackers at ClimateActivism 2024: A Comprehensive Evaluation and Superior Performance of Transformer-Based Models in Hate Speech Event Detection and Stance Classification for Climate Activism EACL 2024

MasonPerplexity at ClimateActivism 2024: Integrating Advanced Ensemble Techniques and Data Augmentation for Climate Activism Stance and Hate Event Identification EACL 2024

MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate Speech and Target Detection Using Transformer Ensembles EACL 2024

Exploring hybrid approaches to readability: experiments on the complementarity between linguistic features and transformers EACL 2024

Incomplete Contrastive Multi-View Clustering with High-Confidence Guiding AAAI 2024

ECHO-GL: Earnings Calls-Driven Heterogeneous Graph Learning for Stock Movement Prediction AAAI 2024

MDGNN: Multi-Relational Dynamic Graph Neural Network for Comprehensive and Dynamic Stock Investment Prediction AAAI 2024

AFL-Net: Integrating Audio, Facial, and Lip Modalities with a Two-step Cross-attention for Robust Speaker Diarization in the Wild INTERSPEECH 2024

MEEL: Multi-Modal Event Evolution Learning ACL 2024

IITK at SemEval-2024 Task 10: Who is the speaker? Improving Emotion Recognition and Flip Reasoning in Conversations via Speaker Embeddings SEMEVAL 2024

NUS-Emo at SemEval-2024 Task 3: Instruction-Tuning LLM for Multimodal Emotion-Cause Analysis in Conversations SEMEVAL 2024

JMI at SemEval 2024 Task 3: Two-step approach for multimodal ECAC using in-context learning with GPT and instruction-tuned Llama models SEMEVAL 2024

Relational Programming with Foundational Models AAAI 2024

Spatial-Related Sensors Matters: 3D Human Motion Reconstruction Assisted with Textual Semantics AAAI 2024

MultiVerse: Efficient and Expressive Zero-Shot Multi-Task Text-to-Speech EMNLP 2024

CAMEL: Capturing Metaphorical Alignment with Context Disentangling for Multimodal Emotion Recognition AAAI 2024

Unraveling Babel: Exploring Multilingual Activation Patterns of LLMs and Their Applications EMNLP 2024