Papers
Speech ReaLLM – Real-time Speech Recognition with Multimodal Language Models by Teaching the Flow of Time
INTERSPEECH 2024
A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis
AAAI 2024
Exploring Curriculum Learning for Vision-Language Tasks: A Study on Small-Scale Multimodal Training
CoNLL 2024