Papers
Can I trust You? LLMs as conversational agents
Marc Döbler, Raghavendran Mahendravarman, Anna Moskvina et al.
Emulating Author Style: A Feasibility Study of Prompt-enabled Text Stylization with Off-the-Shelf LLMs
Avanti Bhandarkar, Ronald Wilson, Anushka Swarup et al.
LLMs Simulate Big5 Personality Traits: Further Evidence
Aleksandra Sorokovikova, Sharwin Rezagholi, Natalia Fedorova et al.
Quantifying learning-style adaptation in effectiveness of LLM teaching
Ruben Weijers, Gabrielle Fidelis de Castilho, Jean-François Godbout et al.
RAGs to Style: Personalizing LLMs with Style Embeddings
Abhiman Neelakanteswara, Shreyas Chaudhari, Hamed Zamani
Aligning Uncertainty: Leveraging LLMs to Analyze Uncertainty Transfer in Text Summarization
Zahra Kolagar, Alessandra Zarcone
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
Muhammad Jehanzeb Mirza, Leonid Karlinsky, Wei Lin et al.
Merlin: Empowering Multimodal LLMs with Foresight Minds
En Yu, Liang Zhao, Yana Wei et al.
LLM as Copilot for Coarse-grained Vision-and-Language Navigation
Yanyuan Qiao, Qianyi Liu, Jiajun Liu et al.
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Renrui Zhang, Dongzhi Jiang, Yichi Zhang et al.
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation
Yunhao Gou, Kai Chen, Zhili Liu et al.
Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos
Md Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang et al.
CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner
Tingbing Yan, Wenzheng Zeng, Yang Xiao et al.
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier et al.
LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model
Yulin Luo, Ruichuan An, Bocheng Zou et al.
Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection
Xingyu Peng, Yan Bai, Chen Gao et al.
LLMGA: Multimodal Large Language Model based Generation Assistant
Bin Xia, Shiyin Wang, Yingfan Tao et al.
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
Kunpeng Song, Yizhe Zhu, Bingchen Liu et al.
X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-modal Reasoning
Artemis Panagopoulou, Le Xue, Ning Yu et al.
How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs
Haoqin Tu, Chenhang Cui, Zijun Wang et al.
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale
Nina Shvetsova, Anna Kukleva, Xudong Hong et al.
ST-LLM: Large Language Models Are Effective Temporal Learners
Ruyang Liu, Chen Li, Haoran Tang et al.
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Keen You, Haotian Zhang, Eldon Schoop et al.
Zero-shot Text-guided Infinite Image Synthesis with LLM guidance
Soyeong Kwon, Taegyeong Lee, Taehwan Kim