Papers
4,428 papers found
MLLM-LLaVA-FL: Multimodal Large Language Model Assisted Federated Learning
Jianyi Zhang, Hao Yang, Ang Li et al.
MLLM-Tool: A Multimodal Large Language Model for Tool Agent Learning
Chenyu Wang, Weixin Luo, Sixun Dong et al.
Modality-Incremental Learning with Disjoint Relevance Mapping Networks for Image-Based Semantic Segmentation
Niharika Hegde, Shishir Muralidhara, René Schuster et al.
Moment of Untruth: Dealing with Negative Queries in Video Moment Retrieval
Kevin Flanagan, Dima Damen, Michael Wray
MONAS-ESNN: Multi-Objective Neural Architecture Search for Efficient Spiking Neural Networks
Esmat Ghasemi Saghand, Susana K. Lai-Yuen
MonoPP: Metric-Scaled Self-Supervised Monocular Depth Estimation by Planar-Parallax Geometry in Automotive Applications
Gasser Elazab, Torben Gräber, Michael Unterreiner et al.
MOOSS: Mask-Enhanced Temporal Contrastive Learning for Smooth State Evolution in Visual Reinforcement Learning
Jiarui Sun, M. Ugur Akcal, Girish Chowdhary et al.
MoRAG - Multi-Fusion Retrieval Augmented Generation for Human Motion
Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla
MRI Reconstruction with Regularized 3D Diffusion Model (R3DM)
Arya Bangun, Zhuo Cao, Alessio Quercia et al.
MS-Glance: Bio-Inspired Non-Semantic Context Vectors and their Applications in Supervising Image Reconstruction
Ziqi Gao, Wendi Yang, Yujia Li et al.
MSI-NeRF: Linking Omni-Depth with View Synthesis through Multi-Sphere Image Aided Generalizable Neural Radiance Field
Dongyu Yan, Guanyu Huang, Fengyu Quan et al.
MulModSeg: Enhancing Unpaired Multi-Modal Medical Image Segmentation with Modality-Conditioned Text Embedding and Alternating Training
Chengyin Li, Hui Zhu, Rafi Ibn Sultan et al.
Multi-Aperture Transformers for 3D (MAT3D) Segmentation of Clinical and Microscopic Images
Muhammad Sohaib, Siyavash Shabani, Sahar A. Mohammed et al.
Multi-Class Textual-Inversion Secretly Yields a Semantic-Agnostic Classifier
Kai Wang, Fei Yang, Bogdan Raducanu et al.
Multi-HexPlanes: A Lightweight Map Representation for Rendering and 3D Reconstruction
Jianhao Zheng, Gábor Valasek, Daniel Barath et al.
Multi-Label Continual Learning for the Medical Domain: A Novel Benchmark
Marina Ceccon, Davide Dalle Pezze, Alessandro Fabris et al.
Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets
Adrian Iordache, Bogdan Alexe, Radu Tudor Ionescu
Multimodal Fusion Learning with Dual Attention for Medical Imaging
Joy Dhar, Nayyar Zaidi, Maryam Haghighat et al.
Multimodal Interpretable Depression Analysis Using Visual, Physiological, Audio and Textual Data
Puneet Kumar, Shreshtha Misra, Zhuhong Shao et al.
Multi-Modal Large Language Models are Effective Vision Learners
Li Sun, Chaitanya Ahuja, Peng Chen et al.
Multi-Modal Large Language Model with RAG Strategies in Soccer Commentary Generation
Xiang Li, Yangfan He, Shuaishuai Zu et al.
Multi-Resolution Guided 3D GANs for Medical Image Translation
Juhyung Ha, Jong Sung Park, David Crandall et al.
Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation
Hugo Porta, Emanuele Dalsasso, Diego Marcos et al.
Multi-Spectral Image Color Reproduction
Jiacheng Li, Chang Chen, Xue Hu et al.
Multispectral Object Detection Enhanced by Cross-Modal Information Complementary and Cosine Similarity Channel Resampling Modules
Junbo Jang, Chanyeong Park, Heegwang Kim et al.