conftrace
Large Language Models
9,067 papers
Papers per year
2010: 1
2013: 1
2017: 1
2018: 14
2019: 129
2020: 336
2021: 463
2022: 582
2023: 1165
2024: 2492
2025: 3325
2026: 558
Papers
Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought
CVPR 2025
COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts
CVPR 2025
Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge
CVPR 2025
A Simple yet Effective Layout Token in Large Language Models for Document Understanding
CVPR 2025
Task-aware Cross-modal Feature Refinement Transformer with Large Language Models for Visual Grounding
CVPR 2025
StarVector: Generating Scalable Vector Graphics Code from Images and Text
CVPR 2025
LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
CVPR 2025
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation
CVPR 2025
ChatHuman: Chatting about 3D Humans with Tools
CVPR 2025
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
CVPR 2025
MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments
CVPR 2025
Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding
CVPR 2025
Online Video Understanding: OVBench and VideoChat-Online
CVPR 2025
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Models
CVPR 2025
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
CVPR 2025
Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models?
CVPR 2025
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
CVPR 2025
Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents
CVPR 2025
FastVLM: Efficient Vision Encoding for Vision Language Models
CVPR 2025
FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training
CVPR 2025
RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models
CVPR 2025
LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models
CVPR 2025
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding
CVPR 2025
Human Motion Instruction Tuning
CVPR 2025
Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering
CVPR 2025