conftrace
Large Language Models
6,405 papers
Papers per year
2007: 3
2017: 2
2018: 3
2019: 10
2020: 49
2021: 53
2022: 188
2023: 558
2024: 1910
2025: 3619
2026: 10
Papers
MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities
CVPR 2025
StoryGPT-V: Large Language Models as Consistent Story Visualizers
CVPR 2025
Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge
CVPR 2025
HEIE: MLLM-Based Hierarchical Explainable AIGC Image Implausibility Evaluator
CVPR 2025
StarVector: Generating Scalable Vector Graphics Code from Images and Text
CVPR 2025
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
CVPR 2025
ChatHuman: Chatting about 3D Humans with Tools
CVPR 2025
ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems
CVPR 2025
Reconstructing Animals and the Wild
CVPR 2025
Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding
CVPR 2025
Online Video Understanding: OVBench and VideoChat-Online
CVPR 2025
Scene Map-based Prompt Tuning for Navigation Instruction Generation
CVPR 2025
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
CVPR 2025
Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models?
CVPR 2025
PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation
CVPR 2025
SKE-Layout: Spatial Knowledge Enhanced Layout Generation with LLMs
CVPR 2025
FastVLM: Efficient Vision Encoding for Vision Language Models
CVPR 2025
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis
CVPR 2025
HalLoc: Token-level Localization of Hallucinations for Vision Language Models
CVPR 2025
LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models
CVPR 2025
Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification
CVPR 2025
Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning
CVPR 2025
BadToken: Token-level Backdoor Attacks to Multi-modal Large Language Models
CVPR 2025
CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering
CVPR 2025
SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding
CVPR 2025