Research Explorer
Image Captioning
781 directly classified papers
Papers per year
2003: 1
2008: 1
2011: 1
2012: 1
2013: 5
2014: 2
2015: 21
2016: 17
2017: 36
2018: 47
2019: 92
2020: 73
2021: 96
2022: 91
2023: 107
2024: 86
2025: 96
2026: 8
Papers
Connecting Vision and Language With Video Localized Narratives
CVPR 2023
Interactive and Explainable Region-Guided Radiology Report Generation
CVPR 2023
Pay Attention to Implicit Attribute Values: A Multi-modal Generative Framework for AVE Task
ACL 2023
End-to-End 3D Dense Captioning With Vote2Cap-DETR
CVPR 2023
Learning To Dub Movies via Hierarchical Prosody Models
CVPR 2023
3D Change Localization and Captioning From Dynamic Scans of Indoor Scenes
WACV 2023
Dual Video Summarization: From Frames to Captions
IJCAI 2023
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
CVPR 2023
Text With Knowledge Graph Augmented Transformer for Video Captioning
CVPR 2023
REVEAL: Retrieval-Augmented Visual-Language Pre-Training With Multi-Source Multimodal Knowledge Memory
CVPR 2023
GLUECons: A Generic Benchmark for Learning under Constraints
AAAI 2023
Evidential Interactive Learning for Medical Image Captioning
ICML 2023
A-Cap: Anticipation Captioning With Commonsense Knowledge
CVPR 2023
Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
CVPR 2023
Model-Agnostic Gender Debiased Image Captioning
CVPR 2023
JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models
CoNLL 2023
Aesthetically Relevant Image Captioning
AAAI 2023
Position-Guided Text Prompt for Vision-Language Pre-Training
CVPR 2023
JourneyDB: A Benchmark for Generative Image Understanding
NeurIPS 2023
DiffusEmp: A Diffusion Model-Based Framework with Multi-Grained Control for Empathetic Response Generation
ACL 2023
Crossing the Gap: Domain Generalization for Image Captioning
CVPR 2023
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
NeurIPS 2023
LaTr: Layout-Aware Transformer for Scene-Text VQA
CVPR 2022
Topic-aware Multimodal Summarization
AACL 2022
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization
EMNLP 2022