Research Explorer
video captioning
206 papers
Also known as
MCN
Co-occurring keywords
video understanding (1647)
multimodal learning (4622)
image captioning (728)
recurrent neural network (1790)
video description (25)
action recognition (957)
attention mechanism (3975)
natural language generation (782)
vision-language model (2235)
contrastive learning (3979)
Papers
Vript: A Video Is Worth Thousands of Words (NIPS 2024)
OW-VISCapTor: Abstractors for Open-World Video Instance Segmentation and Captioning (NIPS 2024)
MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning (CVPR 2024)
Previously on ... From Recaps to Story Summarization (CVPR 2024)
HourVideo: 1-Hour Video-Language Understanding (NIPS 2024)
CALVIN: Improved Contextual Video Captioning via Instruction Tuning (NIPS 2024)
UNICORN: A Unified Causal Video-Oriented Language-Modeling Framework for Temporal Video-Language Tasks (EMNLP 2024)
Retrieval-Augmented Egocentric Video Captioning (CVPR 2024)
VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)
Distilling Vision-Language Models on Millions of Videos (CVPR 2024)
MovieChat: From Dense Token to Sparse Memory for Long Video Understanding (CVPR 2024)
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding (CVPR 2024)
DeVAn: Dense Video Annotation for Video-Language Models (ACL 2024)
VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding (NIPS 2024)
Streaming Dense Video Captioning (CVPR 2024)
Comprehensive Visual Grounding for Video Description (AAAI 2024)
Stitching Segments and Sentences towards Generalization in Video-Text Pre-training (AAAI 2024)
Abstractive Multi-Video Captioning: Benchmark Dataset Construction and Extensive Evaluation (COLING 2024)
Unveiling the Invisible: Captioning Videos with Metaphors (EMNLP 2024)
AutoAD III: The Prequel - Back to the Pixels (CVPR 2024)
Set Prediction Guided by Semantic Concepts for Diverse Video Captioning (AAAI 2024)
Text With Knowledge Graph Augmented Transformer for Video Captioning (CVPR 2023)
Exploring Group Video Captioning with Efficient Relational Approximation (ICCV 2023)
Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023)
Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval (ICCV 2023)