Research Explorer
video captioning
206 papers
Also known as
MCN
Co-occurring keywords
video understanding (1647)
multimodal learning (4622)
image captioning (728)
recurrent neural network (1790)
video description (25)
action recognition (957)
attention mechanism (3975)
natural language generation (782)
vision-language model (2235)
contrastive learning (3979)
Papers
Attend and Interact: Higher-Order Object Interactions for Video Understanding
CVPR 2018
Jointly Localizing and Describing Events for Dense Video Captioning
CVPR 2018
End-to-End Dense Video Captioning With Masked Transformer
CVPR 2018
TVT: Two-View Transformer Network for Video Captioning
ACML 2018
A Dataset for Telling the Stories of Social Media Videos
EMNLP 2018
Video Captioning with Tube Features
IJCAI 2018
StyleNet: Generating Attractive Visual Captions With Styles
CVPR 2017
Reinforced Video Captioning with Entailment Rewards
EMNLP 2017
MAM-RNN: Multi-level Attention Model Based RNN for Video Captioning
IJCAI 2017
Video Captioning With Transferred Semantic Attributes
CVPR 2017
Improving Interpretability of Deep Neural Networks With Semantic Information
CVPR 2017
Supervising Neural Attention Models for Video Captioning by Human Gaze Data
CVPR 2017
Video Highlight Prediction Using Audience Chat Reactions
EMNLP 2017
Semantic Compositional Networks for Visual Captioning
CVPR 2017
Procedural Text Generation from an Execution Video
IJCNLP 2017
End-To-End Concept Word Detection for Video Captioning, Retrieval, and Question Answering
CVPR 2017
Hierarchical Boundary-Aware Neural Encoder for Video Captioning
CVPR 2017
Task-Driven Dynamic Fusion: Reducing Ambiguity in Video Description
CVPR 2017
Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning
IJCAI 2017
Multi-Task Video Captioning with Video and Entailment Generation
ACL 2017
Generating Video Description using Sequence-to-sequence Model with Temporal Attention
COLING 2016
MSR-VTT: A Large Video Description Dataset for Bridging Video and Language
CVPR 2016
TGIF: A New Dataset and Benchmark on Animated GIF Description
CVPR 2016
Jointly Modeling Embedding and Translation to Bridge Video and Language
CVPR 2016
Hierarchical Recurrent Neural Encoder for Video Representation With Application to Captioning
CVPR 2016