benchmark evaluation

1539 papers

Also known as

MT-BENCH BDC

Co-occurring keywords

large language model (12755); question answering (2904); multimodal learning (4622); language model (4573); multimodal large language model (865); vision-language model (2235); visual question answering (1000); evaluation benchmark (250); multilingual nlp (1423); benchmark dataset (619)

Papers

How much progress have we made on RST discourse parsing? A replication study of recent results on the RST-DT EMNLP 2017

Level Playing Field for Million Scale Face Recognition CVPR 2017

HPatches: A Benchmark and Evaluation of Handcrafted and Learned Local Descriptors CVPR 2017

Efficient Benchmarking of NLP APIs using Multi-armed Bandits EACL 2017

pke: an open source python-based keyphrase extraction toolkit COLING 2016

The MegaFace Benchmark: 1 Million Faces for Recognition at Scale CVPR 2016

Traditional Saliency Reloaded: A Good Old Model in New Shape CVPR 2015

Boosting Object Proposals: From Pascal to COCO ICCV 2015

Large Scale Multi-view Stereopsis Evaluation CVPR 2014

The Secrets of Salient Object Segmentation CVPR 2014

Online Object Tracking: A Benchmark CVPR 2013

Boundary Detection Benchmarking: Beyond F-Measures CVPR 2013

Near-Maximum Entropy Models for Binary Neural Representations of Natural Images NIPS 2007

An Extensive Empirical Study of Feature Selection Metrics for Text Classification JMLR 2003