Evaluation

1654 directly classified papers

Papers

Digging Errors in NMT: Evaluating and Understanding Model Errors from Partial Hypothesis Space EMNLP 2022

Estimating Example Difficulty Using Variance of Gradients CVPR 2022

Measuring Compositional Consistency for Video Question Answering CVPR 2022

Towards Driving-Oriented Metric for Lane Detection Models CVPR 2022

Do Explanations Explain? Model Knows Best CVPR 2022

OoD-Bench: Quantifying and Understanding Two Dimensions of Out-of-Distribution Generalization CVPR 2022

Texture-Based Error Analysis for Image Super-Resolution CVPR 2022

Bipartite-play Dialogue Collection for Practical Automatic Evaluation of Dialogue Systems AACL 2022

IM2: an Interpretable and Multi-category Integrated Metric Framework for Automatic Dialogue Evaluation EMNLP 2022

Questioning the Validity of Summarization Datasets and Improving Their Factual Consistency EMNLP 2022

Analyzing and Evaluating Faithfulness in Dialogue Summarization EMNLP 2022

FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation EMNLP 2022

A Multifaceted Framework to Evaluate Evasion, Content Preservation, and Misattribution in Authorship Obfuscation Techniques EMNLP 2022

Towards a Unified Multi-Dimensional Evaluator for Text Generation EMNLP 2022

EvEntS ReaLM: Event Reasoning of Entity States via Language Models EMNLP 2022

Geographic Citation Gaps in NLP Research EMNLP 2022

How Large Language Models are Transforming Machine-Paraphrase Plagiarism EMNLP 2022

QRelScore: Better Evaluating Generated Questions with Deeper Understanding of Context-aware Relevance EMNLP 2022

Exploring the Effects of Negation and Grammatical Tense on Bias Probes AACL 2022

YASO: A Targeted Sentiment Analysis Evaluation Dataset for Open-Domain Reviews EMNLP 2021

Chinese WPLC: A Chinese Dataset for Evaluating Pretrained Language Models on Word Prediction Given Long-Range Context EMNLP 2021

Finding a Balanced Degree of Automation for Summary Evaluation EMNLP 2021

How much coffee was consumed during EMNLP 2019? Fermi Problems: A New Reasoning Challenge for AI EMNLP 2021

Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation EMNLP 2021

Proxy Indicators for the Quality of Open-domain Dialogues EMNLP 2021