Evaluation

1654 directly classified papers

Papers

Can Edge Probing Tests Reveal Linguistic Knowledge in QA Models? COLING 2022

Stability of Syntactic Dialect Classification over Space and Time COLING 2022

Deepchecks: A Library for Testing and Validating Machine Learning Models and Data JMLR 2022

On Structured Filtering-Clustering: Global Error Bound and Optimal First-Order Algorithms AISTATS 2022

An Empirical Study of Pipeline vs. Joint approaches to Entity and Relation Extraction IJCNLP 2022

Assessing Combinational Generalization of Language Models in Biased Scenarios IJCNLP 2022

Identifying Weaknesses in Machine Translation Metrics Through Minimum Bayes Risk Decoding: A Case Study for COMET IJCNLP 2022

Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation IJCNLP 2022

ACES: Translation Accuracy Challenge Sets for Evaluating Machine Translation Metrics EMNLP 2022

Automated Evaluation Metric for Terminology Consistency in MT EMNLP 2022

Continuous Rating as Reliable Human Evaluation of Simultaneous Speech Translation EMNLP 2022

Findings of the WMT 2022 Shared Task on Quality Estimation EMNLP 2022

Exploring The Landscape of Distributional Robustness for Question Answering Models EMNLP 2022

Are Neural Topic Models Broken? EMNLP 2022

Sarcasm Detection is Way Too Easy! An Empirical Comparison of Human and Machine Sarcasm Detection EMNLP 2022

EnDex: Evaluation of Dialogue Engagingness at Scale EMNLP 2022

On the Impact of Temporal Concept Drift on Model Explanations EMNLP 2022

Language Prior Is Not the Only Shortcut: A Benchmark for Shortcut Learning in VQA EMNLP 2022

Simple but Challenging: Natural Language Inference Models Fail on Simple Sentences EMNLP 2022

Scientific and Creative Analogies in Pretrained Language Models EMNLP 2022

A Few More Examples May Be Worth Billions of Parameters EMNLP 2022

Measuring and Improving Semantic Diversity of Dialogue Generation EMNLP 2022

Language Models Are Poor Learners of Directional Inference EMNLP 2022

Impact of Pretraining Term Frequencies on Few-Shot Numerical Reasoning EMNLP 2022

Machine translation impact in E-commerce multilingual search EMNLP 2022