Research Explorer
Machine Learning › Learning Types › Evaluation
1654 directly classified papers
Papers per year
2005: 1
2006: 1
2007: 1
2008: 2
2009: 1
2010: 3
2011: 2
2012: 3
2013: 5
2014: 4
2015: 4
2016: 11
2017: 19
2018: 32
2019: 39
2020: 72
2021: 110
2022: 202
2023: 222
2024: 351
2025: 569
Papers
What Makes Reading Comprehension Questions Difficult?
ACL 2022
Nibbling at the Hard Core of Word Sense Disambiguation
ACL 2022
SRL4E – Semantic Role Labeling for Emotions: A Unified Evaluation Framework
ACL 2022
DiBiMT: A Novel Benchmark for Measuring Word Sense Disambiguation Biases in Machine Translation
ACL 2022
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
ACL 2022
ILDAE: Instance-Level Difficulty Analysis of Evaluation Data
ACL 2022
TruthfulQA: Measuring How Models Mimic Human Falsehoods
ACL 2022
InteractEva: A Simulation-Based Evaluation Framework for Interactive AI Systems
AAAI 2022
MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation
AAAI 2022
PUMA: Performance Unchanged Model Augmentation for Training Data Removal
AAAI 2022
On the Use of Unrealistic Predictions in Hundreds of Papers Evaluating Graph Representations
AAAI 2022
Towards a Rigorous Evaluation of Time-Series Anomaly Detection
AAAI 2022
Iteratively Selecting an Easy Reference Frame Makes Unsupervised Video Object Segmentation Easier
AAAI 2022
Which Model is Best: Comparing Methods and Metrics for Automatic Laughter Detection in a Naturalistic Conversational Dataset
INTERSPEECH 2022
Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric
INTERSPEECH 2022
Qualitative Evaluation of Language Model Rescoring in Automatic Speech Recognition
INTERSPEECH 2022
Effects of Noise on Speech Perception and Spoken Word Comprehension
INTERSPEECH 2022
Evaluating the effects of modified speech on perceptual speaker identification performance
INTERSPEECH 2022
Are reported accuracies in the clinical speech machine learning literature overoptimistic?
INTERSPEECH 2022
Predicting label distribution improves non-intrusive speech quality estimation
INTERSPEECH 2022
Not a Number: Identifying Instance Features for Capability-Oriented Evaluation
IJCAI 2022
When AUC meets DRO: Optimizing Partial AUC for Deep Learning with Non-Convex Convergence Guarantee
ICML 2022
Data-SUITE: Data-centric identification of in-distribution incongruous examples
ICML 2022
Stable Conformal Prediction Sets
ICML 2022
Re-Examining System-Level Correlations of Automatic Summarization Evaluation Metrics
NAACL 2022