Papers
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
EMNLP 2024
ReadMe++: Benchmarking Multilingual Language Models for Multi-Domain Readability Assessment
EMNLP 2024
Stratified Prediction-Powered Inference for Effective Hybrid Evaluation of Language Models
NeurIPS 2024
SimLex-999 for Dutch
COLING 2024
Efficient Benchmarking (of Language Models)
NAACL 2024