Papers
AfriMTEB and AfriE5: Benchmarking and Adapting Text Embedding Models for African Languages
EACL 2026
Taxation Perspectives from Large Language Models: A Case Study on Additional Tax Penalties
EACL 2026
SCENEBench: An Audio Understanding Benchmark Grounded in Assistive and Industrial Use Cases
EACL 2026
Do You See Me: A Multidimensional Benchmark for Evaluating Visual Perception in Multimodal LLMs
EACL 2026
BOP-Distrib: Revisiting 6D Pose Estimation Benchmarks for Better Evaluation under Visual Ambiguities
WACV 2026
HumanBench: Two Heads, No Legs, But Mostly Human, the State of Generative Capabilities in T2I Models
WACV 2026
When Can We Trust LLMs in Mental Health? Large-Scale Benchmarks for Reliable LLM Evaluation
EACL 2026