Papers
Touchstone Benchmark: Are We on the Right Way for Evaluating AI Algorithms for Medical Segmentation?
NIPS 2024
A Benchmark Suite for Evaluating Neural Mutual Information Estimators on Unstructured Datasets
NIPS 2024
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
NIPS 2024