Papers
II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models
NeurIPS 2024
LHMKE: A Large-scale Holistic Multi-subject Knowledge Evaluation Benchmark for Chinese Large Language Models
COLING 2024
Understanding the Effects of Noise in Text-to-SQL: An Examination of the BIRD-Bench Benchmark
ACL 2024