Papers
Leveraging Large Models to Evaluate Novel Content: A Case Study on Advertisement Creativity
EMNLP 2025
Enhancing Table Recognition with Vision LLMs: A Benchmark and Neighbor-Guided Toolchain Reasoner
IJCAI 2025
LLM Agents Making Agent Tools
ACL 2025
U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in Large Language Models
ACL 2025