Papers
U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in Large Language Models
ACL 2025
Parrot: A Training Pipeline Enhances Both Program CoT and Natural Language CoT for Reasoning
EMNLP 2025
Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning
EMNLP 2025