Papers
PlanningArena: A Modular Benchmark for Multidimensional Evaluation of Planning and Tool Learning
ACL 2025
LTRAG: Enhancing Autoformalization and Self-refinement for Logical Reasoning with Thought-Guided RAG
ACL 2025
Enhancing Complex Reasoning in Knowledge Graph Question Answering through Query Graph Approximation
ACL 2025
DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language Models
EMNLP 2025
Semantic Inversion, Identical Replies: Revisiting Negation Blindness in Large Language Models
EMNLP 2025