Research Explorer
Reasoning
2595 directly classified papers
Papers per year
2003: 2
2006: 2
2007: 6
2008: 4
2009: 4
2010: 7
2011: 7
2012: 10
2013: 17
2014: 18
2015: 14
2016: 9
2017: 42
2018: 62
2019: 121
2020: 131
2021: 187
2022: 280
2023: 299
2024: 537
2025: 832
2026: 4
Papers
TelAgentBench: A Multi-faceted Benchmark for Evaluating LLM-based Agents in Telecommunications
EMNLP 2025
U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in Large Language Models
ACL 2025
Temporal Information Retrieval via Time-Specifier Model Merging
ACL 2025
Can LLMs Recognize Their Own Analogical Hallucinations? Evaluating Uncertainty Estimation for Analogical Reasoning
ACL 2025
Theorem-of-Thought: A Multi-Agent Framework for Abductive, Deductive, and Inductive Reasoning in Language Models
ACL 2025
ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving
ACL 2025
Does “Reasoning” with Large Language Models Improve Recognizing, Generating and Reframing Unhelpful Thoughts?
ACL 2025
The Art of Tool Interface Design
ACL 2025
ToolReflection: Improving Large Language Models for Real-World API Calls with Self-Generated Data
ACL 2025
Snap Out of It: A Dual-Process Approach to Mitigating Overthinking in Language Model Reasoning
ACL 2025
StateAct: Enhancing LLM Base Agents via Self-prompting and State-tracking
ACL 2025
The ClimateCheck Shared Task: Scientific Fact-Checking of Social Media Claims about Climate Change
ACL 2025
iai_MSU at SemEval-2025 Task-3: Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes in English
ACL 2025
UZH at SemEval-2025 Task 3: Token-Level Self-Consistency for Hallucination Detection
ACL 2025
NCL-UoR at SemEval-2025 Task 3: Detecting Multilingual Hallucination and Related Observable Overgeneration Text Spans with Modified RefChecker and Modified SeflCheckGPT
ACL 2025
Tables as Thought: Exploring Structured Thoughts in LLM Reasoning
ACL 2025
Sparks of Tabular Reasoning via Text2SQL Reinforcement Learning
ACL 2025
DiaDP@XLLM25: Advancing Chinese Dialogue Parsing via Unified Pretrained Language Models and Biaffine Dependency Scoring
ACL 2025
LLMSR@XLLM25: Less is More: Enhancing Structured Multi-Agent Reasoning via Quality-Guided Distillation
ACL 2025
R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning
EMNLP 2025
ModeLing: A Novel Dataset for Testing Linguistic Reasoning in Language Models
NAACL 2025
LATTE: Learning to Think with Vision Specialists
EMNLP 2025
From Causal Parrots to Causal Prophets? Towards Sound Causal Reasoning with Large Language Models
NAACL 2025
Can Prompts Rewind Time for LLMs? Evaluating the Effectiveness of Prompted Knowledge Cutoffs
EMNLP 2025
Beyond Image Classification: A Video Benchmark and Dual-Branch Hybrid Discrimination Framework for Compositional Zero-Shot Learning
CVPR 2025