Interpretability

7318 directly classified papers

Papers

Explainable Oracle Bone Script Recognition via Multimodal Pictographic Reasoning AAAI 2026

Judging by the Rules: Compliance-Aligned Framework for Modern Slavery Statement Monitoring AAAI 2026

On Trustworthy, Explainable, and Verifiable High-Level Autonomy via Hierarchical Planning AAAI 2026

Beyond Neuron-Level Sparsity: Achieving Faithful and Interpretable LLMs with Mixture of Decoders AAAI 2026

Physics-Informed Autonomous LLM Agents for Explainable Power Electronics Modulation Design AAAI 2026

A Metacognitive Architecture for Correcting LLM Errors in AI Agents AAAI 2026

EduMod-LLM: A Modular Approach for Designing Flexible and Transparent Educational Assistants AAAI 2026

Empowering LLMs with Symbolic Representation and Reasoning AAAI 2026

Always Refuse: Steering LLMs Against Jailbreaks with Contrastive Activations (Student Abstract) AAAI 2026

When Reasoning Collapses: A Depth-Aware Probe into LLM Reasoning (Student Abstract) AAAI 2026

iDT-diet: Toward Personalized Health Forecasting-An Intelligent Digital Twin Model for Diet-Influenced Biomarker Trajectories (Student Abstract) AAAI 2026

Improving CAPTCHA Robustness via Controlled Image Corruptions (Student Abstract) AAAI 2026

IntelliProof: An Argumentation Network-based Conversational Helper for Organized Reflection AAAI 2026

OMEGA: An Ontology-Driven Tool for Explaining Multi-Agent Path Finding AAAI 2026

Forget What You Know about LLMs Evaluations - LLMs are Like a Chameleon EMNLP 2025

Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety EMNLP 2025

Unsupervised Hallucination Detection by Inspecting Reasoning Processes EMNLP 2025

Explain It as Simple as Possible, but No Simpler – Explanation via Model Simplification for Addressing Inferential Gap (Abstract Reprint) IJCAI 2025

A Semantic Framework for Neurosymbolic Computation (Abstract Reprint) IJCAI 2025

“I’ve Decided to Leak”: Probing Internals Behind Prompt Leakage Intents EMNLP 2025

Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents EMNLP 2025

Learning Accurate and Interpretable Decision Trees (Extended Abstract) IJCAI 2025

Explainable Automatic Fact-Checking for Journalists Augmentation in the Wild IJCAI 2025

Understanding and Mitigating Overrefusal in LLMs from an Unveiling Perspective of Safety Decision Boundary EMNLP 2025

Circuit-Aware d-DNNF Compilation IJCAI 2025