Interpretability

7,318 papers

Papers

Concept-based Explanations for Out-of-Distribution Detectors ICML 2023

A Toy Model of Universality: Reverse Engineering how Networks Learn Group Operations ICML 2023

K-SHAP: Policy Clustering Algorithm for Anonymous Multi-Agent State-Action Pairs ICML 2023

Meta-Learning the Inductive Bias of Simple Neural Circuits ICML 2023

Learning Perturbations to Explain Time Series Predictions ICML 2023

Parallel Neurosymbolic Integration with Concordia ICML 2023

Explainable Data-Driven Optimization: From Context to Decision and Back Again ICML 2023

Towards Reliable Neural Specifications ICML 2023

Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat ICML 2023

Looped Transformers as Programmable Computers ICML 2023

Kernel Logistic Regression Approximation of an Understandable ReLU Neural Network ICML 2023

Conformal Prediction Sets for Graph Neural Networks ICML 2023

Robust Counterfactual Explanations for Neural Networks With Probabilistic Guarantees ICML 2023

On the Impact of Knowledge Distillation for Model Interpretability ICML 2023

Generalized Teacher Forcing for Learning Chaotic Dynamics ICML 2023

Decoding Layer Saliency in Language Transformers ICML 2023

Detecting Out-of-distribution Data through In-distribution Class Prior ICML 2023

R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents ICML 2023

Automatically Auditing Large Language Models via Discrete Optimization ICML 2023

Identifying Interpretable Subspaces in Image Representations ICML 2023

On the Relationship Between Explanation and Prediction: A Causal View ICML 2023

PAC Prediction Sets for Large Language Models of Code ICML 2023

Trainability, Expressivity and Interpretability in Gated Neural ODEs ICML 2023

Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten ICML 2023

Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value ICML 2023