Interpretability

7,318 papers

Papers

Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner NIPS 2024

LeDex: Training LLMs to Better Self-Debug and Explain Code NIPS 2024

Interpreting Learned Feedback Patterns in Large Language Models NIPS 2024

Almost-Linear RNNs Yield Highly Interpretable Symbolic Codes in Dynamical Systems Reconstruction NIPS 2024

Learning Discrete Concepts in Latent Hierarchical Models NIPS 2024

Diff-eRank: A Novel Rank-Based Metric for Evaluating Large Language Models NIPS 2024

LG-CAV: Train Any Concept Activation Vector with Language Guidance NIPS 2024

Interpreting and Analysing CLIP's Zero-Shot Image Classification via Mutual Knowledge NIPS 2024

GAIA: Rethinking Action Quality Assessment for AI-Generated Videos NIPS 2024

Interpretable Image Classification with Adaptive Prototype-based Vision Transformers NIPS 2024

LACIE: Listener-Aware Finetuning for Calibration in Large Language Models NIPS 2024

Beyond Accuracy: Ensuring Correct Predictions With Correct Rationales NIPS 2024

Multi-Object Hallucination in Vision Language Models NIPS 2024

Data-faithful Feature Attribution: Mitigating Unobservable Confounders via Instrumental Variables NIPS 2024

Biologically Inspired Learning Model for Instructed Vision NIPS 2024

Flaws can be Applause: Unleashing Potential of Segmenting Ambiguous Objects in SAM NIPS 2024

Questioning the Survey Responses of Large Language Models NIPS 2024

Learning to Understand: Identifying Interactions via the Möbius Transform NIPS 2024

MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning NIPS 2024

Task Confusion and Catastrophic Forgetting in Class-Incremental Learning: A Mathematical Framework for Discriminative and Generative Modelings NIPS 2024

Measuring Per-Unit Interpretability at Scale Without Humans NIPS 2024

Decoding-Time Language Model Alignment with Multiple Objectives NIPS 2024

SETLEXSEM CHALLENGE: Using Set Operations to Evaluate the Lexical and Semantic Robustness of Language Models NIPS 2024

Towards the Dynamics of a DNN Learning Symbolic Interactions NIPS 2024

Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks NIPS 2024