conftrace_

Alejandro Maté

3 papers · 2024–2025 · 3 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (3) 🐝 Cross-Pollinator (5) ❓ The Questioner

Conferences

AAAI (1) · AISTATS (1) · IJCAI (1)

Top co-authors

Jorge García-Carrasco (3) · Juan Trujillo (2) · Juan Carlos Trujillo (1)

Keywords

mechanistic interpretability (3) · large language model (2) · adversarial attack (1) · circuit analysis (1) · attention head (1) · vulnerability detection (1) · transformer language model (1) · multiple-token prediction (1) · causal mask mechanism (1) · multi-token prediction (1) · positional information (1) · task-specific model (1) · model compression (1) · circuit extraction (1) · knowledge distillation (1)

Papers

Extracting Interpretable Task-Specific Circuits from Large Language Models for Faster Inference · AAAI 2025
How does GPT-2 Predict Acronyms? Extracting and Understanding a Circuit via Mechanistic Interpretability · AISTATS 2024
Detecting and Understanding Vulnerabilities in Language Models via Mechanistic Interpretability · IJCAI 2024