conftrace_

Javier Rando

9 papers · 2024–2026 · 5 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (4) 🐣 Hot Topic Early Bird 🐝 Cross-Pollinator (15) 🧭 Keyword Pioneer

👥 Mega-Team (21) ⚡ Prolific Year (5)

Conferences

ICLR (5) ACL (1) EMNLP (1) ICML (1) NIPS (1)

Top co-authors

Florian Tramer (8) Nicholas Carlini (5) Edoardo Debenedetti (3) Daphne Ippolito (3) Michael Aerni (2) Milad Nasr (2) Christopher A. Choquette-Choo (1) Clément Charmillot (1) Robin Schmid (1) Kaustubh Ponkshe (1)

Keywords

large language model (2) hierarchical structure (1) adversarial attack (1) language model (1) security evaluation (1) prompt injection (1) defense mechanism (1) multilingual language model (1) security vulnerability (1) probing classifier (1) representation probing (1) pretraining data (1) truthfulness modeling (1) persona hypothesis (1) probing accuracy (1) model defense (1) responsible artificial intelligence (1) data compliance (1) representation learning (1) goldfish objective (1)

Papers

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments · ACL 2026
Scalable Extraction of Training Data from Aligned, Production Language Models · ICLR 2025
Measuring Non-Adversarial Reproduction of Training Data in Large Language Models · ICLR 2025
Persistent Pre-training Poisoning of LLMs · ICLR 2025
AutoAdvExBench: Benchmarking Autonomous Exploitation of Adversarial Example Defenses · ICML 2025
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI · ICLR 2025
Universal Jailbreak Backdoors from Poisoned Human Feedback · ICLR 2024
Personas as a Way to Model Truthfulness in Language Models · EMNLP 2024
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition · NIPS 2024