Javier Rando
9 papers · 2024–2026 · 5 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge
Conference Polyglot (4)
Hot Topic Early Bird
Cross-Pollinator (15)
Keyword Pioneer
Mega-Team (21)
Prolific Year (5)
Conferences
ICLR (5)
ACL (1)
EMNLP (1)
ICML (1)
NeurIPS (1)
Keywords
large language model (2)
hierarchical structure (1)
adversarial attack (1)
language model (1)
security evaluation (1)
prompt injection (1)
defense mechanism (1)
multilingual language model (1)
security vulnerability (1)
probing classifier (1)
representation probing (1)
pretraining data (1)
truthfulness modeling (1)
persona hypothesis (1)
probing accuracy (1)
model defense (1)
responsible artificial intelligence (1)
data compliance (1)
representation learning (1)
goldfish objective (1)
Papers
Apertus: Democratizing Open and Compliant LLMs for Global Language Environments
ACL 2026
Scalable Extraction of Training Data from Aligned, Production Language Models
ICLR 2025
Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
ICLR 2025
Persistent Pre-training Poisoning of LLMs
ICLR 2025
AutoAdvExBench: Benchmarking Autonomous Exploitation of Adversarial Example Defenses
ICML 2025
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
ICLR 2025
Universal Jailbreak Backdoors from Poisoned Human Feedback
ICLR 2024
Personas as a Way to Model Truthfulness in Language Models
EMNLP 2024
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition
NeurIPS 2024