Nicholas Carlini
41 papers · 2018–2025 · 5 top CS/AI conferences
Achievements
Conference Polyglot (5) · Academic Marathon (7) · Keyword Pioneer · Interdisciplinary Bridge · Cross-Pollinator (10)
Keyword Pioneer
Renaissance Researcher (7)
Conference Polyglot (5)
Dynamic Duo (22)
Triple Crown
Mega-Team (34)
Deep Specialist (13)
Keyword Champion (4)
Keyword Collector (88)
Prolific Year (9)
The Questioner
Unstoppable (8)
Century Club (41)
Conferences
ICLR (14)
NIPS (14)
ICML (11)
ACL (1)
CVPR (1)
Research topics
Keywords
adversarial example (10)
adversarial robustness (5)
membership inference (4)
language model (3)
model robustness (3)
adversarial perturbation (3)
differential privacy (3)
adversarial attack (3)
privacy attack (3)
semi-supervised learning (2)
image classification (2)
model poisoning (2)
training data (2)
robustness evaluation (2)
domain generalization (2)
adversarial training (2)
backdoor attack (2)
data augmentation (2)
distribution shift (2)
query-based attack (2)
Papers
Position: In-House Evaluation Is Not Enough. Towards Robust Third-Party Evaluation and Flaw Disclosure for General-Purpose AI
ICML 2025
Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards
ICML 2025
AutoAdvExBench: Benchmarking Autonomous Exploitation of Adversarial Example Defenses
ICML 2025
Scalable Extraction of Training Data from Aligned, Production Language Models
ICLR 2025
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
ICLR 2025
Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
ICLR 2025
On Evaluating the Durability of Safeguards for Open-Weight LLMs
ICLR 2025
Persistent Pre-training Poisoning of LLMs
ICLR 2025
Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models
NIPS 2024
Initialization Matters for Adversarial Transfer Learning
CVPR 2024
Stealing part of a production language model
ICML 2024
Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining
ICML 2024
Query-Based Adversarial Prompt Generation
NIPS 2024
Preprocessors Matter! Realistic Decision-Based Attacks on Machine Learning Systems
ICML 2023
Counterfactual Memorization in Neural Language Models
NIPS 2023
Students Parrot Their Teachers: Membership Inference on Model Distillation
NIPS 2023
Are aligned neural networks adversarially aligned?
NIPS 2023
Effective Robustness against Natural Distribution Shifts for Models with Different Training Data
NIPS 2023
(Certified!!) Adversarial Robustness for Free!
ICLR 2023
Measuring Forgetting of Memorized Training Examples
ICLR 2023
Part-Based Models Improve Adversarial Robustness
ICLR 2023
Quantifying Memorization Across Neural Language Models
ICLR 2023
Poisoning and Backdooring Contrastive Learning
ICLR 2022
AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation
ICLR 2022
Data Poisoning Won't Save You From Facial Recognition
ICLR 2022
Evading Adversarial Example Detection Defenses with Orthogonal Projected Gradient Descent
ICLR 2022
Increasing Confidence in Adversarial Robustness Evaluations
NIPS 2022
Deduplicating Training Data Makes Language Models Better
ACL 2022
Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples
NIPS 2022
The Privacy Onion Effect: Memorization is Relative
NIPS 2022
Handcrafted Backdoors in Deep Neural Networks
NIPS 2022
Label-Only Membership Inference Attacks
ICML 2021
Measuring Robustness to Natural Distribution Shifts in Image Classification
NIPS 2020
On Adaptive Attacks to Adversarial Example Defenses
NIPS 2020
FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
NIPS 2020
ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring
ICLR 2020
Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations
ICML 2020
MixMatch: A Holistic Approach to Semi-Supervised Learning
NIPS 2019
Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition
ICML 2019
Adversarial Examples Are a Natural Consequence of Test Error in Noise
ICML 2019
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
ICML 2018