Tag
#adversarial-ml
7 posts tagged #adversarial-ml.
- attacks
Data Poisoning and Backdoor Attacks on Foundation Models
Training data manipulation, backdoor triggers, and Trojan attacks against large-scale models. What the threat model actually requires and where the defenses are in 2026.
- attacks
Evasion Attacks on Image Classifiers: FGSM, PGD, and C&W
The three foundational gradient-based evasion attacks, what each one actually optimizes, and what the benchmark numbers mean when you're evaluating a defense.
- attacks
Adversarial Robustness in NLP: Why Text Attacks Are Different
Discrete input spaces, semantic constraints, and human-perceptibility rules change what counts as an adversarial example in text. The attacks are harder to define and harder to defend.
- attacks
Adversarial Transferability: Why Black-Box Attacks Work at All
Adversarial examples transfer across models with different architectures and training sets. Understanding why changes what you think defenses need to accomplish.
- attacks
Model Inversion Attacks: Reconstructing Training Data from Model Outputs
From Fredrikson's pharmacogenetics exploit to Geiping's gradient inversion, model inversion attacks recover private training data in ways most ML engineers don't expect.
- red-team
GCG-Class Adversarial Suffix Attacks: A 2026 Practitioner Primer
The math, the cost curve, and why optimization-based attacks are now within reach of solo practitioners. With reproducible setup and what defenders actually need to do.
- attacks
Model Extraction via Query-Based Functional Stealing
Query-based model stealing attacks can recover a functionally equivalent model from API access alone. The economics matter more than the technique: here's when extraction is worth doing.