What this site is for
Adversarial ML covers attacks against deployed ML systems and the defenses that hold up. We aim for the rigor a working ML engineer or red teamer would expect — and the same honesty about which results actually transfer from a paper’s threat model to deployed systems.
What we publish:
Working attacks against deployed models. Membership inference, model extraction, evasion (gradient-based and query-based), training-data extraction, backdoors, and the multimodal variants. Where the model is open, reproducible PoCs. Where it’s closed, behavioral analysis with primary-source citations.
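To give a flavor of the gradient-based evasion attacks above, here is a minimal FGSM-style sketch against a toy logistic-regression model. Everything here — the weights, the input, the epsilon — is illustrative, not drawn from any covered system:

```python
import math

# Toy logistic-regression "model": sigmoid(w.x + b). Hypothetical weights.
W = [2.0, -3.0, 1.5]
B = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x):
    """Probability of the positive class for input x."""
    return sigmoid(sum(w * xi for w, xi in zip(W, x)) + B)

def fgsm(x, y, eps):
    """Fast Gradient Sign Method for binary cross-entropy loss.

    For logistic regression the input gradient is analytic:
    dL/dx_i = (p - y) * w_i, so the attack shifts each feature
    by eps in the direction that increases the loss.
    """
    p = predict(x)
    grad = [(p - y) * w for w in W]
    return [xi + eps * (1 if g > 0 else -1 if g < 0 else 0)
            for xi, g in zip(x, grad)]

x = [0.4, -0.2, 0.1]            # clean input, true label y = 1
x_adv = fgsm(x, y=1, eps=0.5)   # eps chosen large enough to flip the label
```

The same one-step idea underlies the white-box evasion attacks in the libraries named below; query-based variants estimate the gradient sign from model outputs instead of computing it.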
Defenses, evaluated under their stated threat model. Adversarial training, certified robustness, input transforms, detection-based defenses — and the conditions under which each one breaks. Most “defenses” published in the literature don’t survive an adaptive attacker; we say which ones do.
The gap between the lab and production. Most adversarial-ML papers assume access models — white-box, query budget, no rate limiting — that look nothing like a real production endpoint. We cover the realistic threat model: the API in front of the model, the queries-per-second cap, the abuse signals, the cost-of-attack tradeoff for the adversary. That’s where adversarial ML becomes a security problem rather than a robustness paper.
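The cost-of-attack tradeoff is back-of-the-envelope arithmetic, and worth making concrete. A minimal sketch — every number below (query count, price, rate cap) is hypothetical:

```python
def attack_cost(queries_needed, price_per_1k_queries, qps_cap):
    """Adversary cost of a query-based attack against a rate-limited API.

    queries_needed: total model queries the attack requires (hypothetical)
    price_per_1k_queries: endpoint price in dollars per 1000 queries
    qps_cap: rate limit in queries per second
    Returns (dollar_cost, wall_clock_hours).
    """
    dollars = queries_needed / 1000 * price_per_1k_queries
    hours = queries_needed / qps_cap / 3600
    return dollars, hours

# A hypothetical query-based attack needing 100k queries, against an
# endpoint priced at $1 per 1k queries and capped at 10 QPS:
dollars, hours = attack_cost(100_000, 1.0, 10)
```

Under those made-up numbers the attack costs $100 and roughly 2.8 hours of sustained querying — cheap in dollars, but long enough for abuse signals to fire, which is exactly the production-side question a robustness paper never asks.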
Tooling. Honest takes on adversarial-ML tooling — the libraries (CleverHans, ART, Foolbox, TextAttack), the evaluation suites, what each one is good for and where it falls short.
What we don’t publish:
- Press release rewrites
- “Top 10” listicles
- Vendor-funded “research” with undisclosed conflicts of interest
- Anything we can’t source to primary material
Bylines are pseudonymous. The work is the point. Tips, attack reports, and corrections go to the editor.
Real coverage starts shortly.
Adversarial ML — in your inbox
Working adversarial ML — exploits, defenses, and the gap between — delivered when there's something worth your inbox.
No spam. Unsubscribe anytime.