Tag
#llm-security
2 posts tagged llm-security.
- attacks
Adversarial Attacks on Vision-Language Models: CLIP, LLaVA, GPT-4
Vision-language models expand the adversarial attack surface beyond image classifiers: adversarial images can manipulate text outputs, carry visual...
- attacks
Training Data Extraction from LLMs: The Carlini Results Explained
Carlini et al. demonstrated verbatim extraction of training data from GPT-2. The results have been widely misread.