// Hacker Noon · 6 March 2026
Prompt Injection Still Beats Production LLMs
Three things we learned running a two-stage SFT+GRPO safety fine-tuning pipeline on Ministral-3B (single H200, 7.5 hours, 8,344 prompts from 19 security datasets): Train only what you’re adding. SFT on malicious examples only. Don’t retrain benign behavior the base model already has. Result: 100% be...
@hacker-noon · Evangelos Pappas

hackernoon.com