Skip to content

~/courses/ai-agent-security-redteam

Security Complete

AI / Agent Security — Red Team & Hardening

Learn to break and bulletproof AI agents through ethical, authorized red-teaming — then build the guardrails and evals that keep them defensible.

// The loop

learn → point at a safe sandbox agent → attack it → measure attack-success-rate → harden with a guardrail → re-test → report → save as a reusable eval

// The 6-phase roadmap

  1. 01 LLM threat model & OWASP LLM Top 10
  2. 02 Prompt injection & jailbreak red team
  3. 03 Tool, MCP & agentic abuse
  4. 04 RAG & data-layer security
  5. 05 Guardrail engineering
  6. 06 Eval & red-team harness

The course for the engineer who can both break and bulletproof AI agents. You build a safe, local sandbox agent, attack it across the LLM threat surface — prompt injection, jailbreaks, tool/MCP abuse, RAG data leakage — measure how often the attacks land, then engineer guardrails and re-test until the numbers move.

The goal is hardening the things you build, not weaponizing attacks. Red-team only your own sandbox, systems you own, or explicitly authorized targets under written scope. Payloads stay in the lab; disclosure is responsible. The deliverable is an automated red-team eval harness you can re-run on every change.


More in Security

Track overview