0813 min read · 180 XP

AI security & red-teaming

Non-negotiable at an insurer. Guardrails first, evangelism second.

AI adds new attack surfaces on top of normal appsec. One PII-in-a-prompt incident can set a whole program back. Know the OWASP LLM risks, defend against prompt injection, and red-team your own systems before someone else does.

Key ideas

1
Learn the OWASP Top 10 for LLM Applications — prompt injection, sensitive-data disclosure, insecure output handling, supply-chain, excessive agency, and more.
2
Prompt injection (direct & indirect): untrusted text — including retrieved docs and tool outputs — can hijack the model. Never trust model output as a command.
3
Protect data: keep PII/secrets out of prompts, logs and training; enforce least-privilege on tools and retrieval; redact at the gateway.
4
Constrain 'agency': the more actions an agent can take, the bigger the blast radius — scope tools, require approval for irreversible actions.
5
Red-team proactively: try to break your own system, document findings, and turn them into standards. Partner with security as an ally, early.

Top risks to internalize

Indirect prompt injection via RAG content or tool results.
Sensitive data leakage through prompts, logs, or over-broad retrieval.
Insecure output handling (model output used in SQL, shell, HTML → injection).
Excessive agency / over-permissioned tools.
Supply chain: untrusted models, datasets and plugins.

Run a red-team exercise

Define targets (data exfiltration, unauthorized actions, jailbreak bypass of policy).
Attack: injection payloads in inputs/docs, role-play jailbreaks, tool abuse.
Record what worked; add guardrails + eval checks; re-test.
Publish a remediation checklist as a reusable standard.

Watch

Explained: the OWASP Top 10 for LLM Applications

▶Prompt injection: attacks & defensesFind it on YouTube →

Do the work

0/5 · 0%

Test yourself

Question 1 / 3

What is indirect prompt injection?

27 chapters · progress saves automatically