AI system architecture
Compound AI systems are distributed systems.
Real AI features are systems: retrieval + models + tools + business logic + guardrails + observability. You'll be judged across teams on architecture, not prompts. Learn the reference shape and the failure/degradation patterns.
Key ideas
1. Think 'compound AI system': orchestration around the model, not the model alone. Most value and most bugs live in the surrounding system.
2. Reference shape: client → gateway → orchestration → retrieval → model(s) → guardrails → observability, with caching and fallbacks throughout.
3. Design for non-determinism and failure: timeouts, retries, fallbacks (cheaper model / cached answer / graceful 'I don't know'), idempotency.
4. Separate concerns so pieces are swappable: model provider, retrieval, prompts and tools should each be replaceable behind interfaces.
5. Bake in data boundaries, PII handling and auditability from the start; that is far cheaper than retrofitting in a regulated org.
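The fallback idea (primary model, then a cheaper model, then a cached answer, then a graceful 'I don't know') can be sketched as below. This is a minimal illustration, not a production pattern: `ModelError`, `call_with_retries` and `answer` are hypothetical names, and each model is assumed to be a plain callable that enforces its own timeout and raises `ModelError` on failure. Idempotency of side-effecting tool calls is not covered here.

```python
import time
from typing import Callable, Dict, Sequence

ModelCall = Callable[[str], str]


class ModelError(Exception):
    """Hypothetical error raised when a model call fails or times out."""


def call_with_retries(call: ModelCall, prompt: str,
                      attempts: int = 2, backoff_s: float = 0.0) -> str:
    """Try one model a few times with exponential backoff between attempts."""
    last_error = None
    for attempt in range(attempts):
        try:
            return call(prompt)
        except ModelError as exc:
            last_error = exc
            time.sleep(backoff_s * (2 ** attempt))
    raise ModelError(f"all {attempts} attempts failed") from last_error


def answer(prompt: str, models: Sequence[ModelCall],
           cache: Dict[str, str], attempts: int = 2) -> str:
    """Walk the fallback chain: each model in order, then cache, then give up."""
    for call in models:
        try:
            return call_with_retries(call, prompt, attempts)
        except ModelError:
            continue  # degrade to the next (typically cheaper) model
    if prompt in cache:
        return cache[prompt]  # a stale answer beats no answer
    return "I don't know."  # graceful final fallback
```

Ordering `models` from most capable to cheapest keeps the degradation path explicit and easy to review.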
The reference architecture
- Gateway: one governed entry point (routing, auth, rate limits, cost caps, logging, PII redaction).
- Orchestration: prompt assembly, chaining, tool/agent control, retries & fallbacks.
- Retrieval & data: vector/keyword stores that enforce access control and keep indexed content fresh.
- Guardrails: input/output validation, injection defense, policy checks.
- Observability: tracing, quality/cost/latency metrics, feedback capture.
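To make the flow through these components concrete, here is a toy sketch of one request passing through gateway, retrieval, model, guardrails and observability. Everything in it is an assumption made for illustration: the function names, the email-only PII regex, the word-overlap retrieval and the banned-term output check stand in for real components, and `model` is any callable that maps a prompt to text.

```python
import re
import time
from dataclasses import dataclass, field

# Illustrative only: real PII redaction covers far more than email addresses.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class Trace:
    """Observability stand-in: collects one event per stage."""
    events: list = field(default_factory=list)

    def record(self, stage, **data):
        self.events.append({"stage": stage, **data})


def gateway(user, text, allowed_users, trace):
    """Single entry point: authorize the caller and redact PII."""
    if user not in allowed_users:
        raise PermissionError(f"user {user!r} is not authorized")
    trace.record("gateway", user=user)
    return EMAIL.sub("[EMAIL]", text)


def retrieve(query, documents, trace, k=2):
    """Rank documents by naive word overlap with the query."""
    words = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(words & set(d.lower().split())),
                    reverse=True)
    trace.record("retrieval", hits=len(ranked[:k]))
    return ranked[:k]


def guard_output(text, banned, trace):
    """Output guardrail: block replies containing banned terms."""
    ok = not any(term in text.lower() for term in banned)
    trace.record("guardrails", passed=ok)
    return text if ok else "Sorry, I can't help with that."


def handle(user, text, model, documents, allowed_users, banned=()):
    """Orchestration: assemble the prompt and run the stages in order."""
    trace = Trace()
    start = time.perf_counter()
    query = gateway(user, text, allowed_users, trace)
    context = retrieve(query, documents, trace)
    prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query
    raw = model(prompt)
    trace.record("model", prompt_chars=len(prompt))
    reply = guard_output(raw, banned, trace)
    trace.record("done", latency_s=time.perf_counter() - start)
    return reply, trace
```

Because every stage writes to the same trace, a single request can be inspected end to end, which is the property the observability layer is meant to provide.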
Cross-team leverage
- Publish ONE reference architecture as the default 'paved road' so teams don't each reinvent it.
- Review 2–3 teams' designs against it; document the diffs and patterns.
- Make the model provider and retrieval swappable — vendor lock-in is a real risk.
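One way to keep the model provider and retrieval swappable is to have application code depend on small interfaces rather than a vendor SDK. The sketch below uses Python's `typing.Protocol`; the interface names (`ModelProvider`, `Retriever`) and the toy implementations are hypothetical and not tied to any real provider's API.

```python
from typing import List, Protocol


class ModelProvider(Protocol):
    """Anything that can turn a prompt into text."""
    def complete(self, prompt: str) -> str: ...


class Retriever(Protocol):
    """Anything that can return up to k relevant passages."""
    def search(self, query: str, k: int) -> List[str]: ...


class EchoModel:
    """Toy provider; a real one would wrap a vendor client."""
    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"


class ShoutModel:
    """A second toy provider, used to show the swap."""
    def complete(self, prompt: str) -> str:
        return prompt.upper()


class KeywordRetriever:
    """Toy retriever: substring match over an in-memory list."""
    def __init__(self, documents: List[str]):
        self.documents = documents

    def search(self, query: str, k: int) -> List[str]:
        words = query.lower().split()
        hits = [d for d in self.documents
                if any(w in d.lower() for w in words)]
        return hits[:k]


class Assistant:
    """Depends only on the two interfaces, never on a concrete vendor."""
    def __init__(self, model: ModelProvider, retriever: Retriever):
        self.model = model
        self.retriever = retriever

    def ask(self, question: str) -> str:
        context = " | ".join(self.retriever.search(question, k=2))
        return self.model.complete(f"{context} :: {question}")
```

Swapping providers then becomes a one-line change at construction time, for example `Assistant(ShoutModel(), retriever)` instead of `Assistant(EchoModel(), retriever)`, with no edits to `Assistant` itself.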
Test yourself
Why call modern AI features 'compound AI systems'?