06 · 14 min read · 180 XP

AI system architecture

Compound AI systems are distributed systems.

Real AI features are systems: retrieval + models + tools + business logic + guardrails + observability. You'll be judged across teams on architecture, not prompts. Learn the reference shape and the failure/degradation patterns.

Key ideas

1. Think 'compound AI system': orchestration around the model, not the model alone. Most value and most bugs live in the surrounding system.
2. Reference shape: client → gateway → orchestration → retrieval → model(s) → guardrails → observability, with caching and fallbacks throughout.
3. Design for non-determinism and failure: timeouts, retries, fallbacks (cheaper model / cached answer / graceful 'I don't know'), idempotency.
4. Separate concerns so pieces are swappable: model provider, retrieval, prompts and tools should each be replaceable behind interfaces.
5. Bake in data boundaries, PII handling and auditability from the start, which is far cheaper than retrofitting in a regulated org.
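
The failure-handling idea above (timeouts, retries, then fallbacks ending in a graceful 'I don't know') can be sketched as a small fallback chain. This is a minimal illustration, not a production client: `answer_with_fallbacks`, `flaky_primary` and `cheaper_model` are hypothetical names, the handlers are stubs standing in for real model calls, and each handler is assumed to enforce the timeout it is given.

```python
import time
from typing import Callable, Sequence

# A handler takes (prompt, timeout_s) and returns an answer, or raises on failure.
# Assumption: a real handler would pass timeout_s to its HTTP or SDK client.
Handler = Callable[[str, float], str]


def answer_with_fallbacks(
    prompt: str,
    handlers: Sequence[tuple[str, Handler]],
    timeout_s: float = 5.0,
    retries: int = 1,
    backoff_s: float = 0.1,
) -> tuple[str, str]:
    """Try each handler in order; return (source_name, answer)."""
    for name, handler in handlers:
        for attempt in range(retries + 1):
            try:
                return name, handler(prompt, timeout_s)
            except Exception:
                # Retrying is only safe if the call is idempotent
                # (no side effects that would be repeated).
                if attempt < retries:
                    time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
    # Every handler failed: degrade gracefully instead of raising.
    return "default", "I don't know."


# Hypothetical stubs standing in for real model or cache calls.
def flaky_primary(prompt: str, timeout_s: float) -> str:
    raise TimeoutError("primary model timed out")


def cheaper_model(prompt: str, timeout_s: float) -> str:
    return f"(cheaper model) answer to: {prompt}"


source, answer = answer_with_fallbacks(
    "What is our refund policy?",
    [("primary", flaky_primary), ("cheaper", cheaper_model)],
)
print(source, answer)
```

Returning the source name alongside the answer lets observability record how often requests fall through to a degraded path.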

The reference architecture

Gateway: one governed entry point (routing, auth, rate limits, cost caps, logging, PII redaction).
Orchestration: prompt assembly, chaining, tool/agent control, retries & fallbacks.
Retrieval & data: vector/keyword stores, access-controlled and kept fresh as sources change.
Guardrails: input/output validation, injection defense, policy checks.
Observability: tracing, quality/cost/latency metrics, feedback capture.
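
To make the layering concrete, here is a toy end-to-end request path through the components listed above. Everything in it is an assumption made for illustration: the in-memory `DOCS` corpus, the role names, the deliberately simplistic email regex used for PII redaction, and `stub_model`, which simply returns the first retrieved passage instead of calling a real model.

```python
import re
import time
from dataclasses import dataclass, field

# Deliberately simplistic email pattern, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# Hypothetical in-memory corpus; each document carries its own access list.
DOCS = [
    {"text": "Refunds are issued within 14 days.", "roles": {"support", "admin"}},
    {"text": "Payroll runs on the 25th.", "roles": {"admin"}},
]


@dataclass
class Trace:
    """Observability: stage names and latencies for one request."""
    events: list = field(default_factory=list)

    def timed(self, stage, fn, *args):
        start = time.perf_counter()
        try:
            return fn(*args)
        finally:
            self.events.append((stage, time.perf_counter() - start))


def tokens(text):
    return set(re.findall(r"\w+", text.lower()))


def gateway(role, query):
    """Auth check plus PII redaction before anything else sees the query."""
    if role not in {"support", "admin"}:
        raise PermissionError(f"unknown role: {role}")
    return EMAIL.sub("[REDACTED_EMAIL]", query)


def retrieve(role, query):
    """Crude keyword overlap, filtered by the caller's access rights."""
    return [d["text"] for d in DOCS
            if role in d["roles"] and tokens(query) & tokens(d["text"])]


def stub_model(query, context):
    """Stand-in for a model call: answers only from retrieved context."""
    return context[0] if context else ""


def guardrails(answer):
    """Output validation: block empty answers and leaked emails."""
    if not answer or EMAIL.search(answer):
        return "I don't know."
    return answer


def handle(role, query):
    trace = Trace()
    clean = trace.timed("gateway", gateway, role, query)
    context = trace.timed("retrieval", retrieve, role, clean)
    draft = trace.timed("model", stub_model, clean, context)
    answer = trace.timed("guardrails", guardrails, draft)
    return answer, trace


answer, trace = handle("support", "When are refunds issued? Reply to a@b.com")
print(answer)
print([stage for stage, _ in trace.events])
```

Because access control lives in retrieval rather than in the prompt, a `support` caller asking about payroll gets no context and falls through to 'I don't know.'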

Cross-team leverage

Publish ONE reference architecture as the default 'paved road' so teams don't each reinvent it.
Review 2–3 teams' designs against it; document the diffs and patterns.
Make the model provider and retrieval swappable — vendor lock-in is a real risk.
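
One way to keep the model provider and retrieval swappable is to have application code depend only on small interfaces. The sketch below uses `typing.Protocol` for that. The interface names (`ModelProvider`, `Retriever`) and the toy implementations are hypothetical, not any vendor's API.

```python
from typing import Protocol


class ModelProvider(Protocol):
    def complete(self, prompt: str) -> str: ...


class Retriever(Protocol):
    def search(self, query: str, k: int) -> list[str]: ...


# Toy implementations; a real one would wrap a vendor SDK or a search index.
class EchoProvider:
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UppercaseProvider:
    def complete(self, prompt: str) -> str:
        return prompt.upper()


class StaticRetriever:
    def __init__(self, docs: list[str]) -> None:
        self.docs = docs

    def search(self, query: str, k: int) -> list[str]:
        return self.docs[:k]


class Assistant:
    """Depends only on the interfaces, never on a concrete vendor class."""

    def __init__(self, provider: ModelProvider, retriever: Retriever) -> None:
        self.provider = provider
        self.retriever = retriever

    def ask(self, question: str) -> str:
        context = "\n".join(self.retriever.search(question, k=2))
        return self.provider.complete(f"Context:\n{context}\n\nQuestion: {question}")


retriever = StaticRetriever(["a", "b", "c"])
print(Assistant(EchoProvider(), retriever).ask("q"))
# Swapping the provider is a one-line change at the composition point.
print(Assistant(UppercaseProvider(), retriever).ask("q"))
```

Keeping the interfaces this narrow is a design choice: the less surface a team depends on, the cheaper it is to change vendors later.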

Watch

Building LLM applications for production · Chip Huyen

The LLM sandwich: the data layer around LLMs · Chip Huyen

Test yourself

Why call modern AI features 'compound AI systems'?
