Engineering notes on AI, ML, custom apps, and the messy bits of shipping production software.
A practical, ordered playbook for lowering production LLM costs. Prompt hygiene, caching, model routing, retrieval, budgets, and the measurement that ties it all together.
Hype-free engineering principles for AI products that serve users rather than nudge them. Grounding, refusal, evals, cost-bounding: the boring decisions that actually ship.
A working model is not a working product. The full production ML stack: data pipelines, deployment patterns, monitoring, and drift detection.