What's the pricing shape — is this a retainer, and what am I locked into?

It starts with a fixed-scope production audit so neither side commits blind. The standing engagement is monthly, sized to the system's real surface area — features, traffic, SLA tier — not a headcount you rent. It's month-to-month after stabilisation; no annual lock-in to start.

How fast do things happen, and what's your SLA?

The audit takes days. Instrumentation and the first baseline land within the first weeks, so you see real numbers early. SLAs are agreed up front and tiered to what the system needs: response and resolution targets, plus uptime targets where we control enough of the stack to commit to them.

We won't claim uptime for parts of the stack outside our control, and we'll say so in the SLA. We won't run this as a ticket queue with juniors. And if the audit finds the architecture can't scale, we'll tell you it needs a rebuild instead of billing maintenance forever.

How is this different from your other services?

The other services get a system into production. Support & Scale is the only one whose entire job is after launch — keeping a live system reliable, cheap and current. Two tells: it's the one service we'll take on a system we didn't build, and it's measured in held SLAs and declining cost, not features shipped.

01 / Services / Support & Scale

Keep your production AI system boring.

You shipped it. Now it has to keep working at 10x the traffic, on next quarter's model, without a 2am page. That's a different job from building it — and the one most teams underfund.

02/What you get

Everything you need. To stay boring.

No vanity dashboards. No assumed quality. Real numbers, gated changes and a reliability floor — so the system holds at 10x and the bill stops climbing.

A telemetry dashboard you own

Live cost, latency (p50/p95/p99), error and fallback rates per AI feature. You stop guessing about your own system.
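As a rough illustration of the numbers behind that panel, here is a minimal sketch of per-feature latency percentiles using the nearest-rank method. The function names and the `(feature, latency_ms)` sample shape are assumptions made for this example, not a description of any particular telemetry stack.

```python
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile of a sorted, non-empty list (pct is an integer 1-100)."""
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(sorted_values) + 99) // 100
    return sorted_values[max(rank, 1) - 1]


def latency_summary(samples):
    """Group (feature, latency_ms) pairs by feature and report p50/p95/p99 for each."""
    by_feature = defaultdict(list)
    for feature, latency_ms in samples:
        by_feature[feature].append(latency_ms)

    summary = {}
    for feature, values in by_feature.items():
        values.sort()
        summary[feature] = {f"p{p}": percentile(values, p) for p in (50, 95, 99)}
    return summary
```

In practice a metrics backend usually computes these from histograms rather than raw samples; the point is only that each AI feature gets its own p50, p95 and p99 instead of one blended average.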

An AI bill that stops climbing

Cost-per-request and cost-per-outcome tracked and driven down, or held flat as volume grows, with every optimisation documented.
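The two unit costs are simple arithmetic once usage is metered. A minimal sketch, assuming token-based pricing and a hypothetical `UsageWindow` record; the field names and any prices passed in are placeholders, not real provider rates.

```python
from dataclasses import dataclass


@dataclass
class UsageWindow:
    """Usage for one AI feature over one reporting window (hypothetical shape)."""
    requests: int
    successful_outcomes: int  # e.g. tickets resolved, documents accepted
    input_tokens: int
    output_tokens: int


def window_cost(w, usd_per_1k_input, usd_per_1k_output):
    """Total model spend for the window under simple per-token pricing."""
    return (w.input_tokens / 1000 * usd_per_1k_input
            + w.output_tokens / 1000 * usd_per_1k_output)


def unit_costs(w, usd_per_1k_input, usd_per_1k_output):
    """Cost per request and per successful outcome; None when the divisor is zero."""
    total = window_cost(w, usd_per_1k_input, usd_per_1k_output)
    return {
        "cost_per_request": total / w.requests if w.requests else None,
        "cost_per_outcome": total / w.successful_outcomes if w.successful_outcomes else None,
    }
```

Cost-per-outcome is usually the more honest figure: a cheaper request that succeeds less often can still raise the cost of each result the business actually cares about.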

Quality measured, not assumed

An eval suite that gates every prompt and model change, plus sampled live scoring, so drift is a caught number instead of a customer email.
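To make "gates every change" concrete, here is a minimal sketch of a gate that compares a candidate's eval scores against the stored baseline. The `eval_gate` name, the per-case score dictionaries and the default tolerance are all assumptions for illustration; a real suite would choose its own metrics and thresholds.

```python
from statistics import mean


def eval_gate(baseline, candidate, max_mean_drop=0.01):
    """Decide whether a prompt or model change may ship.

    baseline and candidate map eval case ids to scores in [0, 1].
    Returns (passed, reason).
    """
    if not baseline:
        return False, "no baseline recorded"

    # Every baseline case must be scored; otherwise the comparison is not valid.
    missing = sorted(set(baseline) - set(candidate))
    if missing:
        return False, f"candidate missing cases: {missing}"

    drop = mean(baseline.values()) - mean(candidate[c] for c in baseline)
    if drop > max_mean_drop:
        return False, f"mean score dropped by {drop:.3f}"
    return True, "ok"
```

Wired into CI, a failed gate blocks the merge, so a regression shows up as a red build rather than a customer complaint.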

Model upgrades that don't break you

When a provider deprecates a model or releases a new one, we re-run the evals and ship the swap behind a flag, so there is no quality regression and no scramble.

A reliability floor

Fallbacks, retries, graceful degradation, an on-call runbook, and a postmortem after every incident. Fewer pages, faster recovery.
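A minimal sketch of how retries, a fallback and graceful degradation can fit together in a single call path. The function name, its parameters and the canned degraded reply are hypothetical, and catching every `Exception` is a simplification; production code would retry only on errors known to be transient.

```python
import time


def call_with_fallback(primary, fallback, prompt, retries=2, base_delay_s=0.5,
                       degraded_reply="This feature is temporarily unavailable.",
                       sleep=time.sleep):
    """Try the primary model, then the fallback, then degrade gracefully.

    primary and fallback are callables that take a prompt and return text.
    Returns (text, source), where source is "primary", "fallback" or "degraded".
    """
    for source, call in (("primary", primary), ("fallback", fallback)):
        for attempt in range(retries + 1):
            try:
                return call(prompt), source
            except Exception:
                # Exponential backoff between attempts on the same provider.
                if attempt < retries:
                    sleep(base_delay_s * 2 ** attempt)
    # Both providers exhausted: return a safe canned reply instead of raising.
    return degraded_reply, "degraded"
```

Returning the source alongside the text is what lets the fallback rate be tracked: every response that is not `"primary"` can be counted.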

Headroom proven before you need it

Load-tested capacity numbers and an infra and quota plan sized to where you're going, so the next 10x is a config change, not a rewrite.

p50/p95/p99

Latency tracked per AI feature

100%

Prompt and model changes eval-gated

Senior

Not juniors. Not a ticket queue. Principal engineers only.

Days

Audit turnaround; baseline in the first weeks

Monthly

Review of telemetry, what shipped and the next backlog

03/How we work

How we work. Audited, then operated.

Five steps from a cold production audit to a system that runs on a cadence — with real numbers before we touch anything.

01 · AUDIT

Production audit

A senior engineer reviews the live system: what's instrumented, cost-per-request, latency profile, eval coverage, failure modes, deprecation exposure. You get a written findings doc with risks ranked. We'll audit a system we didn't build.

02 · BASELINE

Instrument and baseline

We stand up the observability and eval harness so there are real numbers before we change anything. You can't claim a 40% cost cut without the baseline. This lands within the first weeks.

03 · STABILISE

Stabilise

We close the ranked risks from the audit — fallbacks, retries, the eval gates, the runbook, the obvious cost wins. The goal is to stop the bleeding and make incidents rare and recoverable.

04 · OPERATE

Operate on a cadence

Monitoring is live, evals gate every change, model upgrades are handled as they come, and a monthly review shows the telemetry, what shipped, and the next backlog. The SLA is in force.

05 · OPTIMISE

Optimise and stay scale-ready

We keep driving cost-per-outcome down, hold latency as volume grows, and load-test ahead of known growth events. If you want to take operations in-house, the handover is part of the deal.

04/Who it's for