The constraint
We're a small studio. We can't afford a dedicated SRE team, but we also can't afford a production incident on every push.
So we've built a playbook that lets two engineers safely ship five times a day across a dozen Laravel and Next.js projects. Here's the shape of it.
Branches are cheap, environments are not
Every PR gets a preview URL. Vercel for Next.js, a containerised review app for Laravel. Reviewers don't read diffs — they click around.
Migrations are reversible until they're not
Every reversible migration ships with a tested down(). When a migration can't be rolled back (a column drop, a data transform), it gets a special review tag and runs separately from the code deploy.
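To make the split concrete, here is a minimal sketch of how a deploy script could sort migrations into "runs with the code" and "runs separately". Everything in it is an assumption for illustration: the `Migration` shape, the operation names, and `planMigrations` are hypothetical and not part of Laravel or of our actual tooling.

```typescript
// Hypothetical operation labels; a real tool would derive these from the migration file.
type MigrationOp =
  | "create_table"
  | "add_column"
  | "add_index"
  | "drop_column"
  | "data_transform";

interface Migration {
  name: string;
  ops: MigrationOp[];
  hasTestedDown: boolean; // assumption: CI records whether down() was exercised
}

// Assumed classification: column drops and data transforms cannot be rolled back.
const REVERSIBLE: Record<MigrationOp, boolean> = {
  create_table: true,
  add_column: true,
  add_index: true,
  drop_column: false,
  data_transform: false,
};

interface MigrationPlan {
  withCodeDeploy: string[]; // reversible and tested: ships with the code
  separateRun: string[];    // irreversible: needs the review tag, runs on its own
  blocked: string[];        // reversible but missing a tested down()
}

function planMigrations(migrations: Migration[]): MigrationPlan {
  const plan: MigrationPlan = { withCodeDeploy: [], separateRun: [], blocked: [] };
  for (const m of migrations) {
    // A migration is reversible only if every operation in it is.
    const reversible = m.ops.every((op) => REVERSIBLE[op]);
    if (!reversible) {
      plan.separateRun.push(m.name);
    } else if (!m.hasTestedDown) {
      plan.blocked.push(m.name);
    } else {
      plan.withCodeDeploy.push(m.name);
    }
  }
  return plan;
}
```

The third bucket is a design choice in this sketch, not something the rule above requires: it simply refuses to ship a reversible migration whose down() has never been run.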
The two-key deploy
Production deploys require green CI and a human ack in our deploy channel. The friction is intentional. It's not slow, usually about 90 seconds, but it forces a pause.
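The gate itself is small enough to sketch. This is an illustration under assumptions, not our real deploy bot: `DeployRequest`, `checkDeployGate`, and the field names are hypothetical, and how the ack is collected from the channel is left out.

```typescript
type CiStatus = "green" | "red" | "pending";

// Hypothetical request shape; a real bot would build this from CI and chat events.
interface DeployRequest {
  commit: string;
  ciStatus: CiStatus;
  ackedBy: string | null; // who acknowledged in the deploy channel, if anyone
}

interface GateResult {
  allowed: boolean;
  reason?: string; // set only when the deploy is refused
}

// Both keys must be present: green CI and a human ack.
function checkDeployGate(req: DeployRequest): GateResult {
  if (req.ciStatus !== "green") {
    return { allowed: false, reason: `CI is ${req.ciStatus}` };
  }
  if (req.ackedBy === null || req.ackedBy.trim() === "") {
    return { allowed: false, reason: "waiting for a human ack" };
  }
  return { allowed: true };
}
```

Checking CI first means nobody is asked to ack a build that cannot ship anyway.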
Feature flags > big-bang releases
Anything risky goes behind a flag. We can turn it off without a redeploy. We've used this dozens of times to dodge incidents.
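What makes this work is that the flag is read at runtime from somewhere outside the build. A minimal sketch, assuming a hypothetical `FlagStore` interface; the in-memory store stands in for whatever database, config service, or admin panel actually holds the flags, and the flag name is made up.

```typescript
// Hypothetical interface: any backend that can answer "is this flag on?" at runtime.
interface FlagStore {
  get(name: string): boolean | undefined;
}

// In-memory stand-in for a real store, used here only for illustration.
class InMemoryFlagStore implements FlagStore {
  private flags: Record<string, boolean> = {};

  set(name: string, enabled: boolean): void {
    this.flags[name] = enabled;
  }

  get(name: string): boolean | undefined {
    // Own-property check so names like "constructor" are not mistaken for flags.
    return Object.prototype.hasOwnProperty.call(this.flags, name)
      ? this.flags[name]
      : undefined;
  }
}

// Unknown flags default to off, so a missing flag never enables the risky path.
function isEnabled(store: FlagStore, name: string): boolean {
  return store.get(name) === true;
}

// The risky path and the old path both stay deployed; the flag picks one per request.
function checkoutFlow(store: FlagStore): "new-checkout" | "legacy-checkout" {
  return isEnabled(store, "new-checkout") ? "new-checkout" : "legacy-checkout";
}
```

Flipping the value in the store changes what `checkoutFlow` returns on the next call, with no build or deploy in between.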
Post-incident, not post-mortem
When something does break, we write a short doc — what broke, why, what's now different — and link it in the project README. No blame. The doc is the artefact.
What we don't do
- We don't write 200-line CI pipelines. Ours are usually under 40.
- We don't aim for zero downtime on every project — only the ones that need it.
- We don't try to be Google. We're two engineers shipping for Kenyan SMEs. The standards match the stakes.
This stuff is boring. That's exactly the point.