
Zero-Downtime Migration: A Practical Cutover Plan

October 11, 2025 Web Hosting by Elio

A true zero-downtime migration keeps users transacting while you move an application, database, or entire site to new infrastructure. The padlock stays green, sessions remain valid, carts don’t empty themselves, and search crawlers continue to see a healthy origin. Achieving that result isn’t magic; it’s the outcome of a precise cutover plan that sequences replication, routing, and validation in the right order.

Define the target and the risk window

Begin by writing down what “zero” means for your project. For a content site, a few stale assets might be acceptable; for a checkout flow, even a brief write failure is not. Note the SLOs you must protect—availability, error rate, and latency—and the transactions that matter most. With that scope set, choose the migration weekend or off-peak window that lowers concurrent write pressure without turning this into a 3 a.m. fire drill.

Build a production-grade twin

Provision the new environment so that it mirrors production: same runtime, extensions, TLS, worker counts, cache layers, and file paths. Keep secrets, API keys, and environment variables in place, but fence outbound calls so the twin doesn’t accidentally send real emails or charge real cards during rehearsals. If your application relies on a CDN or WAF, wire the twin to staging subdomains that replicate the production policies.

Sync code and assets continuously

Do an initial copy of the application code and user-generated assets to the new origin, then keep them in sync until cutover. For traditional file trees, rsync with checksums and delete flags keeps parity accurate:

rsync -aH --checksum --delete --info=progress2 /var/www/ app@new-origin:/var/www/
rsync -aH --checksum --delete /var/uploads/ app@new-origin:/var/uploads/

If assets live in object storage, enable bucket replication or run periodic sync jobs so that a late-night upload doesn’t vanish at cutover.

Choose a database strategy that fits your writes

Databases decide whether “zero” is truly zero. For MySQL or MariaDB, set up a replica that follows the primary’s binary logs; row-based replication is the safer format because it copies the changed rows themselves rather than re-running statements that may be non-deterministic. For PostgreSQL, use streaming replication with a physical replication slot, or logical replication when schemas are evolving. Take a fresh base backup, start replication, and let it catch up until the replica’s lag hovers near zero. If you must run schema changes that aren’t backward compatible, plan a brief application quiesce in which writes are paused while the migration runs; for many systems that pause lasts seconds when you prepare the DDL work ahead of time.
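
One way to make “hovers near zero” concrete is a small gate in the cutover script. The sketch below assumes you already have the lag as a whole number of seconds, for example from Seconds_Behind_Source in MySQL’s SHOW REPLICA STATUS (Seconds_Behind_Master on older MySQL and on MariaDB) or from now() - pg_last_xact_replay_timestamp() in PostgreSQL. The function name replica_ready and the one-second default are illustrative assumptions, not part of any tool.

```shell
# replica_ready LAG_SECONDS [MAX_SECONDS]
# Succeeds only when LAG_SECONDS is a whole number no greater than MAX_SECONDS.
# Hypothetical helper; the default threshold of 1 second is an assumption.
replica_ready() {
    lag="$1"
    max="${2:-1}"
    case "$lag" in
        ''|*[!0-9]*) return 1 ;;  # empty, NULL, or non-numeric: treat replication as unhealthy
    esac
    [ "$lag" -le "$max" ]
}

# Example: the lag value would come from your own monitoring query.
if replica_ready 0 1; then
    echo "replica caught up"
fi
```

Treating a missing or NULL value as “not ready” is deliberate: a stopped replication thread often reports no lag at all, which is the opposite of caught up.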

Lower DNS TTL early and stage your routing

Days before cutover, and at least one full period of the old TTL ahead of it, reduce the DNS TTL on the production hostname to something short (five minutes is a common choice) so that caches expire quickly when you flip. If you control routing through a load balancer, traffic manager, or CDN, prepare a blue-green switch: the “blue” stack serves production; the “green” stack is your twin. Weighted routing or origin failover lets you canary a small percentage of users before shifting everyone.
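
To confirm that the lowered TTL is what the world actually sees, you can read it from the answer section of a dig query. This is a sketch: answer_ttl and ttl_at_most are hypothetical helper names, www.example.com and ns1.example.com stand in for your hostname and authoritative name server, and it assumes dig’s usual five-column answer format.

```shell
# Print the TTL (second column) of the first answer record read from stdin.
# Expects lines shaped like: name TTL class type data
answer_ttl() {
    awk 'NF >= 5 && $2 ~ /^[0-9]+$/ { print $2; exit }'
}

# ttl_at_most MAX: succeed when a TTL is found on stdin and it is <= MAX.
ttl_at_most() {
    ttl="$(answer_ttl)"
    [ -n "$ttl" ] && [ "$ttl" -le "$1" ]
}

# Usage (placeholder names). Asking the authoritative server shows the
# configured TTL rather than a resolver's remaining countdown:
#   dig +noall +answer www.example.com A @ns1.example.com | ttl_at_most 300
```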

Warm the caches so the first impression is fast

An empty cache will make a perfect migration feel slow. Prime your application cache with the hottest pages and endpoints, and pre-warm the CDN by fetching representative URLs from multiple regions. If you use database query caching or object caches, start the green stack under a synthetic load so it builds the same hot sets blue has built over time.
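
A warm-up pass can be as small as a loop over a list of hot URLs. The sketch below is one possible shape, not a prescribed tool: warm_urls and the WARM_FETCH override are hypothetical names, hot-urls.txt is a placeholder file, and the curl flags are ordinary ones (fail on HTTP errors, discard the body, cap each request at ten seconds).

```shell
# warm_urls: read one URL per line from stdin and fetch each once.
# Blank lines and lines starting with # are skipped.
# WARM_FETCH is a hypothetical override for the fetch command (handy in rehearsals).
warm_urls() {
    warmed=0
    failed=0
    while IFS= read -r url; do
        case "$url" in
            ''|'#'*) continue ;;
        esac
        # Left unquoted on purpose so the command and its flags split into words.
        # stdin is redirected so the fetch command cannot consume the URL list.
        if ${WARM_FETCH:-curl -fsS -o /dev/null --max-time 10} "$url" </dev/null; then
            warmed=$((warmed + 1))
        else
            failed=$((failed + 1))
        fi
    done
    echo "warmed=$warmed failed=$failed"
    [ "$failed" -eq 0 ]
}

# Usage (placeholder file name):
#   warm_urls < hot-urls.txt
```

Run it from a host in each region you care about so that more than one CDN edge gets primed.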

Rehearse the entire move

Run a full dress rehearsal with a recent production snapshot. Point a private hostname at green, walk through login, search, add-to-cart, checkout, and any critical admin flows, and exercise your observability stack. Treat this as a real event: log timestamps, record commands, and write down every fix so the runbook improves before the actual day.

Execute the cutover as a deliberate sequence

Freeze deploys to avoid surprises, ensure replication is healthy, and drain background jobs that create new writes. Switch the application to read-only just long enough to make the final incremental sync and to let the replica apply the last writes. Promote the replica to primary on the green side, update connection strings or endpoints, and restore write access. Flip routing by changing the load-balancer target, adjusting the CDN origin, or updating the DNS A/AAAA records, then hold your position and watch the gauges. Error rate, p95 latency, and queue depths are your first truths; logs should confirm that real users are logging in, placing orders, and receiving confirmations without anomalies.
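
The sequence above maps naturally onto a small runner that logs a UTC timestamp for each step and stops at the first failure. This is a skeleton under assumptions: run_step and run_sequence are hypothetical helpers, and the step names in the usage comment are placeholders for functions you would write against your own stack.

```shell
# run_step NAME: run the command or function called NAME, logging start and result.
run_step() {
    step_name="$1"
    printf '%s START %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$step_name"
    if "$step_name"; then
        printf '%s OK %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$step_name"
    else
        printf '%s FAIL %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$step_name"
        return 1
    fi
}

# run_sequence NAME...: run the steps in order and stop at the first failure.
run_sequence() {
    for seq_step in "$@"; do
        run_step "$seq_step" || {
            echo "cutover halted at: $seq_step" >&2
            return 1
        }
    done
}

# Usage (placeholder step functions you define yourself):
#   run_sequence freeze_deploys check_replication drain_jobs enable_read_only \
#       final_sync promote_replica update_endpoints restore_writes flip_routing
```

Stopping at the first failed step keeps a half-finished cutover from flipping routing onto a database that was never promoted, and the timestamped log doubles as the record of the event.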

Verify what users actually do

Synthetic monitoring catches the basics, but real confidence comes from live journeys. Open an incognito session and complete the flows your customers use most often. Check outbound integrations—payments, email, shipping, analytics—and confirm that webhooks arrive at the new endpoints. Crawl a set of top pages to ensure that canonical tags, sitemaps, robots rules, and redirects still behave as they did on blue.

Keep rollback practical and close

A rollback isn’t failure; it’s an insurance policy. Keep the blue environment warm and continue to accept reads for a short window. If hard errors rise or a critical integration misbehaves, reverse the routing change and point writes back to blue. Because writes occurred on green during the attempt, document how you will merge or replay them later; in many stacks that’s as simple as resubmitting queued jobs or exporting a narrow slice of rows.
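
It helps to agree on the rollback trigger before the cutover rather than during it. As a sketch, assuming you can pull an error count and a request count for the observation window from your metrics, the hypothetical should_roll_back helper below compares the error percentage against a threshold you choose; the 1% default is only an example.

```shell
# should_roll_back ERRORS TOTAL [MAX_PERCENT]
# Succeeds (meaning: roll back) when ERRORS/TOTAL exceeds MAX_PERCENT.
# Uses integer math only: errors*100 > total*max_percent.
should_roll_back() {
    errors="$1"
    total="$2"
    max_pct="${3:-1}"
    [ "$total" -gt 0 ] || return 1   # no traffic observed yet, so no evidence either way
    [ $((errors * 100)) -gt $((total * max_pct)) ]
}

# Example with made-up counts: 30 errors in 1000 requests is 3%, above a 1% limit.
if should_roll_back 30 1000 1; then
    echo "error budget exceeded: reverse the routing change"
fi
```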

Close out with discipline

Once green has served real traffic cleanly through your observation window, retire blue methodically. Remove write privileges, revoke credentials, and schedule decommissioning after your post-mortem. Return DNS TTLs to sane values, remove any temporary feature flags, and update runbooks so the next migration benefits from this one. Share a short internal summary with timing, metrics, and the few issues you fixed along the way; that document becomes your template.

A brief note on common pitfalls

Most migration pain comes from three sources: hidden writes you forgot to quiesce (background cron jobs, webhook retries, third-party apps), caches that weren’t warmed (the new stack looks slow even though it’s healthy), and TTLs left high (clients continue to visit blue for hours). Each has a simple antidote when you plan ahead.

Zero-Downtime Migration: A Practical Cutover Plan isn’t just a title for search; it’s a promise you can keep. Build a faithful twin, replicate relentlessly, route traffic with intention, and watch the right dials while you shift. Do those things in that order and users won’t notice they moved house, even if you moved an entire neighborhood under their feet.
