
Victhor Araújo
In 2025, three public cases in Brazil involved companies that discovered, on the day of the incident, that a backup configured 18-24 months earlier did not work: the file was corrupted, the schema was incompatible, or the job had simply not run for 60 days without anyone noticing. Recovery that should have taken hours took weeks, or never happened.
The rule is simple: an untested backup isn't a backup, it's operational fiction. A senior squad runs a quarterly validation protocol with every client: 4 steps, 4 hours, once every 3 months. Revin has run this protocol since 2023 and publishes the checklist so any client can replicate it.
This article is for CTOs, heads of operations, and founders who assume their backup is fine because 'we set it up a while ago' but have not tested a restore in the last year.

A real restore in an isolated environment is the only test that counts, not a check of the job log
Step 1: Inventory. List everything that would need restoring in an incident: transactional databases, blob storage with client data, infra configuration (IaC, secrets), client email history, and code repos (yes, GitHub goes down too).
Output: a list prioritized by criticality (P0 = stops the business, P1 = degrades operation, P2 = inconvenience).
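The inventory and its P0/P1/P2 ordering can be kept as plain data. The sketch below is a minimal illustration in Python; the `Asset` and `Criticality` names, the asset names, and the priority assigned to each asset are hypothetical, since the real assignment depends on each business.

```python
from dataclasses import dataclass
from enum import IntEnum


class Criticality(IntEnum):
    P0 = 0  # stops the business
    P1 = 1  # degrades operation
    P2 = 2  # inconvenience


@dataclass(frozen=True)
class Asset:
    name: str
    kind: str
    criticality: Criticality


def prioritize(assets):
    """Return assets sorted most critical first, ties broken by name."""
    return sorted(assets, key=lambda a: (a.criticality, a.name))


# Hypothetical inventory: the priorities here are examples, not recommendations.
inventory = [
    Asset("code repos", "git hosting", Criticality.P1),
    Asset("orders database", "transactional database", Criticality.P0),
    Asset("client email history", "email", Criticality.P2),
    Asset("client uploads", "blob storage", Criticality.P0),
    Asset("IaC and secrets", "infra configuration", Criticality.P1),
]

for asset in prioritize(inventory):
    print(f"{asset.criticality.name}  {asset.name} ({asset.kind})")
```

Keeping the list in version control next to the infrastructure code makes it easy to review each quarter, when new systems are most likely to have been added without a backup.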
Step 2: Real restore. Checking the backup job log is not enough. Restore into an isolated environment (anonymized staging, or an ephemeral environment created for the test) and validate data integrity, schema compatibility, and that the application can read the restored data.
Common mistake: assuming 'job ran successfully' means 'backup works'. It doesn't. The file can be corrupted, the format can be outdated, or a dependency can be missing.
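As a rough illustration of the three checks named above (integrity, schema, application reads), here is a minimal Python sketch that uses SQLite as a stand-in for the real database. The function names, the `orders` and `invoices` tables, and the file-copy "restore" are assumptions for the example; a production restore would use the engine's own restore tooling into an isolated environment.

```python
import hashlib
import shutil
import sqlite3
import tempfile
from pathlib import Path


def sha256_of(path):
    """Checksum of the backup file, to compare with the one recorded at backup time."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def restore_and_verify(backup_path, expected_sha256, required_tables):
    """Restore a backup into a scratch directory and return a list of failures."""
    failures = []
    if sha256_of(backup_path) != expected_sha256:
        failures.append("checksum mismatch: backup file changed or is corrupted")
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / "restored.db"
        shutil.copy(backup_path, restored)  # stand-in for the real restore step
        conn = sqlite3.connect(restored)
        try:
            # Integrity: let the database engine inspect the restored file.
            result = conn.execute("PRAGMA integrity_check").fetchone()[0]
            if result != "ok":
                failures.append(f"integrity check failed: {result}")
            # Schema compatibility: every table the application needs must exist.
            rows = conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
            tables = {row[0] for row in rows}
            missing = sorted(set(required_tables) - tables)
            if missing:
                failures.append(f"schema mismatch: missing tables {missing}")
            # Application read: run a query against each table that is present.
            for table in sorted(set(required_tables) & tables):
                conn.execute(f'SELECT * FROM "{table}" LIMIT 1').fetchall()
        except sqlite3.DatabaseError as exc:
            failures.append(f"restored database unreadable: {exc}")
        finally:
            conn.close()
    return failures


# Demo with a throwaway database standing in for last night's backup.
with tempfile.TemporaryDirectory() as workdir:
    backup = Path(workdir) / "backup.db"
    conn = sqlite3.connect(backup)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.execute("INSERT INTO orders (total) VALUES (19.90)")
    conn.commit()
    conn.close()
    good = restore_and_verify(backup, sha256_of(backup), ["orders"])
    missing_table = restore_and_verify(backup, sha256_of(backup), ["orders", "invoices"])

print("clean backup failures:", good)
print("old-schema backup failures:", missing_table)
```

An empty list means the restore passed all three checks. Note that a "job ran successfully" log line would have been produced in both demo cases, which is exactly why the log alone proves nothing.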
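Step 3: Measure RPO and RTO. RPO (Recovery Point Objective) is the maximum amount of data the business can afford to lose; what the test measures is the actual loss window between the last backup and the incident. If the backup runs daily at 3am and the incident happens at 5pm, 14h of data is lost, and with a daily backup the worst case approaches 24h.
RTO (Recovery Time Objective) is the maximum acceptable time to restore service; the test times the actual restore. If it took 6h in an isolated environment, expect 8-10h in production under pressure.
Compare both with business expectations: if the CFO expects a 30 min RTO and the current backup delivers 8h, that gap must be documented.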
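The arithmetic in this step is simple enough to script, so the same numbers land in every quarterly report. A minimal Python sketch using the figures from the text follows; the calendar date is arbitrary, and the 4/3x to 5/3x "pressure factor" is only an assumption derived from the 6h to 8-10h example, not a measured constant.

```python
from datetime import datetime, timedelta


def hours(delta):
    """Express a timedelta in hours for reporting."""
    return delta.total_seconds() / 3600


def rto_gap(delivered, expected):
    """How far the delivered restore time exceeds the business expectation."""
    return max(delivered - expected, timedelta(0))


# Illustrative date; only the times of day (3am backup, 5pm incident) come from the text.
last_backup = datetime(2025, 6, 10, 3, 0)
incident = datetime(2025, 6, 10, 17, 0)
lost = incident - last_backup

# Daily schedule: an incident just before the next run loses almost a full interval.
backup_interval = timedelta(days=1)

measured_restore = timedelta(hours=6)
# Assumed pressure factor, taken from the 6h -> 8-10h example.
low, high = measured_restore * 4 / 3, measured_restore * 5 / 3

expected_rto = timedelta(minutes=30)
gap = rto_gap(low, expected_rto)

print(f"data lost in this incident: {hours(lost):.1f} h")
print(f"worst-case data loss: up to {hours(backup_interval):.1f} h")
print(f"estimated production restore: {hours(low):.1f}-{hours(high):.1f} h")
print(f"gap against expected RTO: {hours(gap):.1f} h")
```

The gap line is the number to put in front of the business: it is either accepted in writing or it becomes a project to shorten the backup interval or the restore path.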
Step 4: Report. The test output is one page covering what was tested, what worked, what failed, and the next actions to complete before the next test.
If anything failed, allocate people to the fix immediately. It's not 'we'll look into it'; it stays P0 until the next quarterly test.
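One way to keep the one-page report consistent from quarter to quarter is to generate it from a small structure. The Python sketch below is only an illustration: the `RestoreTestReport` class, its field names, and the sample entries are hypothetical, and the only rule it encodes is the one stated above (any failure becomes P0).

```python
from dataclasses import dataclass, field


@dataclass
class RestoreTestReport:
    period: str
    tested: list = field(default_factory=list)
    worked: list = field(default_factory=list)
    failed: list = field(default_factory=list)
    next_actions: list = field(default_factory=list)

    @property
    def follow_up_priority(self):
        """Any failure is P0; a clean test needs no follow-up priority."""
        return "P0" if self.failed else None

    def render(self):
        """Plain-text report with the four sections named in the protocol."""
        def section(title, items):
            lines = [f"{title}:"]
            lines += [f"  - {item}" for item in items] or ["  - (none)"]
            return lines

        out = [f"Backup restore test, {self.period}"]
        out += section("What was tested", self.tested)
        out += section("What worked", self.worked)
        out += section("What failed", self.failed)
        out += section("Next actions before the next test", self.next_actions)
        if self.failed:
            out.append("Follow-up priority: P0")
        return "\n".join(out)


# Hypothetical sample data for illustration only.
report = RestoreTestReport(
    period="Q3",
    tested=["orders database", "client uploads"],
    worked=["orders database"],
    failed=["client uploads: restored archive unreadable"],
    next_actions=["fix upload backup job and rerun the restore"],
)
print(report.render())
```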

A senior squad validates backups every quarter in 4 steps (public checklist available)
Across all Revin clients, the 4-step protocol is scheduled automatically on the calendar every quarter. The tech lead facilitates, with 2 senior engineers present. The output goes to the client as a report. If something failed, a P0 item is opened in the backlog before the next sprint.
📢 Want to run this protocol on your current system? Book a Diagnostic Sprint: Revin executes the first cycle in 1 week and delivers the checklist and report so you can repeat it quarterly.
Configuring a backup is a 1-day task. Validating it quarterly is an ongoing practice. Senior squads do both; generic squads do only the first and discover the mistake on the day of the incident.
📢 See Revin's Security Foundations model — backup validation is part of the scope.