
DORA metrics are the wrong thermometer for 5-person teams — and what to measure instead

DORA became jargon. But the 4 metrics were designed for 50+ dev teams with mature pipelines. In a 5-person squad, they distort decisions. See the 4 alternative metrics Revin reports — calibrated to actual size.


By Victhor Araújo

DORA became mandatory jargon in engineering conversations. Deploy frequency, lead time for changes, change failure rate, time to restore. Every founder has heard them; every CTO cites them in board meetings. But most operate in 4-8 person squads, and these 4 metrics were designed for teams of 50+ with mature pipelines.

In a small squad, DORA distorts more than it informs. Revin proposes 4 alternative metrics calibrated for teams of 4-15 devs, which we report weekly across all clients. This isn't about discarding DORA; it's about using the right ruler for the actual team.

This article is for CTOs and founders who adopted DORA because "the market uses it" and are now making questionable decisions based on it in small squads.

High deploy frequency in a small team can mask insignificant releases

🚧 Why DORA distorts in a small squad

Deploy frequency: maximizing incentivizes fragmentation

In big teams, high deploy frequency signals a mature pipeline. In a 5-person squad, making that metric the priority leads to cosmetic deploys (copy tweak, CSS adjustment) just to bump the number. Metric rises; product doesn't move.

Lead time for changes: high variance makes median useless

Big teams have a statistical distribution that makes median representative. A 5-person squad has 1-2 big features (4 weeks) + 10 small fixes (1 day). The median lies about reality.
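A hypothetical quarter makes the distortion concrete (all numbers invented to match the shape described above):

```python
import statistics

# Hypothetical quarter for a 5-person squad:
# 2 big features (~4 weeks each) plus 10 one-day fixes.
lead_times_days = [28, 26] + [1] * 10

median = statistics.median(lead_times_days)  # -> 1.0
mean = statistics.mean(lead_times_days)      # -> ~5.3

print(f"median lead time: {median} days")
print(f"mean lead time:   {mean:.1f} days")
```

The median reports a 1-day lead time for a squad whose real feature work takes a month. Neither number alone describes this squad well; the distribution is simply too bimodal for a single summary statistic.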

Change failure rate: small sample becomes noise

With 8 deploys per month, 1 failure = 12.5% change failure rate. With 80 deploys, 1 failure = 1.25%. Same reality, 10x different numbers. Decisions on this metric in a small squad are decisions on noise.
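One way to see how little a single failure tells you at this volume is to put a 95% confidence interval around the observed rate. A sketch using the Wilson score interval, with the deploy counts from the paragraph above:

```python
import math

def wilson_interval(failures: int, deploys: int, z: float = 1.96):
    """95% Wilson score interval for an observed failure proportion."""
    p = failures / deploys
    denom = 1 + z**2 / deploys
    center = (p + z**2 / (2 * deploys)) / denom
    margin = (z / denom) * math.sqrt(
        p * (1 - p) / deploys + z**2 / (4 * deploys**2)
    )
    return center - margin, center + margin

# Same 12.5% observed rate, very different certainty:
lo, hi = wilson_interval(1, 8)      # small squad: roughly 2%..47%
lo2, hi2 = wilson_interval(10, 80)  # bigger sample: roughly 7%..22%
```

At 8 deploys, the "true" failure rate is statistically compatible with anything from near-zero to nearly half of all deploys. That is the noise you would be managing by.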

Time to restore: rarely triggered in small teams

A small squad has sparse incidents. If you had 2 incidents in a quarter, your "average time to restore" is an average over N=2. That sample carries no statistical weight for decisions.

📊 The 4 metrics Revin reports instead

1. Feature outcome rate

Of the features shipped in the quarter, how many hit the business metric defined in their scope? For example: "feature X was supposed to lift conversion by 5%; did it?" This measures real outcome, not activity.
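The calculation itself is trivial once each feature carries a target; a minimal sketch (feature names and targets are invented examples):

```python
# Hypothetical quarter: each shipped feature with the business target
# defined in its scope, and whether the measured result hit it.
features = [
    {"name": "checkout-v2", "target": "conversion +5%", "hit": True},
    {"name": "referral-program", "target": "signups +10%", "hit": False},
    {"name": "onboarding-tour", "target": "activation +8%", "hit": True},
]

outcome_rate = sum(f["hit"] for f in features) / len(features)
print(f"feature outcome rate: {outcome_rate:.0%}")  # -> 67%
```

The hard part is organizational, not computational: the target must exist before implementation starts, or there is nothing to score against.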

2. Rework rate (% of PRs reopened in 14 days)

A merged PR later reverted/fixed within 14 days indicates weak definition of done, shallow code review, or incomplete requirement. Direct quality metric, robust on small samples.

3. Cost per shipped feature (USD)

(dev hours × hourly cost + infra + rework cost) ÷ features shipped in the period. A metric the CFO understands. Robust because it aggregates several dimensions into one number.
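A worked example of the formula, with every figure invented for illustration:

```python
# Hypothetical month for a 5-dev squad (all figures are examples).
dev_hours = 5 * 160        # 5 devs, ~160 hours each
hourly_cost_usd = 60       # blended cost per dev hour
infra_usd = 2_500          # cloud + tooling for the month
rework_usd = 4_000         # hours spent on reverts/fixes, priced at the same rate
features_shipped = 4

total_usd = dev_hours * hourly_cost_usd + infra_usd + rework_usd
cost_per_feature = total_usd / features_shipped
print(f"cost per shipped feature: ${cost_per_feature:,.0f}")  # -> $13,625
```

Tracking the trend matters more than the absolute number: if cost per feature climbs month over month, something (scope, rework, headcount allocation) is drifting.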

4. Forecast accuracy (% of sprints closed on estimated date)

Senior squads estimate and hit; reactive squads estimate and rarely hit. Direct predictability metric, enormous value for external stakeholders.
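The metric reduces to a simple ratio over recent sprints (the sprint outcomes below are invented):

```python
# Hypothetical last 8 sprints: did the sprint close on its estimated date?
sprints_on_time = [True, True, False, True, True, True, False, True]

forecast_accuracy = sum(sprints_on_time) / len(sprints_on_time)
print(f"forecast accuracy: {forecast_accuracy:.0%}")  # -> 75%
```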

4 metrics calibrated to the squad's real size: outcome, quality, cost, predictability

🛠️ How to start measuring these 4

  • Outcome rate: requires feature scope to have a business metric defined BEFORE implementation. Tech lead enforces this pattern.
  • Rework rate: simple GitHub/GitLab script that detects PRs reverted or modified within 14 days.
  • Cost per feature: monthly spreadsheet with allocated hours + average cost + features shipped. No expensive tool needed.
  • Forecast accuracy: each sprint, planned vs. delivered. Accumulating across 8 sprints already gives statistical signal.
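The rework-rate bullet above mentions a simple GitHub/GitLab script. A minimal sketch of the core logic: the PR data would come from GitHub's `GET /repos/{owner}/{repo}/pulls?state=closed` endpoint, and the title-matching heuristic (GitHub's revert button produces titles like `Revert "original title"`) plus the 14-day window are assumptions you would tune to your own workflow:

```python
from datetime import datetime, timedelta

REWORK_WINDOW = timedelta(days=14)

def rework_rate(merged_prs: list[dict]) -> float:
    """merged_prs: dicts with 'title' and 'merged_at' (datetime), e.g. parsed
    from GitHub's list-pull-requests endpoint. Counts a PR as rework when a
    revert of it merges within REWORK_WINDOW of the original."""
    merged_at_by_title = {pr["title"]: pr["merged_at"] for pr in merged_prs}
    reworked = 0
    for pr in merged_prs:
        title = pr["title"]
        if title.startswith('Revert "') and title.endswith('"'):
            original = merged_at_by_title.get(title[len('Revert "'):-1])
            if original and pr["merged_at"] - original <= REWORK_WINDOW:
                reworked += 1
    return reworked / len(merged_prs)

# Invented sample data:
prs = [
    {"title": "Add coupon codes", "merged_at": datetime(2025, 3, 1)},
    {"title": 'Revert "Add coupon codes"', "merged_at": datetime(2025, 3, 6)},
    {"title": "Fix signup copy", "merged_at": datetime(2025, 3, 2)},
    {"title": "New pricing page", "merged_at": datetime(2025, 3, 4)},
]
print(f"rework rate: {rework_rate(prs):.0%}")  # -> 25%
```

Title matching misses fix-forward PRs; a fuller version might also flag PRs that touch the same files as a recent merge, but this captures the clearest rework signal cheaply.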

📢 Want to implement these 4 metrics in your squad in 2 weeks? Book a Diagnostic Sprint — Revin defines calculation, data source, and dashboard for each.

🎯 Conclusion: DORA is the right ruler for the wrong team

DORA is excellent for Google, Spotify, Netflix. It is inadequate for a 5-person squad. Senior squads use the right ruler for their actual size; they don't copy whatever a Silicon Valley blog post recommends.

📢 Revin reports these 4 metrics with every client since 2024. See the delivery model.
