Methodology

How a 90-Day Pre-ATO Sprint Actually Runs

Most federal engineering engagements fail at planning, not execution. This is how ACG structures a deadline-driven infrastructure sprint — what happens each week, what decisions get made when, and where the work tends to actually break. It's not a framework diagram. It's the playbook I'd give someone who had to run one of these tomorrow.

Precondition: only run this if you have to

Before anything else: a 90-day pre-ATO sprint is an emergency response, not a delivery model. The right version of this work is a 120–180 day engagement with slack in it. If you have the option, run that instead. If you don't (you have a review date, a non-compliant workload, and no way to move the review), the rest of this playbook is for you.

Week 0 — the kickoff that isn't a kickoff

The single most common failure mode is a kickoff week spent on slide decks, org-chart introductions, and charter documents. Week 0 in a real sprint is a working session. The goals are concrete and there are four of them:

  • Decide the target architecture. Not "explore options." Decide. Serverless vs. containers. Managed database vs. self-managed. What's in the boundary, what's out. This decision is made in a single working day with the compliance lead in the room, and it is revisited only if it fails technical validation — not if someone has second thoughts.
  • Map the descope opportunities. Every component you don't migrate is a control family you don't have to evidence. Find the ones you can eliminate. Kubernetes → Lambda is the prototypical example; almost every migration has an equivalent.
  • Agree on what the compliance team will receive. Terraform codebase, control-mapping document, runbooks, architecture diagrams, evidence pipelines. Agree what each artifact looks like, who owns it, and when it's delivered. Agree this in writing before any code is written.
  • Lock the scope. Write down what's in, what's out, and what the definition of "pre-ATO delivered" means. Attach it to the engagement. Refer back to it every Monday.

By the end of week 0, the engineering team should be able to start building. If it can't, the engagement is already behind.
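The descope exercise from the list above can be sketched as a set difference. This is a minimal illustration, not ACG's tooling: the component names and the component-to-family mapping are invented, and the family labels are placeholders loosely styled after NIST SP 800-53 family IDs. A real mapping comes out of the week 0 session with the compliance lead.

```python
# Hypothetical mapping from components to the control families each one
# brings into the boundary. The labels and the assignments are placeholders
# for illustration only; they are not a real control assessment.
COMPONENT_CONTROLS = {
    "kubernetes-cluster": {"AC", "AU", "CM", "SC", "SI"},
    "lambda-functions": {"AC", "AU", "CM"},
    "self-managed-postgres": {"AU", "CM", "CP", "SC"},
    "aurora-cluster": {"AU", "CP"},
}


def families_in_scope(components):
    """Return the union of control families the given components require."""
    families = set()
    for name in components:
        families |= COMPONENT_CONTROLS[name]
    return families


def descope_savings(current, target):
    """Return families required today that the target architecture drops."""
    return families_in_scope(current) - families_in_scope(target)


current = ["kubernetes-cluster", "self-managed-postgres"]
target = ["lambda-functions", "aurora-cluster"]
print(sorted(descope_savings(current, target)))
```

Under this made-up mapping the target architecture drops two families. The useful part is the shape of the exercise: write the mapping down once, then let every architectural option be compared by what it removes from the evidence burden.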

Weeks 1–2 — build the skeleton, not the details

The first two weeks are about proving the architecture end-to-end before filling it in. A skeleton Lambda, a skeleton API Gateway route, a skeleton Aurora cluster, a skeleton Terraform layout, a skeleton CI/CD pipeline. Every service that's going to be in the target architecture gets a minimum-viable instance of itself in Terraform, and the pipeline is already pushing from a mirrored GitHub repo into GovCloud.

The point of the skeleton is to find the integration problems before you're under deadline pressure. GovCloud has partition-specific quirks (IAM partition ARNs, service endpoint names, CodePipeline-to-GitHub source limitations, CloudFront behavior, ACM certificate regions) that will cost you a week of debugging if you discover them in week 10. Discover them in week 2 and they cost you a day.
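One of the cheapest partition quirks to catch early is an ARN hard-coded to the commercial partition. GovCloud ARNs start with arn:aws-us-gov rather than arn:aws, so a string copied from a commercial account fails at apply time. The sketch below scans text for such strings; the function names and the sample snippet are hypothetical, and a real check would walk the Terraform directory in CI.

```python
import re

# Matches ARNs pinned to the commercial partition, e.g. "arn:aws:iam::...".
# "arn:aws-us-gov:..." does not match because a colon must follow "aws".
COMMERCIAL_ARN = re.compile(r"arn:aws:[a-z0-9-]+:")


def partition_for_region(region):
    """Return the ARN partition name for an AWS region name."""
    if region.startswith("us-gov-"):
        return "aws-us-gov"
    if region.startswith("cn-"):
        return "aws-cn"
    return "aws"


def find_hardcoded_arns(text):
    """Return (line_number, line) pairs containing a commercial-partition ARN."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if COMMERCIAL_ARN.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical Terraform fragment: one hard-coded commercial ARN, one
# GovCloud ARN, and one that resolves the partition at plan time.
sample = '''policy_arn = "arn:aws:iam::aws:policy/ReadOnlyAccess"
bucket_arn = "arn:aws-us-gov:s3:::example-bucket"
role_arn   = "arn:${data.aws_partition.current.partition}:iam::123456789012:role/app"
'''

for number, line in find_hardcoded_arns(sample):
    print(f"line {number}: {line}")
print(partition_for_region("us-gov-west-1"))
```

Only the first line of the sample is flagged. The third line shows the usual fix in Terraform: resolve the partition through the AWS provider's aws_partition data source instead of typing it.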

The compliance team gets the first version of the Terraform codebase at the end of week 2. Not polished. Not complete. Real, so they can start reading it and asking questions.

Weeks 3–6 — the build

This is where the actual work happens: migrating the application logic, wiring up the data stores, building the authorization flows, implementing the encryption boundaries, writing the observability stack, and producing the technical control artifacts in parallel. Every day ships something. Every week ends with a working deployment to a pre-production environment in GovCloud.

The compliance team gets weekly updates on what's implementable and what isn't. The engineering side pushes back on controls that can't be evidenced in the architecture as designed, and the compliance side pushes back on architectural choices that don't map cleanly to the control framework. This is a conversation, not a handoff. The best pre-ATO sprints have the compliance lead and the engineering lead in the same weekly working session, and the worst ones have them communicating through tickets.

Toil gets automated the first time it shows up. Alerts get tuned to actionable-only the first time a false positive happens. The operational posture of the production environment is being built alongside the infrastructure — not after.

Weeks 7–9 — cutover rehearsal and handoff

By week 7 the target environment is feature-complete and under load. The next three weeks are cutover rehearsal, runbook hardening, and handoff to the compliance team; the production cutover itself lands in week 10.

Cutover rehearsal matters because the production cutover is the single riskiest event in the engagement. Rehearsal happens in a pre-production environment that's configured as closely to production as possible. The rehearsal finds the things that aren't in the runbook — DNS TTLs, caching layers, clients with hard-coded endpoints, certificate pinning, anything that talks to an external service — and those things go into the real runbook before cutover day.
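DNS TTLs are a good example of a runbook item with arithmetic behind it. A resolver that cached a record just before you lower its TTL can keep serving the old answer for the full old TTL, so the TTL has to be lowered at least that long before cutover. The sketch below works that out; the dates, TTL values, and the one-hour safety margin are assumptions for illustration.

```python
from datetime import datetime, timedelta


def latest_ttl_change(cutover, current_ttl_seconds, margin_seconds=3600):
    """Return the latest moment the record's TTL can safely be lowered.

    Resolvers may hold the old answer for up to current_ttl_seconds after
    the change, so the change must precede cutover by at least that long.
    The margin is an arbitrary safety buffer on top.
    """
    return cutover - timedelta(seconds=current_ttl_seconds + margin_seconds)


def old_answer_expired_by(cutover, lowered_ttl_seconds):
    """Return when well-behaved resolvers should have dropped the old answer."""
    return cutover + timedelta(seconds=lowered_ttl_seconds)


cutover = datetime(2030, 1, 15, 14, 0)        # hypothetical cutover time
deadline = latest_ttl_change(cutover, 86400)  # record currently has a 24-hour TTL
print("lower the TTL no later than:", deadline)
print("old answer gone by:", old_answer_expired_by(cutover, 60))
```

This only covers resolvers that honor TTLs. Clients with hard-coded endpoints or their own caching still have to be found in rehearsal.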

The compliance handoff runs in parallel. The Terraform codebase, the control mapping document, the evidence pipelines, the runbooks, and the architecture documentation all get delivered to the compliance lead with enough time for them to build their final A&A package without rushing.

Week 10 — cutover and the buffer

Cutover happens in week 10. Weeks 11 and 12 exist for one reason: the things you didn't plan for. They happen every time. Plan the work for weeks 0–9, cut over in week 10, and leave weeks 11–12 as buffer. If nothing goes wrong, those weeks are spent on post-cutover operational tuning, runbook revisions, and a small list of nice-to-haves. If something goes wrong, those weeks are the difference between hitting the review date and missing it.
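The week numbering above can be turned into calendar dates in a few lines, which is handy for the scope document. This is a sketch under assumptions: the start date is made up, the phase labels are shorthand for the sections above, and week 0 is assumed to begin on the start date. Weeks 0 through 12 span 13 weeks, or 91 days.

```python
from datetime import date, timedelta

# Phases as (label, first_week, last_week), following the week numbering above.
PHASES = [
    ("kickoff working session", 0, 0),
    ("skeleton", 1, 2),
    ("build", 3, 6),
    ("cutover rehearsal and handoff", 7, 9),
    ("cutover", 10, 10),
    ("buffer", 11, 12),
]


def phase_calendar(start):
    """Map each phase label to (first_day, last_day), with week 0 starting on start."""
    calendar = {}
    for label, first_week, last_week in PHASES:
        first_day = start + timedelta(weeks=first_week)
        last_day = start + timedelta(weeks=last_week + 1) - timedelta(days=1)
        calendar[label] = (first_day, last_day)
    return calendar


start = date(2030, 1, 7)  # hypothetical engagement start date
for label, (first_day, last_day) in phase_calendar(start).items():
    print(f"{label:32} {first_day} to {last_day}")
```

Working backward is the same calculation in reverse: subtract 90 days from the review date to find the latest possible week 0, and treat anything later than that as eating into the buffer.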

What tends to actually break

  • The CI/CD pipeline into GovCloud. GovCloud has partition-specific behavior around source connections, IAM, and CodePipeline source actions that bite every project at least once. Budget a day to debug this in week 2 and don't be surprised when it's a day and a half.
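  • Cross-partition DNS and certificate management. ACM certificates for CloudFront must live in us-east-1, which is in the commercial partition; GovCloud workloads live in a separate partition with separate accounts and credentials. Anything that spans the two (a certificate or DNS record on the commercial side fronting an origin in GovCloud) has to be coordinated across both, and that coordination is a real gotcha.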
  • Compliance-team capacity. If the compliance lead is part-time on your engagement or fielding three other A&A packages in parallel, the handoff will not happen on schedule. The answer is to agree to a weekly working session with the compliance lead as a hard commitment in week 0, not to assume you can negotiate it later.
  • Scope creep disguised as "just one more thing." A 90-day sprint has zero tolerance for scope expansion. The answer is the scope document from week 0 and a weekly review against it. Things that don't make the cut go into a follow-up backlog.
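The weekly review against the scope document can be as mechanical as the sketch below. The class, its field names, and the example work items are all hypothetical; the point is that every request gets one of three answers and nothing outside the written scope enters the sprint.

```python
from dataclasses import dataclass, field


@dataclass
class ScopeDocument:
    """The week 0 scope agreement, reduced to two sets and a backlog."""
    in_scope: set
    out_of_scope: set
    backlog: list = field(default_factory=list)

    def triage(self, request):
        """Classify a request; anything not in scope goes to the backlog."""
        if request in self.in_scope:
            return "in scope"
        if request not in self.backlog:
            self.backlog.append(request)
        if request in self.out_of_scope:
            return "out of scope (agreed in week 0)"
        return "not in the scope document: deferred"


# Hypothetical scope and requests for illustration.
scope = ScopeDocument(
    in_scope={"api migration", "aurora cutover", "evidence pipeline"},
    out_of_scope={"admin ui redesign"},
)
for request in ["aurora cutover", "admin ui redesign", "new reporting dashboard"]:
    print(f"{request}: {scope.triage(request)}")
print("follow-up backlog:", scope.backlog)
```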

What this approach is not

This is the structure I use when the clock is the primary constraint and the outcome is a defensible A&A package. For engagements that aren't deadline-driven, the right approach is slower, more exploratory, and more iterative, closer to "Build for Change" and "Great is the Enemy of Good" on the homepage. And if you're still wondering whether a 90-day sprint is coming for you, start the 120-day version today, before it becomes the 90-day version.
