
How We Build: The xcactus delivery process in 2026

A transparent look at how AI is embedded in every stage of our software delivery, and where it isn't.




Why we're publishing this.
Most software houses describe their process in vague terms: "agile methodology," "iterative development," "client-centric approach." We did too, for years. But our process has fundamentally changed. AI isn't a tool we added; it restructured how we work. We think prospective clients (and the industry) deserve to see exactly what that looks like. This is the unedited version: what works, what doesn't, and where humans are still irreplaceable.
Phase 1: Discovery and requirements

What happens: Every client conversation is transcribed. An analyst reviews the transcript and produces a structured brief. That brief, combined with the raw transcript, feeds into an AI agent that generates a preliminary specification: user stories, functional requirements, edge cases, constraints, and open questions.
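The transcript-plus-brief pipeline can be sketched in a few lines. This is an illustrative stand-in only: `PreliminarySpec`, `build_spec`, and the `REQ:`/`TODO:` brief markers are invented names, and the real system calls an AI agent where this sketch does simple string matching.

```python
from dataclasses import dataclass, field

@dataclass
class PreliminarySpec:
    """Output of the discovery stage: one structured artifact per conversation."""
    user_stories: list[str] = field(default_factory=list)
    requirements: list[str] = field(default_factory=list)
    edge_cases: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)

def build_spec(transcript: str, analyst_brief: str) -> PreliminarySpec:
    """Stand-in for the AI agent: merges the raw transcript with the
    analyst's brief into a structured spec. A real implementation would
    reason over free text; here we just parse illustrative markers."""
    spec = PreliminarySpec()
    for line in analyst_brief.splitlines():
        line = line.strip()
        if line.startswith("REQ:"):
            spec.requirements.append(line[4:].strip())
        elif line.startswith("TODO:"):
            spec.open_questions.append(line[5:].strip())
    # Passing mentions in the transcript become candidate requirements too,
    # so "it should work on mobile" at minute 47 doesn't get lost.
    if "mobile" in transcript.lower():
        spec.requirements.append("Must work on mobile (mentioned in call)")
    return spec
```

The point of the shape, not the parsing: every conversation ends as one structured artifact with explicit open questions, instead of notes scattered across someone's notebook.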

What the AI does well: It catches requirements that humans miss in conversation. When a client says "and it should work on mobile too" in passing at minute 47 of a call, that doesn't get lost in someone's notes. It becomes a formal requirement with implications traced across the spec.

It also identifies contradictions. "The system should support offline mode" alongside "all data must be real-time synchronized" is the kind of conflict a human analyst might not flag until the architecture phase. The AI flags it immediately.
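A toy version of that contradiction check, to make the idea concrete: the rule table and function names below are purely illustrative (a real agent reasons over free text rather than keyword pairs).

```python
# Keyword pairs known to conflict; a deliberately tiny, made-up rule table.
CONFLICT_RULES = [
    ({"offline"}, {"real-time", "realtime"}),
    ({"anonymous"}, {"audit trail"}),
]

def find_contradictions(requirements: list[str]) -> list[tuple[str, str]]:
    """Return requirement pairs whose keywords hit a known conflict rule."""
    hits = []
    lowered = [r.lower() for r in requirements]
    for left_kw, right_kw in CONFLICT_RULES:
        lefts = [r for r, low in zip(requirements, lowered)
                 if any(k in low for k in left_kw)]
        rights = [r for r, low in zip(requirements, lowered)
                  if any(k in low for k in right_kw)]
        hits += [(l, r) for l in lefts for r in rights]
    return hits
```

Run against the example above, it returns the offline/real-time pair; run against a clean spec, it returns nothing. That is the whole value: the flag exists before anyone has drawn an architecture diagram.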

What the AI gets wrong: Context and priority. The AI doesn't know that when the client mentioned "reporting module," they actually mean a simple CSV export, not a full BI dashboard. It doesn't sense that one stakeholder's requirement is actually a negotiating position, not a real need. A senior analyst reviews every generated spec and applies business judgment before anything moves forward.

What the client sees: A structured specification document, far more detailed than what most software houses produce at this stage, plus access to an AI agent that holds the entire specification. The client can have a conversation with this agent: challenge assumptions, ask "what if we changed X?", explore scope adjustments. The agent understands the full context and can explain how a change in one area affects others.

This replaces the traditional cycle of "send document, wait for comments, schedule a call, discuss, update document, repeat." The client gets answers in real time, and every interaction is captured and reflected in the living specification.

Phase 2: Specification and validation

What happens: Through a combination of additional calls and agent sessions, the specification is refined until the client confirms scope. Once the business specification is locked, mockups are generated, again by an AI agent that understands the full specification context.

Why this matters: In the traditional process, a business analyst writes the spec and sends it to a UI/UX designer; the designer interprets the spec and produces mockups; the client reviews them, finds misalignment, and the loop repeats. Each handoff introduces interpretation drift.

Our process: the same system that holds the specification generates the mockups. There is no interpretation gap because there is no handoff. When the client says "this screen should show the order summary," the mockup reflects exactly what "order summary" means in the specification, not what a designer assumed it means.

What still requires humans: Visual design quality, brand alignment, and UX intuition. AI generates functional mockups that are structurally correct. A designer refines them into something that feels right. The AI ensures correctness; the human ensures craft.

Phase 3: Technical architecture and planning

What happens: With confirmed business requirements and mockups, the AI generates technical documentation: architecture decisions, technology choices, integration points, data models, API contracts. This documentation is automatically verified against the latest versions of the frameworks, libraries, and cloud services in use; no more building on deprecated APIs because someone referenced six-month-old docs.

The project is then decomposed into a delivery schedule: phases, milestones, and individual changes. Each change gets a detailed implementation plan, not a vague user story, but a specific plan that describes what files will be modified, what tests will be written, what interfaces will change, and what dependencies exist.

The critical step most teams skip: Every implementation plan is analyzed for architectural consistency. Does this change contradict an earlier decision? Does it introduce a dependency that will complicate a later delivery phase? Does it duplicate logic that already exists in the codebase? In traditional development, this review is supposed to happen in pull requests. In practice, reviewers skim because they're busy. Our system does it before a single line of code exists.
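The plan structure and the pre-code consistency check can be sketched together. Everything here is hypothetical: `ChangePlan`, its field names, and the two checks (unapproved dependencies, two plans touching the same file as a cheap proxy for duplicated logic) are simplified stand-ins for what the real analysis covers.

```python
from dataclasses import dataclass, field

@dataclass
class ChangePlan:
    """One unit of the delivery schedule (field names are illustrative)."""
    name: str
    files_modified: list[str]
    tests_to_write: list[str]
    new_dependencies: list[str] = field(default_factory=list)

def consistency_issues(plans: list[ChangePlan],
                       approved_deps: set[str]) -> list[str]:
    """Flag problems before any code exists: dependencies outside the
    approved architecture, and overlapping file modifications."""
    issues = []
    seen_files: dict[str, str] = {}
    for plan in plans:
        for dep in plan.new_dependencies:
            if dep not in approved_deps:
                issues.append(
                    f"{plan.name}: dependency '{dep}' not in approved architecture")
        for f in plan.files_modified:
            if f in seen_files:
                issues.append(
                    f"{plan.name}: also modifies {f} (already touched by {seen_files[f]})")
            else:
                seen_files[f] = plan.name
    return issues
```

A human reviewer skimming a pull request might miss both of these; a mechanical pass over the plans, run before implementation starts, cannot.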

What requires human judgment: Technology selection at the strategic level. Trade-offs between build vs. buy. Decisions about what to defer. Estimating the "unknown unknowns", the risks that don't appear in any spec. An experienced technical lead reviews every architecture document and makes the calls that require intuition about systems, not just analysis of requirements.

Phase 4: Implementation

What happens: Approved implementation plans go to development. We use AI coding agents (Claude Code, Codex) orchestrated across the project. Each agent works within the boundaries of its implementation plan; it doesn't "vibe code" a feature from scratch. It executes a reviewed, approved plan with TDD: tests first, implementation second, validation third.
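The tests-first, implementation-second, validation-third loop looks like this in miniature (plain assertions stand in for a real test framework; `order_total` is an invented example function, not code from an actual project):

```python
# Step 1: the test exists before the implementation.
def test_order_total():
    assert order_total([("widget", 2, 9.99), ("gadget", 1, 25.00)]) == 44.98

# Step 2: the implementation is written to satisfy the test.
def order_total(lines: list[tuple[str, int, float]]) -> float:
    """Sum quantity * unit price across order lines, rounded to cents."""
    return round(sum(qty * price for _, qty, price in lines), 2)

# Step 3: validation; running the test confirms the plan was executed.
test_order_total()
```

The ordering is the point: the agent cannot declare the change done until the test written in step 1 passes in step 3.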

What makes this different from "developers using Copilot": Copilot suggests the next line. Our system executes a plan. The distinction matters enormously. A developer using Copilot still makes every structural decision: what to build, in what order, how to organize it, what to test. Our agents operate within a plan that has already been validated against the specification and architecture. The structural decisions were made in Phase 3 and reviewed by a human. The implementation is execution, not improvisation.

The procedures that AI actually follows: We've codified our engineering best practices into AI rules and skills, structured instructions that agents follow on every task. Test coverage requirements. Code organization patterns. Security checks. Documentation standards. Error handling conventions.

Here's the thing: these rules aren't new. We've had coding standards for years. Every software house does. The difference is that human developers treated them as guidelines. AI agents treat them as instructions. When the rule says "every public API endpoint must have input validation, error handling, and a corresponding integration test," the AI produces all three. Every time. A human developer under time pressure might skip the integration test and tell themselves they'll add it later. The AI doesn't have a "later."
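The quoted rule ("input validation, error handling, and a corresponding integration test") can be made concrete with a sketch. This is not our production code: the endpoint is a plain function rather than a real framework handler, and `create_user_endpoint`/`save_user` are invented names.

```python
import json

def create_user_endpoint(raw_body: str) -> tuple[int, dict]:
    """Returns (status_code, response_body), the way a framework handler would."""
    # Input validation: reject malformed JSON and invalid fields explicitly.
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, {"error": "request body must be valid JSON"}
    if not isinstance(body.get("email"), str) or "@" not in body["email"]:
        return 422, {"error": "a valid 'email' field is required"}
    # Error handling around the business logic.
    try:
        user_id = save_user(body["email"])
    except RuntimeError as exc:
        return 500, {"error": str(exc)}
    return 201, {"id": user_id, "email": body["email"]}

def save_user(email: str) -> int:
    return abs(hash(email)) % 10_000  # stand-in for persistence

# The corresponding integration test, produced alongside the endpoint.
def test_create_user_endpoint():
    assert create_user_endpoint("not json")[0] == 400
    assert create_user_endpoint('{"email": "nope"}')[0] == 422
    status, resp = create_user_endpoint('{"email": "a@b.co"}')
    assert status == 201 and resp["email"] == "a@b.co"

test_create_user_endpoint()
```

All three pieces ship together because the rule generates them together; there is no "later" in which the test could have been skipped.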

What still requires humans: Code review of the output, especially at integration boundaries. Understanding whether the code is correct is one thing; understanding whether it's the right code for this specific business situation is another. Edge cases that weren't captured in the spec. Performance implications that only emerge under real load patterns. These are human judgments.

Phase 5: Testing and quality assurance

What happens: Tests are generated as part of implementation (TDD), not as a separate phase. The AI writes tests based on the specification, the implementation plan, and the actual code produced. Integration tests validate that components work together as the architecture defined.

What this solves: The eternal problem of "we don't have time for tests" disappears, because tests aren't a separate activity that competes for time. They're a structural part of how code is produced. You can't get the code without the tests, because the process generates them together.

What AI testing misses: Exploratory testing. The scenarios a human QA engineer finds by using the software like a real user and thinking "what if I do this weird thing?" AI tests what the spec says should work. Humans find what the spec didn't think to mention.

Phase 6: Documentation and delivery

What happens: Technical documentation exists from Phase 3 and has been updated through implementation. User-facing documentation is generated from the specification and validated against the delivered product. The client receives not just working software, but a complete documentation package from day one: architecture, API references, deployment guides, and user documentation.

Why this is unusual: In most projects, documentation is the thing that gets written "after launch" and then never actually gets written. Or it gets written once and immediately becomes outdated. Our documentation is a living artifact that's maintained by the same system that maintains the code. When a feature changes, its documentation changes with it.

The numbers.

We're a boutique team, roughly 20 people. With AI-augmented delivery, we produce output that would traditionally require significantly larger teams.

Across recent projects, we see consistent patterns:

30–50% reduction in delivery time compared to our own pre-AI baselines on comparable scope.

Fewer people per project. What used to need 5–6 specialists now needs 2–3, with AI handling the work that was previously distributed across junior developers, dedicated QA, and technical writers.

Dramatically lower specification drift. The number of "that's not what we meant" moments in UAT has dropped significantly, because the client interacted with the spec continuously throughout development, not just at kickoff and acceptance.

Documentation completeness from day one. This used to be aspirational. Now it's structural.

We're not claiming perfection. AI introduces its own failure modes: hallucinated requirements, overconfident implementation, subtle bugs that pass tests but miss intent. That's why every stage has human review. But the baseline quality, the floor of what gets delivered, is substantially higher than it was two years ago.

What we got wrong along the way.

This process wasn't built in a week. Some things we learned the hard way:

AI-generated specs can be confidently wrong. Early on, we trusted AI-generated specifications too much. The agent would produce a beautifully structured spec that was internally consistent but based on a misinterpretation of a client's casual comment. Now every AI-generated spec goes through senior analyst review with explicit focus on "is this actually what they meant?"

More automation requires more discipline, not less. When AI does 70% of the work, the remaining 30% (the human review, the judgment calls, the strategic decisions) becomes more important, not less. We had to learn that "AI handles it" is never a complete sentence.

Clients need to be onboarded to the process. Some clients initially found it strange to interact with an AI agent for requirements clarification. We learned to introduce this gradually and always with a human available. The agent is a tool for precision, not a replacement for the relationship.

The bottom line.
We don't claim that AI makes us magical. We claim that AI makes us consistent. It enforces the engineering discipline that we always aspired to but couldn't maintain under the pressure of real projects with real deadlines. If that sounds like what your projects need (not faster cowboys, but a more reliable process), that's the conversation we want to have.