Block's Builderbot: What AI-Native Platform Teams Look Like

Most engineering leaders are still thinking about AI as a productivity layer sitting on top of existing workflows. Copilot for this engineer, Claude for that one. Individual speed gains that aggregate into marginal team improvements. Block just published a blueprint that makes that approach look like bringing a knife to a gunfight. Builderbot, Block's internal AI orchestration system, is now executing over 200,000 operations per day across the company's codebase and generating approximately 1,500 merged pull requests per week. That's roughly 15% of all production code changes at Block flowing through an automated agent layer. Not experiments. Not proofs of concept. Merged, production code. This is what an AI-native platform team looks like at scale, and every engineering leader building a roadmap for the next 18 months needs to study it carefully.

The Architecture: One Agent Layer, Not a Thousand Copilots

The key design decision Block made wasn't which LLM to use. It was where to put the intelligence. Rather than distributing AI tools across individual engineers' laptops, Block built a centralized AI orchestration layer wired directly into its monorepo, CI/CD pipeline, and internal communications. Engineers interact with Builderbot by tagging it in Slack with a description of the required work. The system then autonomously researches, plans, writes code, opens PRs, monitors CI, and iterates on feedback.

The infrastructure underneath Builderbot is equally deliberate. It's built on goose, Block's open-source agent framework, which Block contributed to the Agentic AI Foundation under the Linux Foundation. It runs on the Model Context Protocol (MCP), co-developed by Block and Anthropic, which standardizes how AI agents connect to tools and data sources. This isn't a vendor integration bolted onto an existing workflow. It's purpose-built infrastructure for a world where agents are first-class actors in the software lifecycle.
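To make the tag-to-PR loop concrete, here is a minimal Python sketch of a plan, implement, check-CI, retry cycle. It is an illustration under stated assumptions, not Builderbot's implementation: it does not use goose or MCP, and every name in it (Task, run_task, the fake_* stubs, the status strings) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Task:
    """A unit of work as it might arrive from a chat message (hypothetical shape)."""
    description: str
    plan: List[str] = field(default_factory=list)
    attempts: int = 0
    status: str = "queued"  # becomes "ready_for_review" or "needs_human"


def run_task(
    task: Task,
    plan_fn: Callable[[str], List[str]],
    implement_fn: Callable[[List[str], str], str],
    ci_fn: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> Task:
    """Plan once, then implement and re-check CI until it passes or attempts run out."""
    task.plan = plan_fn(task.description)
    feedback = ""
    while task.attempts < max_attempts:
        task.attempts += 1
        patch = implement_fn(task.plan, feedback)
        passed, feedback = ci_fn(patch)
        if passed:
            task.status = "ready_for_review"
            return task
    # The agent could not get CI green on its own, so hand off to a person.
    task.status = "needs_human"
    return task


# Stub collaborators standing in for a model call and a CI system (assumptions).
def fake_plan(description: str) -> List[str]:
    return [f"edit code for: {description}"]


def fake_implement(plan: List[str], feedback: str) -> str:
    # Pretend the second attempt incorporates the CI feedback.
    return "patch-v2" if feedback else "patch-v1"


def fake_ci(patch: str) -> Tuple[bool, str]:
    if patch == "patch-v2":
        return True, ""
    return False, "unit test failed"


result = run_task(Task("rename config flag"), fake_plan, fake_implement, fake_ci)
print(result.status, result.attempts)  # ready_for_review 2
```

The bounded retry count and the explicit "needs_human" status are design choices of this sketch: they keep the loop from iterating forever and make the handoff to an engineer a visible state rather than a silent failure.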

We built an internal AI system called Builderbot. It coordinates agents across our entire codebase. Engineers tag it in Slack, and it researches, plans, and implements changes across dozens of services. Instead of scaling headcount linearly with product surface area, we're scaling an AI-native platform that lets very small teams own and operate massive systems.
— Owen Jennings, Head of Engineering at Square (part of Block)

That last sentence deserves to be read twice. Scaling an AI-native platform instead of headcount. That's the structural shift.

The Numbers Behind the Reorganization

Block's results are striking enough to force any skeptic to take this seriously. Over 90% of code submissions at Block are now partially or fully AI-authored. Median weekly code changes per engineer have increased by roughly 30%. And Block has 7,500 employees using AI tools weekly, with AI systems handling about 65% of Cash App support cases.

This didn't happen without structural changes. Block reduced its workforce from over 10,000 to just under 6,000 employees as part of what it describes as an "AI overhaul," eliminating more than 4,000 roles. That's a significant and honest data point that deserves context rather than euphemism.

Here's the context engineering leaders should hold onto: Block didn't cut engineers and hope productivity held. They built the AI platform first, measured what it could handle, and restructured around demonstrated capability. The throughput metrics came before the headcount decisions. That sequencing matters enormously. Organizations that cut engineers expecting AI to fill the gap without the infrastructure investment will not replicate Block's results. They'll just have fewer engineers.

The other critical context: Block's ambitions have not shrunk. The same agent substrate that powers Builderbot also powers Square AI, Moneybot, and customer-facing support automation. Fewer people, handling more product surface area across more domains. That's the model.

The Hidden Multiplier: One Platform, Two Revenue Streams

Most coverage of Builderbot frames it as a clever internal engineering tool. That misses the more important insight for CTOs. Block is running a horizontal agent platform. The investment in goose, MCP, and orchestration infrastructure pays dividends in two directions simultaneously. Internally, it accelerates engineering throughput. Externally, the same substrate powers customer-facing AI products that generate revenue. One platform investment, two compounding returns. Block also built an internal text-to-persistent-app system called G2, which lets non-technical employees compose autonomous workflows from configurable tiles that run continuously and asynchronously. This is the capability that reduces engineering dependency for operational automation across support, finance, and ops teams. Engineers stop being the bottleneck for every internal automation request. Their time concentrates on higher-leverage work.

So, we have a tool called Builder Bot. Builder Bot is just autonomously merging PRs and actually like building features to 100%. We've had some fairly complex features that are built to 100%. More often than not, it's building them to 85 or 90%, and then a human who has a lot of context and understands does the final 10%.
— Owen Jennings, Head of Engineering at Square (part of Block)

The 85-90% figure is where the real organizational design question lives. Who owns that last 10-15%? And how do you structure a team around finishing what AI starts, rather than starting everything from scratch?

Safety and Guardrails: The Non-Negotiable Layer

Block is explicit about what Builderbot cannot touch. The system operates only on source code and system configuration, with no access to customer data, payment information, or personally identifiable information. That data-segregation decision isn't an afterthought. It's the design constraint that makes autonomous operation at this scale defensible to a board, to regulators, and to engineering teams that need to trust the system. This matters because the failure mode most engineering leaders fear isn't "Builderbot writes bad code." It's "Builderbot writes bad code in a payments system while touching customer data." Clear scope constraints are what prevent a centralized AI orchestration layer from becoming a centralized risk amplifier. The practical architecture for any team implementing something similar: define your agent's blast radius before you define its capabilities. What can it read? What can it write? What requires a human approval gate? Block's answers to those questions are publicly documented. Yours should be too, before you flip this on across your monorepo.
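The three blast-radius questions above (what can it read, what can it write, what needs approval) can be written down as a small, testable policy. The Python sketch below is one possible shape, assuming path-based scoping with glob patterns. The directory names, the AgentPolicy class, and the decide function are hypothetical and do not describe Block's actual configuration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Tuple


@dataclass(frozen=True)
class AgentPolicy:
    """Glob patterns describing an agent's scope (all paths here are made up)."""
    readable: Tuple[str, ...]
    writable: Tuple[str, ...]
    needs_approval: Tuple[str, ...]
    forbidden: Tuple[str, ...]


def _matches(path: str, patterns: Tuple[str, ...]) -> bool:
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def decide(policy: AgentPolicy, action: str, path: str) -> str:
    """Return 'allow', 'approve_required', or 'deny' for a read or write on a path."""
    # Forbidden paths win over everything else, so a broad allow rule cannot leak data.
    if _matches(path, policy.forbidden):
        return "deny"
    if action == "read":
        return "allow" if _matches(path, policy.readable) else "deny"
    if action == "write":
        if not _matches(path, policy.writable):
            return "deny"
        if _matches(path, policy.needs_approval):
            return "approve_required"
        return "allow"
    # Unknown actions are denied by default.
    return "deny"


policy = AgentPolicy(
    readable=("services/*", "config/*"),
    writable=("services/*", "config/*"),
    needs_approval=("services/payments/*",),
    forbidden=("data/customers/*", "secrets/*"),
)

print(decide(policy, "write", "services/search/index.py"))    # allow
print(decide(policy, "write", "services/payments/ledger.py"))  # approve_required
print(decide(policy, "read", "data/customers/export.csv"))     # deny
```

Deny-by-default with an explicit forbidden list is the conservative choice in this sketch: anything the policy does not mention is out of scope until someone deliberately adds it.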

The New Team Shape

Concrete team structures matter more than abstract principles. Here's how Block's model translates into an org design framework:

Team Type	Pre-AI Model	AI-Native Model
Product squads	8-12 engineers per surface area	2-5 engineers + shared AI agents
Platform / DevEx	Supporting role, limited headcount	Core investment, owns agent infra
AI workflows	Doesn't exist	Dedicated: agent reliability, evaluation
Non-eng automation	Dependent on engineering	Self-serve via tools like G2

Product squads shrink. The platform team grows and becomes central to the entire organization's operating model, not just engineering. This mirrors how great DevOps and platform engineering teams work today, except the surface area they own now includes the AI agents generating production code. The roles that are emerging at companies like Block aren't replacing engineers. They require engineers with a different profile:

AI Workflow Engineer: designs and maintains the orchestration logic connecting agents to codebases, CI systems, and communication tools.

Agent Reliability Engineer: builds eval harnesses, monitors agent output quality, and owns SLAs for automated code changes.

AI Change-Risk Reviewer: specializes in reviewing machine-authored PRs at scale, with tooling and checklists tailored to AI failure modes rather than human failure modes.

AI Platform Architect: owns the goose/MCP-equivalent layer and ensures it serves both internal engineering and external product surfaces.

These roles don't appear on any hiring platform built for the pre-AI world. They require a fundamentally different intake process, different signal collection, and different evaluation criteria than what legacy platforms were designed to surface.

Your Implementation Roadmap

If Block's model is the destination, here's a pragmatic path that doesn't require betting the company on day one:

Start with constrained domains. Migrations, boilerplate generation, well-typed API plumbing, dependency updates. These are high-volume, low-ambiguity tasks where agent failure modes are visible and recoverable. Builderbot's most autonomous wins are almost certainly in structured, well-understood work.

Build observability before scale. You need to know what your agent is doing before it's doing 1,500 PRs a week. Instrument every step: how many changes proposed, how many merged unmodified, how many required human revision, and where revisions concentrated. That data shapes your blast radius policy.
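The four measurements named above can be computed from a simple record per agent-authored PR. The following Python sketch assumes a hypothetical AgentPR record with three fields. A real system would pull these from its code host and review tooling, which are not modeled here.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class AgentPR:
    """One agent-authored pull request (hypothetical record shape)."""
    merged: bool
    human_revised: bool
    revised_area: Optional[str] = None  # e.g. "tests" or "api", if a human changed it


def summarize(prs: Iterable[AgentPR]) -> Dict[str, object]:
    """Count proposed, merged-unmodified, and merged-after-revision PRs, plus hotspots."""
    prs = list(prs)
    merged = [pr for pr in prs if pr.merged]
    hotspots = Counter(
        pr.revised_area for pr in prs if pr.human_revised and pr.revised_area
    )
    return {
        "proposed": len(prs),
        "merged": len(merged),
        "merged_unmodified": sum(1 for pr in merged if not pr.human_revised),
        "merged_after_revision": sum(1 for pr in merged if pr.human_revised),
        "revision_hotspots": hotspots.most_common(3),
    }


sample = [
    AgentPR(merged=True, human_revised=False),
    AgentPR(merged=True, human_revised=True, revised_area="tests"),
    AgentPR(merged=True, human_revised=True, revised_area="tests"),
    AgentPR(merged=False, human_revised=True, revised_area="api"),
    AgentPR(merged=False, human_revised=False),
]
report = summarize(sample)
print(report)
```

In this made-up sample, revisions cluster in "tests", which is the kind of signal that would argue for tightening the agent's scope or adding an approval gate in that area.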

Set explicit throughput targets. Block reports roughly 15% of production changes flowing through automation. What's your year-one target? 5%? Set the number, measure against it, and adjust your platform investment accordingly. Vague aspirations don't move org structures.
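Tracking a throughput target is simple arithmetic, sketched here in Python. The article's round numbers (about 1,500 agent-merged PRs per week at roughly 15%) imply on the order of 10,000 merged changes per week in total. That total is an inference from those two figures, not a number Block has published. The second example uses an entirely hypothetical team, and the function names are illustrative.

```python
def automation_share(agent_merged: int, total_merged: int) -> float:
    """Fraction of merged changes that were authored by the agent layer."""
    if total_merged <= 0:
        raise ValueError("total_merged must be positive")
    return agent_merged / total_merged


def changes_needed(agent_merged: int, total_merged: int, target_percent: int) -> int:
    """Extra agent-merged changes needed to reach the target, holding the total fixed."""
    # Integer ceiling division avoids floating-point rounding surprises.
    required = -(-(target_percent * total_merged) // 100)
    return max(0, required - agent_merged)


# Figures implied by the article's round numbers (the 10,000 total is inferred).
print(automation_share(1500, 10000))  # 0.15

# A hypothetical team: 40 of 2,000 weekly merges are agent-authored, target is 5%.
print(automation_share(40, 2000))     # 0.02
print(changes_needed(40, 2000, 5))    # 60
```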

Hire for the platform role before you hire for product growth. The leverage comes from the shared infrastructure. One strong AI platform team serves ten product squads. Ten product squads each hiring one AI enthusiast gives you fragmented experiments and no compounding returns.

Define your data segregation policy on day one. What can your agents read? What can they write? What requires human sign-off? Document it. Enforce it in the tooling. Block's explicit "no customer data" constraint is part of why they can run Builderbot autonomously at this scale without a compliance crisis.

The Org That Fights on More Fronts

Here's the frame that engineering leaders should carry into their next board conversation: individual teams are shrinking, but total engineering ambition is expanding. Block isn't running fewer products. It's running the same products with smaller teams while building the AI infrastructure to take on more. The analogy holds: elite Navy SEAL units are small, but you deploy more of them across more theaters because each unit's output is dramatically higher. The overall military doesn't shrink. It fights on more fronts simultaneously.

The engineering leaders who will win the next decade aren't the ones who figure out how to do more with fewer engineers. They're the ones who figure out how to do vastly more with the same number of engineers, and then decide which new fronts to open.

Builderbot is generating roughly 15% of Block's production code changes today. The engineers who built it are spending their time on the last 10-15% of each feature, the part that requires judgment, architecture, and context that agents can't replicate. That's the job description for every great engineer in 2026. And finding engineers who can operate that way, at that level, in that kind of AI-native environment, is the hardest hiring problem in tech right now. The platforms built to solve that problem for the pre-AI world are not going to solve it for you.

Nextdev