
Claude Code, Google Antigravity, and shipping production AI software in 2026.

May 2026 · 9 min read

I have shipped two halves of a production AI agent platform with two different agentic IDEs. The backend (about seventeen thousand lines of Python orchestrating a five-path LangGraph workflow, an MCP server, a Qdrant RAG layer, and a multi-provider LLM abstraction) I built from the first commit with Anthropic's Claude Code. The frontend, a React 19 widget designed to embed into INDATA's flagship Architect AI™ web application, I built first with Google Antigravity and have recently begun migrating to Claude Code. The same codebase, two different tools, in sequence.

I don't think a lot of people in 2026 have done this with real stakes attached, so I want to write down what I learned while it's still fresh. The audience I have in mind is the senior engineer or engineering leader trying to make a tool call for their team and tired of hot takes from people who used each for an afternoon.

What follows is one developer's experience on one project, working alone, in a specific industry (buy-side investment-management software). Treat it as a single data point. I'll mark places where my conclusions might not generalize.

A summary, for people who want to leave now

Both tools shipped real software. The backend was built from the first commit with Claude Code, run from terminal windows inside VS Code. The frontend I deliberately built with Antigravity as a test drive of Google's new agentic IDE — and it worked so well that the prototype became the production frontend. I only started touching that frontend with Claude Code in the past week, so this essay isn't the “I evaluated and chose” story it might look like. It's the “I shipped real work with both, the transition is happening right now, here's what I'm noticing” story.

That distinction matters. I am not in a position to tell you which tool is better — my data points on Claude Code in the frontend are still measured in days. What I can do is tell you what each tool did well during a substantial production build, and what made me start moving from one to the other.

What I built, briefly

The system is INDATA Nexus — an AI Platform for buy-side investment-management firms. Portfolio managers, traders, compliance officers and operations staff interact with their holdings, transactions and performance data in natural language. There's a five-path agent that classifies intent and routes accordingly; a RAG layer over schema; an MCP server that publishes read-only data tools for external clients; and an embeddable React widget that lives inside our flagship Architect AI host application. I built and maintain the entire thing solo. It launched publicly on May 12, 2026.
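
As a rough illustration of the classify-then-route shape described above, here is a minimal sketch in plain Python. It does not use LangGraph, and the five path labels, the `route` function, and the `fake_classifier` stand-in are all hypothetical names chosen for the example; the real classifier would be an LLM call.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]

# Hypothetical labels: the five real paths are not named in this essay.
PATHS = ["data_query", "chart", "schema_help", "general_chat", "clarify"]


def make_handler(label: str) -> Handler:
    """Build a placeholder handler that just tags the question with its path."""
    return lambda question: f"[{label}] {question}"


HANDLERS: Dict[str, Handler] = {label: make_handler(label) for label in PATHS}


def route(
    question: str,
    classify: Callable[[str], str],
    handlers: Dict[str, Handler],
    fallback: str,
) -> str:
    """Classify the question, then dispatch to the matching path.

    Labels the classifier invents are sent to the fallback path
    instead of raising, so a bad classification degrades gracefully.
    """
    label = classify(question)
    handler = handlers.get(label, handlers[fallback])
    return handler(question)


def fake_classifier(question: str) -> str:
    """Stand-in for the LLM classifier, for illustration only."""
    return "chart" if "chart" in question.lower() else "data_query"


print(route("Chart my holdings by sector", fake_classifier, HANDLERS, fallback="clarify"))
```

Keeping the classifier injectable means the routing table can be unit-tested without a model in the loop.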

Backend: Python 3.11, FastAPI, LangGraph, Anthropic + OpenAI SDKs, Qdrant, Redis, SQL Server. Frontend: React 19, TypeScript, Vite, Tailwind, AG Grid, Recharts. Roughly seventeen thousand lines of Python and six and a half thousand lines of TypeScript at last count, excluding tests, generated artifacts, and the deployment guides.

What Claude Code does well in production

The thing I underestimated until I had been using it daily for months is the value of a coding agent that lives in the terminal, not in an IDE chrome. Claude Code is a command-line tool. It reads and writes the same files my editor reads and writes, runs the same commands I would run, and treats my project as a long-running conversation rather than a series of independent prompts. The friction of context-switching out of my normal workflow is essentially zero.

The pattern that grew on me most is the memory-file system. In the Nexus repo I keep three files (coding_agent_memory/PROJECT_STATUS.md, DECISIONS_LOG.md, and ISSUES_RESOLVED.md) and a CLAUDE.md at the project root that tells Claude Code how to behave in the repo. So far that is twenty architectural decisions logged with dates and rationale, seven non-trivial bug fixes documented, and a project status that is updated continuously. When I open a new session weeks later, Claude reads those files first and arrives oriented. The cumulative effect is that my project has a memory longer than any individual conversation, and I almost never re-litigate settled decisions or re-solve solved problems.
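
To make the decisions-log idea concrete, here is a small sketch of how a dated, numbered entry could be appended to a log like DECISIONS_LOG.md. The entry layout (a `## D001:` heading followed by date and rationale lines) and the `Decision`, `format_entry`, and `append_decision` names are assumptions for illustration, not the format actually used in the Nexus repo; the example entry is a placeholder.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Decision:
    number: int
    title: str
    rationale: str
    decided_on: date


def format_entry(decision: Decision) -> str:
    """Render one log entry; the heading layout is an assumed convention."""
    return (
        f"## D{decision.number:03d}: {decision.title}\n"
        f"Date: {decision.decided_on.isoformat()}\n"
        f"Rationale: {decision.rationale}\n"
    )


def append_decision(log_text: str, decision: Decision) -> str:
    """Return the log with the entry appended, refusing duplicate IDs."""
    heading = f"## D{decision.number:03d}:"
    if heading in log_text:
        raise ValueError(f"D{decision.number:03d} is already logged")
    body = log_text.rstrip("\n")
    separator = "\n\n" if body else ""
    return body + separator + format_entry(decision)


# Placeholder entry, not a real decision from the project.
example = Decision(1, "Example decision", "Example rationale", date(2026, 1, 15))
print(append_decision("# Decisions\n", example))
```

Rejecting a duplicate ID is the part that matters: it keeps a settled decision from being silently overwritten by a later session.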

The second pattern that mattered is that Claude Code does architectural pushback in a useful way. Early in the Nexus build I tried to hard-code a function to extract stock tickers from chain query results with a regex over the result rows. Claude's response was effectively, “Production agent frameworks don't extract structured data with regex from LLM-adjacent code. Use the LLM with a tool schema for argument extraction; here's the pattern from the OpenAI function-calling docs.” It logged the decision in DECISIONS_LOG.md as D032. Six months later that decision still holds. I'd guess this kind of principled feedback only works when the agent has a CLAUDE.md that tells it what production standards to apply, and when you give it the right level of trust to push back. But when it works, it's like having a thoughtful senior engineer in the loop.
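
For readers who have not used the tool-schema pattern, a minimal sketch follows. The dictionary uses the tool-definition shape from OpenAI's function-calling format; the tool name `extract_tickers`, its single `tickers` parameter, and the `parse_tool_arguments` helper are hypothetical and are not the code that shipped in Nexus. The model call itself is omitted; the sketch covers only the schema and the validation of the JSON arguments string the model returns.

```python
import json
from typing import List

# Tool definition in the OpenAI function-calling shape.
# The name and parameter are illustrative assumptions.
EXTRACT_TICKERS_TOOL = {
    "type": "function",
    "function": {
        "name": "extract_tickers",
        "description": "Report the stock tickers present in the query results.",
        "parameters": {
            "type": "object",
            "properties": {
                "tickers": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Ticker symbols, one per element.",
                }
            },
            "required": ["tickers"],
            "additionalProperties": False,
        },
    },
}


def parse_tool_arguments(arguments_json: str) -> List[str]:
    """Validate the tool-call arguments and normalise the tickers.

    Tool-call arguments arrive as a JSON string, so they are parsed
    and type-checked before anything downstream trusts them.
    """
    args = json.loads(arguments_json)
    if not isinstance(args, dict):
        raise ValueError("tool arguments must be a JSON object")
    tickers = args.get("tickers")
    if not isinstance(tickers, list) or not all(isinstance(t, str) for t in tickers):
        raise ValueError("'tickers' must be a list of strings")
    normalised: List[str] = []
    for ticker in tickers:
        symbol = ticker.strip().upper()
        if symbol and symbol not in normalised:  # drop blanks and duplicates
            normalised.append(symbol)
    return normalised


print(parse_tool_arguments('{"tickers": ["aapl", " MSFT ", "AAPL"]}'))
```

The design point is that the model does the extraction against a declared schema, and the code only validates the result, instead of guessing at structure with a pattern match.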

The third thing is MCP. Claude Code natively understands the Model Context Protocol, which meant I could expose INDATA's internal tools to it during development the same way I'd expose them to a customer's AI agent later. Eating my own dogfood with no adapter layer accelerated the entire MCP server design — I was the first MCP client of the MCP server I was building.
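
To show what "publishing a read-only tool" looks like on the wire, here is a simplified sketch of the two MCP tool methods, `tools/list` and `tools/call`, as JSON-RPC 2.0 messages. It is a sketch under stated assumptions: the `get_holdings` tool, its `portfolio_id` parameter, and the placeholder row it returns are hypothetical, and a real server would normally use an MCP SDK and also handle the initialization handshake and transport, which are omitted here.

```python
import json
from typing import Any, Dict, List

# Hypothetical read-only tool definition; the name and fields are illustrative.
TOOLS: List[Dict[str, Any]] = [
    {
        "name": "get_holdings",
        "description": "Return holdings for a portfolio (read-only).",
        "inputSchema": {
            "type": "object",
            "properties": {"portfolio_id": {"type": "string"}},
            "required": ["portfolio_id"],
        },
        "annotations": {"readOnlyHint": True},
    }
]


def get_holdings(portfolio_id: str) -> List[Dict[str, Any]]:
    """Placeholder data source; a real tool would query the database."""
    return [{"portfolio_id": portfolio_id, "ticker": "EXAMPLE", "quantity": 0}]


def _text_result(req_id: Any, text: str, is_error: bool) -> Dict[str, Any]:
    """Wrap text in the tools/call result shape."""
    result = {"content": [{"type": "text", "text": text}], "isError": is_error}
    return {"jsonrpc": "2.0", "id": req_id, "result": result}


def handle_request(msg: Dict[str, Any]) -> Dict[str, Any]:
    """Answer tools/list and tools/call; everything else is 'method not found'."""
    req_id = msg.get("id")
    method = msg.get("method")
    if method == "tools/list":
        return {"jsonrpc": "2.0", "id": req_id, "result": {"tools": TOOLS}}
    if method == "tools/call":
        params = msg.get("params") or {}
        if params.get("name") != "get_holdings":
            error = {"code": -32602, "message": "Unknown tool"}
            return {"jsonrpc": "2.0", "id": req_id, "error": error}
        portfolio_id = (params.get("arguments") or {}).get("portfolio_id")
        if not isinstance(portfolio_id, str):
            return _text_result(req_id, "portfolio_id must be a string", True)
        return _text_result(req_id, json.dumps(get_holdings(portfolio_id)), False)
    error = {"code": -32601, "message": "Method not found"}
    return {"jsonrpc": "2.0", "id": req_id, "error": error}


print(handle_request({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}))
```

Because the development agent and a customer's agent would both speak these same two methods, no adapter layer is needed between them.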

How I ended up using Antigravity in the first place

I'd been building the backend exclusively in Claude Code for months. When it came time to start the frontend, I had a choice that wasn't obviously settled. I wanted to try several of the coding agents that had emerged in the previous year — Cursor, Copilot, Codex, OpenCode, and so on — and Google had just released Antigravity with significant fanfare. I've always had respect for Google's engineering, and I knew they couldn't be counted out in the broader competition over AI tooling. Trying Antigravity on a real piece of work seemed more honest than judging it from afar.

There was also a structural reason to be willing to experiment: I wasn't sure whether the frontend I was about to build would be the production UI or a throwaway prototype. The v1 product had been built by one of our developers working with me, and it was plausible that v2's UI would follow the same path. If Antigravity turned out to be a dead end, the cost was a couple of weeks and a learning experience. If it turned out well, I'd have something to show.

I told Antigravity I wanted a widget — something that could be plugged into our host web application or run as a standalone surface. Its Gemini-backed agent proposed React with Tailwind, set up the Vite build pipeline, and scaffolded the entire project structure in a session. Within a day I had a runnable app. The stack recommendation was the right one; it's what I'm still using six months later.

What Antigravity did well

The single most useful thing it did, and the one I will defend unreservedly, was its handling of design tokens through MCP. INDATA had recently engaged a professional design firm to redesign the host application; the deliverables came as Figma files. I installed Figma locally on my development machine, ran its MCP server in stdio mode, and connected Antigravity to it as an MCP client. The agent could then read the design firm's style guides, color tokens, typography, and component specifications directly — without my having to translate any of it by hand. The frontend that emerged inherited those design tokens cleanly, and the host application could pass them through at runtime via CSS variables. Two distinct MCP servers working together — Figma's and ours — bracketing the build from both ends.
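
The "pass design tokens through CSS variables" step can be sketched as a small transformation from a nested token tree to CSS custom properties. The token names and values below are made up for illustration (they are not the design firm's tokens), and `flatten_tokens` and `to_css` are hypothetical helpers; the production widget does this on the TypeScript side, so Python here only illustrates the mapping.

```python
from collections.abc import Mapping
from typing import Dict


def flatten_tokens(tokens: Mapping, prefix: str = "") -> Dict[str, str]:
    """Flatten nested tokens into CSS custom-property names.

    {"color": {"primary": "#123456"}} becomes {"--color-primary": "#123456"}.
    """
    flat: Dict[str, str] = {}
    for key, value in tokens.items():
        name = f"{prefix}-{key}" if prefix else str(key)
        if isinstance(value, Mapping):
            flat.update(flatten_tokens(value, name))
        else:
            flat[f"--{name}"] = str(value)
    return flat


def to_css(tokens: Mapping, selector: str = ":root") -> str:
    """Render the tokens as one CSS rule that a host page can override."""
    lines = [f"  {name}: {value};" for name, value in flatten_tokens(tokens).items()]
    return f"{selector} {{\n" + "\n".join(lines) + "\n}"


# Made-up example tokens.
example_tokens = {
    "color": {"primary": "#0a3d62", "surface": "#ffffff"},
    "font": {"body": "Inter"},
}
print(to_css(example_tokens))
```

Because the widget's styles read only the variables, the host application can restyle it at runtime by setting the same variables on a wrapping element.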

I also remember the moment the work crossed from prototype to production. I demoed it to our internal development team — the people who would have built it themselves if I hadn't — and asked whether it could serve as a reference design for them to work against. The response was something like, “this looks great, why not just use this?” That is the social validation moment that tells you a tool earned its place in the stack, not a feature comparison or a benchmark. It worked well enough that experienced engineers chose not to rebuild it.

One more thing worth saying clearly: Google's ecosystem integration is real. If your team is on Google Cloud, building against Gemini, working in BigQuery, or shipping into Workspace, Antigravity is doing things no other agentic IDE can match. The decision in this essay reflects my situation, not a universal ranking.

Why I started moving the frontend to Claude Code

Two things converged in the last few weeks that made me start touching the frontend with Claude Code instead of Antigravity. They weren't complaints exactly — more like a sense that the trade-off had shifted.

The first was a direction shift at Google. Antigravity's UI was recently restructured to feel less like a traditional IDE and more like an agentic programming front-end of its own — closer to a chat-driven assistant surface than the familiar IDE shell I started with. There's a defensible product thesis behind the change, but the new surface didn't fit how I work. I want my agent to live in the terminal next to the same editor I've used for years, not to become the editor.

The second was that Claude's models and Claude Code itself have gotten noticeably stronger over the same period. The thing that crossed the threshold for me was /ultraplan: a recent Claude Code feature that routes a planning task to a dedicated cloud session running Opus 4.6 in Anthropic's Cloud Container Runtime. The session reads the repository for up to half an hour and produces a structured plan I review in a browser before any code is written. For a frontend migration — where I want the model to deeply understand the existing component tree, state management, and styling conventions before it changes anything — that is exactly the right shape of feature. I've used it twice this week and have not yet missed Antigravity's equivalent.

There's a secondary appeal that's harder to quantify: continuity. The backend already had a memory-file system, a dated decisions log, an issues-resolved log, and a CLAUDE.md that encoded our engineering standards. Beginning to apply the same discipline to the frontend repo — under the same coding agent — has felt structurally satisfying. Whether that's a durable improvement or an early-honeymoon feeling, I don't know yet. I've only been at this for days.

What I'd say to someone choosing now

With the caveat that my data on Claude Code in the frontend is measured in days and not months, here are the questions I'd push someone to think about before deciding:

  • Where does your stack live? A Google-centric stack favors Antigravity by default. Heterogeneous or Anthropic-aligned stacks favor Claude Code. This is the single most predictive question and the easiest to answer.
  • How comfortable is your team with a terminal-resident agent? Claude Code lives in the terminal alongside your existing editor. Antigravity wants to be more of a primary surface. Neither is better in the abstract; one will fit your team's habits and the other will fight them.
  • How long-lived is the codebase? Claude Code's decisions-log + project-status discipline compounds over time. For projects measured in years, this matters a lot. For prototypes and one-shot work, less so.
  • Do you need MCP integration? Both tools have it, and my experience using both shows neither is a barrier. But the details differ enough — stdio vs. HTTP, configuration patterns, the ergonomics of connecting third-party MCP servers — that you should try the specific thing you need before committing.
  • Do you care about cloud-scale planning? Ultraplan is a real differentiator for big, multi-file work. If your team does a lot of refactors or migrations across hundreds of files, that single feature can change the math. If your day-to-day is individual file edits, it won't move you.

The strongest recommendation I have is the unsexy one: try both, on real work, for a week each. Anyone who tells you definitively which is better, including me, is extrapolating from a sample size you shouldn't trust. The structural differences (workflow philosophy, ecosystem alignment, terminal-resident vs. primary-surface posture) are durable. The feature deltas will close in both directions within months.

Caveats and limitations

I'm one developer on one project. I'm the sole contributor to a private codebase, which is exactly the situation where agentic tools shine the brightest; multi-developer teams have collaboration patterns I haven't had to think about. I've been in this industry for thirty years and have strong opinions about how production software should be structured — readers without that background may experience either tool differently.

Both products are also moving fast. Anything specific I've said about either may be obsolete within months. The structural differences — workflow philosophy, ecosystem fit, agent-operation model — are more durable than the feature deltas.

Finally: I am writing this on a site that itself is built with Claude Code. The site includes an agent that knows about my background, using the same patterns I ship at INDATA. The source is on GitHub. If anything in this essay sounds biased, that's the most useful place to look — at the actual code I've written with each tool, in production, with my name on it.


Comments, corrections, or counter-experience welcome — jarlnelson@outlook.com.