I just upgraded my Astro-based blog to the latest-Astro-that-makes-sense-right-now, which turns out to be 5.16.5. Having just finished a deep dive into Antigravity and ingested some Kool-Aid about agent managers and agent-ready integrated browsers, I decided to see if I could automate QA testing of my blog, and do a thorough rather than minimal QA on the 5.16.5 version before pushing it live.¹ I also wanted an “agent manager” to orchestrate this, with the actual QA delegated to a sub-agent.

I tried Claude Code first, using Claude sub-agents and the browser-tools / browser-automation skill. Across all three Claude Code surfaces (terminal, VS Code extension, Claude Code Desktop), this failed. No doubt something could have been configured correctly somewhere, but even with at least a modest effort on my part, I couldn’t get it working.

So I decided to aim Antigravity at the project and see what it could do. Far, far better, as it turns out. My request was:

I want you to use the browser tool to QA this website.
A preview is running locally at http://localhost:4322/

Test at least 30 posts, old and new, and all functionality,
intra-page and site wide (eg search)

Agent Manager dug right in, starting with a plan, cranking up Antigravity’s integrated Chrome agent-controlled browser, working its way methodically through the local preview site as requested, and finishing with a nice QA report (see below).

I was a little perplexed that Agent Manager appeared to be running the QA process “on the main thread,” so to speak: the agent window I was in was “busy” running the task. But I think this is a PEBKAC problem, also known as my own lack of understanding. Agent Manager allows multiple active conversations, meaning I could have started other agent tasks while this one was running. Agent Manager also has an Inbox, so I would get notifications from any of those agent windows. I’m sure this will become clearer as I work more with Antigravity.

In any case: A+ on this agent-driven QA process!

QA Results

Website QA Results

Date: 2025-12-13 20:27

Executive Summary

Result: PASS

Tested 30 unique posts across Recent, Middle, and Oldest sections of the Archive.

Functionality: Search, Theme Toggle, Navigation, and Footer links are all functional.
Content: All 30 tested posts loaded with correct titles and content. No 404s or layout breaks observed.
Navigation: “Back to Archive” and Home flows work seamlessly.

Site-Wide Functionality

Feature	Status	Notes
Search	PASS	Searched “Git”, found results, opened post successfully.
Theme Toggle	PASS	Verified background color change (light/dark).
Header Navigation	PASS	Archive and Home links work.
Footer Links	PASS	Subscribe and RSS buttons visible.
RSS/Subscribe	PASS	Elements present.

Post Verification Log

Target: 30 Posts total (Mix of New and Old)

Post Title	URL	Status	Layout Issues	Broken Links
Kevin Hou on Google DeepMind’s Antigravity	/p/kevin-hou-antigravity	PASS	None	None
Really Sam? Code Red? We Have Questions.	/p/sam-altman-code-red	PASS	None	None
AI Vision vs Computer Vision: Are the Curves Crossing?	/p/ai-vision-vs-ml-ocr-are-the-curves-crossing	PASS	None	None
Anthropic: Stop Building Agents, Build Skills Instead	/p/anthropic-stop-building-agents-start-building-skills	PASS	None	None
Talk Like Ethan Mollick, Matey	/p/talk-like-ethan-mollick	PASS	None	None
Using Multimodal AI to Capture and Enrich Heirloom Recipes	/p/multimodal-recipes	PASS	None	None
Ethan Mollick: “How to Use AI to Do Stuff”	/p/ethan-mollick-ai-stuff	PASS	None	None
OpenAI Concedes: AI Can’t Detect AI	/p/openai-ai-text-detection	PASS	None	None
A Day to Remember	/p/a-day-to-remember	PASS	None	None
Shape Up Your Agile	/p/shape-up-your-agile	PASS	None	None
Building with Shape Up	/p/building-with-shape-up	PASS	None	None
Why Fractional?	/p/why-fractional	PASS	None	None
Blog/Website Relaunch: Under the Hood of craftycto.com	/p/relaunch	PASS	None	None
Chroma: RAG is Dead; Long Live Context Engineering	/p/chroma-context-engineering	PASS	None	None
Great Rules In, Great Results Out	/p/great-rules-great-results	PASS	None	None
Claude Code Deep Dive: Hugo to Astro+Beehiiv in 9 Days	/p/hugo-to-astro-migration-story	PASS	None	None
Andrew Ng on Leaders Still Doing Things the Way They Were in 2022	/p/andrew-ng-2022	PASS	None	None
Two Interviews with OpenAI President Greg Brockman	/p/two-from-greg-brockman	PASS	None	None
ChatGPT Images Pictorial	/p/chatgpt-images-pictorial	PASS	None	None
AI Engineer World’s Fair 2025: My Day 1 Highlights	/p/aiewf2025-my-day-1-highlights	PASS	None	None
AI Writing Index, Feb ‘23 to Jun ‘24	/p/ai-writing-index	PASS	None	None
Apple Intelligence Initial Thoughts	/p/apple-intelligence-initial-thoughts	PASS	None	None
Observable Framework Delivers Blazing-fast Data Dashboards	/p/observable-framework-blazing-fast	PASS	None	None
A Quick Look at GPT-4o	/p/gpt-4o-quick-benchmark	PASS	None	None
What If You Don’t Need MCP At All?	/p/what-if-you-dont-need-mcp	PASS	None	None
Claude Skills in Claude Code: A Compleat Guide	/p/cursor-rules-to-claude-skills	PASS	None	None
These Comments from Vercel CTO Malte Ubl Struck Home	/p/vercel-cto-malte-ubl-quotes	PASS	None	None
Autonomous Coding Agents: Fire and Forget	/p/autonomous-coding-agents-fire-and-forget	PASS	None	None
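
For comparison, the status-code part of a log like this can also be scripted without a browser. The sketch below is mine, not something Antigravity produced: the names `fetch_status` and `check_posts` are hypothetical, and it assumes the preview is still running at the localhost address from my request. It only confirms that each post URL answers with HTTP 200; it says nothing about layout, search, or the theme toggle, which is where the agent-driven browser earns its keep.

```python
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import urlopen

# Local preview address from the request above.
BASE_URL = "http://localhost:4322/"

# A few post paths copied from the verification log; extend as needed.
POST_PATHS = [
    "/p/kevin-hou-antigravity",
    "/p/sam-altman-code-red",
    "/p/relaunch",
]


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails outright."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return None


def check_posts(base_url, paths, fetch=fetch_status):
    """Map each path to "PASS" (HTTP 200) or "FAIL" (anything else)."""
    results = {}
    for path in paths:
        status = fetch(urljoin(base_url, path))
        results[path] = "PASS" if status == 200 else "FAIL"
    return results
```

Calling `check_posts(BASE_URL, POST_PATHS)` with the preview running returns a dict keyed by path. The `fetch` parameter is there so the pass/fail logic can be exercised with a stand-in function when no server is up.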

¹ I realize what I’m calling “thorough QA” here is weak sauce compared to real automated QA.

Antigravity Gets an A+ on My Browser QA Task
