AI Beta Tester

Personality-driven agents that break your app the way real users do.

Claude · Playwright · FastAPI

Beta — working, Docker-only

The Problem

You built it, so you know how it's supposed to work. That's exactly what makes you blind to the ways it fails. Real users don't follow the happy path: they tab through forms in the wrong order, paste URLs into search boxes, click things twice, and abandon flows the moment anything feels off. Standard automated tests verify that the code works. They don't verify that the experience makes sense to someone who isn't you.

The Build

A set of distinct agent personalities — Speedrunner, Chaos Gremlin, Methodical Newcomer, Technical Exploiter, Privacy Paranoid, and more — each with a behavioral profile that shapes how they interact with a target URL via Playwright MCP. Agents run against your app, surface findings by category (UX friction, edge cases, broken flows), and produce structured Markdown reports with reproduction steps. A Next.js dashboard shows live session progress via SSE, session history, and a report browser. The backend runs as a FastAPI service; agents use Claude to reason through what they're seeing and decide what to try next.
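To make the idea of a behavioral profile and a structured report concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the project's actual code: the `Personality` and `Finding` classes, their field names, and the `render_report` function are all hypothetical, and a real agent would collect its findings through Claude and Playwright MCP rather than from hand-written data.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Personality:
    """Hypothetical behavioral profile; field names are assumptions."""
    name: str
    reads_instructions: bool
    patience_seconds: int
    favorite_inputs: tuple[str, ...] = ()


@dataclass
class Finding:
    """One issue an agent surfaced, with steps to reproduce it."""
    category: str  # e.g. "UX friction", "edge case", "broken flow"
    title: str
    steps: list[str]


def render_report(personality: Personality, url: str, findings: list[Finding]) -> str:
    """Render findings as Markdown, grouped by category."""
    lines = [f"# {personality.name} report for {url}", ""]
    for category in sorted({f.category for f in findings}):
        lines += [f"## {category}", ""]
        for finding in (f for f in findings if f.category == category):
            lines += [f"### {finding.title}", ""]
            # Numbered reproduction steps, starting at 1.
            lines += [f"{i}. {step}" for i, step in enumerate(finding.steps, 1)]
            lines.append("")
    return "\n".join(lines)
```

Calling `render_report` with a `Personality`, a target URL, and a list of `Finding` objects returns a single Markdown string. Grouping by category keeps UX friction, edge cases, and broken flows in separate sections of the report.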

What Makes It Different

The value isn't automation; it's the behavioral diversity. A Speedrunner skips instructions and rage-clicks. A Methodical Newcomer reads everything and still gets lost. A Chaos Gremlin submits empty forms and pastes emojis into number fields. Each personality catches a different class of bug. Running all of them against the same URL in a single session surfaces a far wider range of failure modes than any one agent would, before a real user finds them.
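One way to picture a multi-personality session is the sketch below, which runs every personality against one URL and merges the findings by category while recording who found what. Everything here is an assumption for illustration: `run_session`, `stub_agent`, and the `(category, description)` finding shape are hypothetical, the findings in the stub are made up, and running agents concurrently is a design choice of this sketch rather than something the project states.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable

# Hypothetical shape: one finding is a (category, description) pair.
Finding = tuple[str, str]
AgentRunner = Callable[[str, str], Awaitable[list[Finding]]]


async def run_session(
    url: str, personalities: list[str], run_agent: AgentRunner
) -> dict[str, list[tuple[str, str]]]:
    """Run every personality against one URL and group findings by category."""
    # The agents are independent, so this sketch runs them concurrently.
    results = await asyncio.gather(*(run_agent(name, url) for name in personalities))

    by_category: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for name, findings in zip(personalities, results):
        for category, description in findings:
            # Keep the personality name so the report shows who found what.
            by_category[category].append((name, description))
    return dict(by_category)


async def stub_agent(personality: str, url: str) -> list[Finding]:
    """Placeholder for a real agent run; returns made-up findings."""
    await asyncio.sleep(0)  # stands in for browser and model latency
    canned = {
        "Speedrunner": [("UX friction", "Double submit creates two records")],
        "Chaos Gremlin": [("edge case", "Emoji in a number field is accepted")],
    }
    return canned.get(personality, [])
```

A session would be started with something like `asyncio.run(run_session("http://localhost:3000", ["Speedrunner", "Chaos Gremlin"], stub_agent))`. Passing the runner in as a parameter keeps the grouping logic testable without a browser or a model call.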