haydenfowler/harness-engineering-example

Harness Engineering Example

Harness engineering is the discipline of designing infrastructure, constraints, and feedback loops around AI agents to make them reliable in production. This repo is a reference implementation: a minimal Next.js app paired with a full harness that runs two autonomous agents — an Engineer and a QA reviewer — that ship features end-to-end with no human in the loop.

Live app →

How it works

You (local)              GitHub                     AWS
───────────────          ──────────────────────     ────────────────
/plan → spec     →   GitHub Issue (agent-ready) →   Amplify (deploy)
                         ↓
                   engineer-agent.yml
                   (claude-code-action)
                         ↓
                   PR (review-requested)
                         ↓
                   qa-agent.yml
                   (claude-code-action)
                         ↓
                   Approve → Merge → main
                         ↓
                   deploy.yml → Amplify

Engineer agent: triggered by an agent-ready label on a GitHub Issue. It creates a branch, implements the feature using TDD, opens a PR, and responds to CI failures and QA feedback until the PR is approved.

QA agent: triggered by a review-requested label on the PR. It reads the original acceptance criteria, tests the feature in a real browser via Chrome DevTools MCP, and either approves and merges the PR or requests changes with specific evidence.
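The label-driven hand-off between the two agents can be pictured as a small routing function. The sketch below is illustrative only and is not taken from the repo's workflows: the label names `agent-ready` and `review-requested` come from the description above, while `LabelEvent`, `agentFor`, and `nextStep` are hypothetical names introduced for this example.

```typescript
// Illustrative sketch, not the repo's actual workflow logic.
// Only the two label names are taken from the README; all other names are hypothetical.
type Agent = "engineer" | "qa";

interface LabelEvent {
  target: "issue" | "pull_request"; // what the label was applied to
  label: string;
}

// Decide which agent (if any) a labelled event should wake up.
function agentFor(event: LabelEvent): Agent | null {
  if (event.target === "issue" && event.label === "agent-ready") {
    return "engineer";
  }
  if (event.target === "pull_request" && event.label === "review-requested") {
    return "qa";
  }
  return null; // any other label is ignored
}

type QaVerdict = "approve" | "request_changes";

// After QA: an approval leads to a merge, otherwise the Engineer agent
// picks the PR up again and the loop repeats.
function nextStep(verdict: QaVerdict): "merge" | Agent {
  return verdict === "approve" ? "merge" : "engineer";
}
```

In the real harness this routing is presumably expressed through the workflow triggers in `.github/workflows/` rather than in application code; the function form just makes the loop easy to reason about.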

Key design decisions

  • In-repo memory. GitHub Actions runners are ephemeral — ~/.claude/ doesn't persist between runs. Instead, ENGINEER_MEMORY.md and QA_MEMORY.md live in the repo and are committed after each task, giving agents accumulated institutional knowledge across jobs.

  • Role separation with hard constraints. The QA agent has no write access to the PR branch and cannot approve unless every acceptance criterion is met and all CI checks pass. This prevents the common failure mode of agents rubber-stamping each other's work.

  • Guard workflow. guard.yml auto-closes issues and PRs from unauthorised actors. Combined with branch protection (CI required, PR review required, no direct pushes to main), the harness controls what the agents can and can't do.

  • Swappable app. The Next.js app in app/ is the agents' target, not the point. You can replace it with any codebase — the harness is what this repo is demonstrating.
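Two of the constraints above, the QA approval gate and the guard on unauthorised actors, amount to simple predicates. The following is a minimal sketch under stated assumptions, not the repo's implementation: the data shapes, the actor names in `AUTHORISED_ACTORS`, and the functions `canApprove` and `guardAction` are all hypothetical. Treating an empty list of acceptance criteria as "not approvable" is also an assumption made here, so that a missing spec cannot lead to approval by default.

```typescript
// Illustrative sketch of two harness constraints; all names here are hypothetical.
interface Criterion {
  description: string;
  met: boolean; // whether QA verified this acceptance criterion
}

interface CiCheck {
  name: string;
  passed: boolean;
}

interface PullRequestState {
  criteria: Criterion[];
  ciChecks: CiCheck[];
}

// QA may approve only if every acceptance criterion is met and every CI check passes.
// Assumption: a PR with no recorded criteria is never approvable.
function canApprove(pr: PullRequestState): boolean {
  return (
    pr.criteria.length > 0 &&
    pr.criteria.every((c) => c.met) &&
    pr.ciChecks.every((c) => c.passed)
  );
}

// Hypothetical allow-list; the real guard.yml may identify actors differently.
const AUTHORISED_ACTORS = new Set<string>(["repo-owner", "agent-bot"]);

// Issues and PRs from anyone outside the allow-list are closed automatically.
function guardAction(actor: string): "allow" | "close" {
  return AUTHORISED_ACTORS.has(actor) ? "allow" : "close";
}
```

Keeping these checks as hard gates, rather than as instructions in a prompt, means an agent cannot talk its way past them.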

Repo layout

Path                  What it is
app/                  Next.js sample application
.github/workflows/    Agent trigger workflows and CI/CD
ENGINEER.md           Role instructions for the Engineer agent
QA.md                 Role instructions for the QA agent
ENGINEER_MEMORY.md    Accumulated Engineer agent learnings
QA_MEMORY.md          Accumulated QA agent learnings
docs/architecture.md  Full system design
docs/specs/           Feature specs from planning sessions

Running the app locally

cd app
npm install
npm run dev   # http://localhost:3000
