I Tried Clawpatch on Two Real Repositories

Today I tried Clawpatch on two real repositories.

Not on a demo app. Not on a clean little example project. On actual code. The kind of code that has history, Makefiles nobody wants to touch, old config decisions, and a few bugs quietly waiting for the right moment to ruin your afternoon.

That is the only test that matters for developer tools.

A tool can look brilliant on its own README. The real question is: does it survive contact with a normal repository?

What Clawpatch does

Clawpatch is an automated code review CLI.

The interesting part is not “AI reviews code”. That sentence is almost meaningless now. Everyone claims that.

The interesting part is that Clawpatch first maps the repository into semantic units: commands, packages, services, config files, tests, and related context. Then it reviews those units and stores findings locally under .clawpatch/.

The workflow is simple:

clawpatch init
clawpatch map
clawpatch review --limit 3
clawpatch report

There is also a fix loop:

clawpatch fix --finding <id>
clawpatch revalidate --finding <id>

I like this shape.

No magic PRs. No silent rewrites. No “trust me bro, I fixed your production system”.

It creates findings. It gives evidence. If you ask it to fix something, it works on one finding at a time. That is the right safety model for this category of tool.

First repo: Sharepass

I started with Sharepass, a small Rails project.

This was not impressive.

Clawpatch mapped exactly one feature:

Project config Makefile

That is it.

No controllers. No models. No routes. No tests. No Rails-shaped understanding of the application.

So yes, Rails support is clearly not there yet, at least not for this repository. This is early-tool territory.

But even this limited run found two useful things.

First, a deploy bug:

# defined
tag=$(shell git rev-parse --short @)

# used later
deploy-staging:
	nomad run -var='version=${TAG}' deployment/staging.nomad.hcl

The Makefile defines lowercase tag, but the staging deploy expands uppercase TAG. Make variables are case-sensitive, so unless someone exports TAG from the outside, ${TAG} expands to nothing and the deploy ships an empty version.

Classic Makefile nonsense. Easy to miss. Annoying when it happens.

Second, it found a hardcoded Rails SECRET_KEY_BASE in the Docker run target.

That one is not subtle. Secrets do not belong in a Makefile. They belong in the environment or a secret manager.

So Sharepass was a mixed result: bad app mapping, but still useful config review.

Second repo: dora-exporter

Then I tried dora-exporter, a Go exporter I wrote for DORA metrics.

This was much better.

Clawpatch mapped eight features:

Go command cmd
Project config Makefile
Project config go.mod
Go package catalog
Go package prometheus
Go package jira
Go package github
Go package config

Now we are talking.

It understood the project as a Go application: command, config, packages, service-ish pieces. Not perfect, but useful.

I reviewed everything. Clawpatch produced fourteen findings.

Some were just maintenance advice:

  • go.mod still says Go 1.18
  • the functional test target may run more tests than intended

Fine. Not useless, but not urgent either.

The good stuff was better.

Bugs I would actually fix

One finding was about startup failure.

If http.ListenAndServe fails — for example because the port is already in use — the program logs the error and returns normally. That means the process exits with status 0.

That is exactly the kind of production bug I hate.

The service did not start, but your supervisor may think it succeeded. Wonderful. Everything is fine, except the thing is not running.
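I have not pasted the exporter's actual code, so the function names here are hypothetical, but the bug shape is the classic one. A minimal sketch of both the failure mode and the fix:

```go
package main

import (
	"log"
	"net/http"
)

// startBroken mirrors the reported shape: the listen error is logged,
// then the function returns normally, so main falls through and the
// process exits with status 0 even though nothing is serving.
func startBroken(addr string) {
	if err := http.ListenAndServe(addr, nil); err != nil {
		log.Printf("server error: %v", err) // logged, then swallowed
	}
}

// startFixed propagates the error so the caller can exit non-zero
// (for example via log.Fatalf in main).
func startFixed(addr string) error {
	// ListenAndServe always returns a non-nil error when it stops.
	return http.ListenAndServe(addr, nil)
}
```

The point is the return path, not the logging: your supervisor reads the exit status, not your log lines.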

Another finding was in config loading.

NewConfigFromFile calls Load on a nil *Config. Then Load eventually dereferences the receiver.

That means the constructor can panic for a perfectly readable config file.

This is not a lint issue. This is a bug.
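The real names from the finding are NewConfigFromFile and Load; everything else in this sketch is my guess at the shape. The essence is that the constructor calls a pointer-receiver method before anything allocates the struct:

```go
package main

type Config struct {
	Port int
}

// load stands in for the real Load: it writes through the receiver,
// so calling it on a nil *Config panics.
func (c *Config) load() error {
	c.Port = 8080 // nil pointer dereference when c is nil
	return nil
}

// brokenNewConfigFromFile mirrors the finding: the *Config is never
// allocated before load dereferences it.
func brokenNewConfigFromFile() (*Config, error) {
	var c *Config // nil
	if err := c.load(); err != nil {
		return nil, err
	}
	return c, nil
}

// fixedNewConfigFromFile allocates the struct first.
func fixedNewConfigFromFile() (*Config, error) {
	c := &Config{}
	if err := c.load(); err != nil {
		return nil, err
	}
	return c, nil
}
```

One &Config{} is the whole fix; the panic only exists because Go happily lets you call methods on a nil pointer.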

It also noticed that invalid YAML is logged but still accepted. If YAML parsing fails, config loading continues with defaults and environment variables. So the exporter can start with half the intended config missing.

That is a bad failure mode. Broken config should fail loudly.
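A sketch of fail-loud versus fail-quiet config loading. I am using encoding/json as a stdlib stand-in for the YAML parser, and the function names are mine, not the exporter's:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

type Config struct {
	Port int `json:"port"`
}

// loadBroken mirrors the finding: the parse error is logged and then
// ignored, so the caller gets a half-empty Config of zero values.
func loadBroken(data []byte) Config {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		log.Printf("config parse failed: %v", err) // logged, then ignored
	}
	return c
}

// loadFixed refuses to start with a broken config file.
func loadFixed(data []byte) (Config, error) {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		return Config{}, fmt.Errorf("parse config: %w", err)
	}
	return c, nil
}
```

Defaults and environment variables are fine as a layering strategy; they are not fine as a silent fallback for a file that failed to parse.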

In the GitHub integration, Clawpatch found a few more real problems:

  • GitHub API transport errors can lead to nil response dereferences
  • empty PR commit responses can cause index-out-of-range panics
  • the webhook endpoint accepts deployment events without validating X-Hub-Signature-256

The webhook one matters.

If an endpoint is reachable, it should verify the GitHub signature before trusting the body. Otherwise anyone who can reach it can forge deployment events and mutate metrics.

Maybe your network boundary saves you. Maybe it does not. I prefer not to build systems where “maybe” is the security model.

In the Prometheus package, it found that Exporter.Update is a method on *Exporter, but uses a package-level global exporter instead of the receiver.

That can panic when the global is not set. Or mutate the wrong exporter in tests. Or just make the method lie about what it does.

Again, very real bug shape.
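A stripped-down sketch of that shape, with the struct reduced to a counter so the bug is visible at a glance:

```go
package main

type Exporter struct {
	count int
}

var exporter *Exporter // package-level global, possibly never set

// UpdateBroken mirrors the finding: a method on *Exporter that
// ignores its receiver and mutates the package global instead.
// It panics when the global is nil, and silently touches the wrong
// instance when it is not.
func (e *Exporter) UpdateBroken() {
	exporter.count++
}

// UpdateFixed does what the method signature promises: it mutates
// the receiver it was called on.
func (e *Exporter) UpdateFixed() {
	e.count++
}
```

The compiler cannot catch this; the receiver is simply an unused parameter. That is why it survives until a test constructs a second Exporter and the numbers stop adding up.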

It also found that SaveMetricsToFile ignores the error from prometheus.WriteToTextfile and logs success anyway.

Full disk? Missing directory? Permission issue?

Apparently success.

No, thank you.

Not every finding deserves a patch

This is important.

Clawpatch is not an oracle.

Some findings are obvious bugs. Some are good suspicions. Some are project-policy questions dressed as bugs.

For example, “Go 1.18 is unsupported” is true. But whether to bump it depends on what compatibility you want to promise. The right fix may be a CI matrix, not just editing go.mod because a tool complained.

Same with the Makefile integration test target. Clawpatch is right that go test ./... -tags=integration still runs normal tests. Whether that is wrong depends on how the project is organized.

You still need a human.

Sorry, everyone trying to automate judgment away. Not today.

What I liked

The best part is that Clawpatch leaves state behind.

.clawpatch/
  config.json
  features/
  findings/
  reports/
  runs/

This is underrated.

A normal AI code review disappears into chat history. Maybe you copy a few bullets into an issue. Maybe you forget. Maybe the same finding appears again next week.

Clawpatch treats findings as project state.

That makes it possible to resume, triage, mark false positives, fix one thing, revalidate, and keep moving.

I also like the structure of the findings. They include evidence, confidence, recommendation, repro notes, minimum fix scope, and suggested regression tests.

The regression test suggestion is especially useful. Even when I disagree with the exact fix, the suggested test often points to the contract the code should have had in the first place.

That is where these tools can be genuinely helpful: not replacing thinking, but making the next thinking step obvious.

What needs work

Rails mapping needs work.

For Sharepass, Clawpatch barely got past the Makefile. That is not enough for a Rails project.

Go support looked much stronger. It understood packages, commands, and tests well enough to find useful issues.

I would also like better triage workflows over time. If this becomes part of normal development, you need a clean way to say:

  • this is real
  • this is a false positive
  • this is accepted risk
  • this is good but not now
  • this needs a ticket

The raw findings are useful. The workflow around them is where the real leverage will be.

My take

Clawpatch is not “AI replaces code review”.

Good.

That idea is mostly bullshit.

The useful version is much more boring and much more valuable: an automated first-pass reviewer that understands enough of the repository to produce reviewable findings, preserve them as state, and help you fix one thing at a time.

On Sharepass, it showed its limits.

On dora-exporter, it found bugs I would actually fix.

That is enough for me to keep it in the toolbox.

I do not want agents that pretend to be senior engineers.

I want tools that leave evidence, respect the worktree, keep state, and make the next human action obvious.

Clawpatch is not finished, but it is pointing in the right direction.