
How to configure Claude Code to run your pytest suite

Claude Code runs your pytest suite in a raw terminal, and by default pytest answers with a full traceback for every failure. One broken assertion fills a screen; ten failures bury the line that matters. There is no Testing sidebar to click, so the agent reads whatever pytest prints and acts on it, noise included.

Wire pytest into the edit cycle instead. A /test slash command runs the suite on demand, a Stop hook gates the end of every turn, and a few output flags strip the traceback down to the assertion and file location Claude needs to act.

This guide assumes you already have a pytest suite. If you don’t, start with setting up testing with pytest and uv and come back to wire it into Claude Code.

Prerequisites

  • Claude Code installed
  • uv installed (the commands run pytest through uv run so it resolves to the project’s virtual environment)
  • A Python project with pytest as a dev dependency (uv add --dev pytest) and tests that pass today

Run tests on demand with a /test slash command

A slash command turns a paragraph of testing instructions into one word. Create .claude/commands/test.md in your project:

.claude/commands/test.md
---
description: Run the pytest suite and fix any failures
argument-hint: "[path or -k expression]"
---

Run `uv run pytest --tb=short -q $ARGUMENTS`.

If a test fails, read the traceback, fix the cause in the source, and never
edit a test to make it pass. Rerun `uv run pytest --last-failed -q --tb=short`
until the suite is green.

The file name becomes the command, so test.md gives you /test. $ARGUMENTS expands to whatever you type after the command, so /test runs the whole suite while /test tests/test_api.py or /test -k login scopes the run. Claude Code merged custom commands into skills, but files in .claude/commands/ still work and are the shortest way to define one.

Commit the file so the command travels with the repository and every collaborator gets the same /test.

Tip

To hand Claude the output without waiting for it to run pytest itself, use dynamic context injection. A line beginning !`uv run pytest --tb=short -q` runs before Claude reads the command and inlines the result, so the prompt arrives with the current failures already attached. Add allowed-tools: Bash(uv run pytest:*) to the command’s frontmatter so the run does not prompt for permission.

Gate every turn with a Stop hook

The slash command is the manual loop; a Stop hook is the automatic one. A Stop hook fires when Claude Code thinks it has finished a turn. If the hook exits 2, Claude Code treats it as a blocking error, reads the message on stderr, and keeps working. That is where pytest belongs: the agent cannot call a turn done while a test is red.

Register the hook in .claude/settings.json:

.claude/settings.json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "uv run pytest -q --tb=short >&2 || exit 2",
            "timeout": 120
          }
        ]
      }
    ]
  }
}

pytest exits non-zero when a test fails, the || exit 2 converts that into the blocking code Claude Code reads, and >&2 sends the report to stderr where the agent picks it up. When Claude tries to stop with a test still failing, the hook blocks the stop and Claude keeps working with this report in hand:

.F                                                                       [100%]
=================================== FAILURES ===================================
______________________________ test_add_negative _______________________________
tests/test_math.py:10: in test_add_negative
    assert add(-1, -1) == -3
E   assert -2 == -3
E    +  where -2 = add(-1, -1)
=========================== short test summary info ============================
FAILED tests/test_math.py::test_add_negative - assert -2 == -3
1 failed, 1 passed in 0.03s

Warning

The Stop hook runs the whole suite on every turn, so its cost is your suite’s runtime times the number of turns. A fast unit suite stays imperceptible. A suite with database, network, or integration tests drags every turn the agent takes. Scope the hook to fast tests with a marker (uv run pytest -q --tb=short -m "not slow"), and keep the full run in the /test command and CI.
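
The -m "not slow" filter only helps if the slow tests actually carry the marker. A minimal sketch of registering one, assuming a conftest.py at the project root and a marker named slow (the name and its description are illustrative choices, not something pytest defines for you):

```python
# conftest.py (assumed to sit at the project root)


def pytest_configure(config):
    # Register the custom marker so pytest lists it under `pytest --markers`
    # and does not warn about an unknown mark.
    config.addinivalue_line(
        "markers",
        "slow: test touches a database, the network, or another slow resource",
    )
```

Decorate the slow tests with @pytest.mark.slow, and the hook's -m "not slow" run deselects them while a plain uv run pytest still runs everything.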

Shape pytest output so Claude fixes the failure, not the noise

The flags in both the command and the hook control what Claude reads. pytest prints the full traceback by default, so the flags below shape it down to the line Claude can act on.

--tb=short prints each traceback entry as a file location plus the failing line, instead of the default format, which prints the surrounding source of the failing function. -q drops the session header (platform, rootdir, plugins) and trims the final summary line; the progress dots and the short test summary stay. Together they collapse a screen of output into the block above: the file, the line, and the actual-versus-expected values. That is the part Claude edits against.

--last-failed is the other half. On the turn after a failure, uv run pytest --last-failed -q --tb=short reruns only the tests that failed, so the agent iterates on the broken test instead of re-reading a green suite each time. Add -x when you want the agent to stop at the first failure and fix one thing before moving on.
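
pytest remembers failures between runs in its cache directory, which is how --last-failed knows what to rerun. A small sketch for inspecting that record, assuming the default cache location (.pytest_cache/v/cache/lastfailed, a JSON object keyed by test node ID); the helper name is hypothetical:

```python
import json
from pathlib import Path


def last_failed_tests(project_root: str = ".") -> list[str]:
    """Return the node IDs pytest recorded as failing on its previous run."""
    cache_file = Path(project_root) / ".pytest_cache" / "v" / "cache" / "lastfailed"
    if not cache_file.exists():
        # No cache file yet, for example because pytest has not run here.
        return []
    # The file is a JSON object; its keys are the failing node IDs.
    return sorted(json.loads(cache_file.read_text()))


if __name__ == "__main__":
    for node_id in last_failed_tests():
        print(node_id)
```

When the record is empty, --last-failed falls back to running the whole suite by default, so the rerun command is safe to use even after everything has gone green.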

Two cases to watch on the Stop hook:

  • Pre-existing failures block every turn. The hook exits 2 on any red test, not only the ones Claude introduced. On a suite that is already failing, the agent burns turns chasing tests it never touched. Fix the suite first, or scope the hook to the area Claude is editing (uv run pytest -q --tb=short tests/newmodule/).
  • An empty selection looks like a failure. pytest exits 5 when it collects no tests, which the || exit 2 turns into a block. If you scope the hook to a path, point it at one that actually holds tests.
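
If the one-line shell command becomes too blunt, the same gate can live in a small script that maps pytest's exit codes explicitly. This is a sketch under assumptions: the script path (.claude/hooks/run_pytest.py) and function names are hypothetical, and treating exit code 5 as a pass is a design choice made here, not built-in pytest or Claude Code behavior.

```python
# .claude/hooks/run_pytest.py (hypothetical path)
import subprocess
import sys

NO_TESTS_COLLECTED = 5  # pytest's exit code for an empty selection
BLOCK = 2               # exit code Claude Code treats as a blocking error


def hook_exit_code(pytest_returncode: int) -> int:
    """Translate a pytest exit code into the Stop hook's exit code."""
    if pytest_returncode in (0, NO_TESTS_COLLECTED):
        return 0  # green suite, or nothing selected: let the turn end
    return BLOCK  # failures, errors, or a bad invocation: keep working


def main() -> int:
    # Extra arguments (a path, -m "not slow", ...) are passed through to pytest.
    result = subprocess.run(
        ["uv", "run", "pytest", "-q", "--tb=short", *sys.argv[1:]],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    # Claude Code reads the blocking message from stderr.
    sys.stderr.write(result.stdout)
    return hook_exit_code(result.returncode)


if __name__ == "__main__":
    sys.exit(main())
```

The hook's command would then become something like uv run python .claude/hooks/run_pytest.py, optionally followed by the path or marker expression you want to scope it to.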

Point Claude at the right command in CLAUDE.md

A CLAUDE.md note tells Claude which pytest command to reach for mid-turn instead of inventing an invocation:

CLAUDE.md
## Tests

Run the suite with `uv run pytest --tb=short -q`. Rerun only failures with
`uv run pytest --last-failed`. Fix the cause in the source; never edit a
test to make it pass. A clean run means the change is ready.

Claude Code treats CLAUDE.md as guidance rather than enforcement, so the note documents the command while the Stop hook guarantees the suite runs.

Keep the full suite in CI

The Stop hook fires only at the end of a Claude Code turn. It does not run when a teammate commits from another editor, and a hook scoped to fast tests or a single path never exercises the rest of the suite. Keep uv run pytest in CI as the backstop so every commit is gated, not only the ones the agent makes. See how to run tests using uv for the CI invocation.
