Seven Claude Code Skills for Python from the Creator of spaCy

February 16, 2026

Matthew Honnibal, co-founder of Explosion and creator of spaCy, has published a collection of Claude Code skills for Python development. It’s one of the more thoughtful skill collections I’ve seen.

Claude Code skills are markdown files that give Claude specialized instructions when invoked as slash commands. They live in ~/.claude/commands/ and run when you type /<skill-name>. They encode domain expertise, and the quality of the expertise matters.
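A command file is just a prompt; Claude Code substitutes $ARGUMENTS with whatever text follows the slash command. Here's a minimal sketch of what a file like ~/.claude/commands/pre-mortem.md could contain (illustrative only, not Honnibal's actual prompt):

Analyze the Python code at $ARGUMENTS. Write a fictional post-mortem
report, in past tense, for a plausible bug that has not happened yet:
what happened, why it broke, and what change triggered it.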

The collection covers type annotations, docstrings, error handling, resilience analysis, property-based testing, mutation testing, and structural overviews. I tested all seven against real Python code; a few stood out.

The standout: pre-mortem

The pre-mortem skill writes fictional post-mortem reports for bugs that haven’t happened yet. By forcing the analysis into past tense (“What happened,” “Why it broke”), it produces complete causal chains instead of vague warnings.

It checks a 10-category fragility catalogue: implicit ordering dependencies, coincidental correctness, load-bearing defaults, invisible invariants, and others. Here’s one finding:

@property
def success_rate(self):
    return len(self.valid_rows) / (len(self.valid_rows) + len(self.errors))

The fictional post-mortem: after adding multi-field validation, a single invalid row produces multiple ValidationError objects. The denominator inflates, the numerator doesn’t, and the success rate exceeds 1.0. The formula assumes one error per row, but nothing enforces that.

The code works today. It breaks after a plausible refactor. That’s not what you get from asking Claude to “review this code.”
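The obvious hardening is to count rows rather than error objects. A sketch (invalid_rows is a hypothetical attribute; the skill's job is surfacing the risk, and this fix is mine):

@property
def success_rate(self):
    # Count rows, not error objects, so multiple errors per row
    # cannot inflate the denominator.
    total = len(self.valid_rows) + len(self.invalid_rows)
    return len(self.valid_rows) / total if total else 0.0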

contract-docstrings

The contract-docstrings skill documents what a function demands, what it guarantees, and how it fails. Its key instruction: “document what is, not what should be.”

The best findings come from its “silenced errors” analysis. Given a function like this:

def save_to_disk(self):
    try:
        os.makedirs(self.cache_dir, exist_ok=True)
        path = os.path.join(self.cache_dir, "cache.json")
        with open(path, "w") as f:
            json.dump(self._store, f)
    except Exception:
        pass

The contract notes that all exceptions are silently discarded: PermissionError, TypeError from non-serializable values, OSError from a full disk. The caller cannot know if the save succeeded.
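Written out as a contract, the honest docstring reads roughly like this (my paraphrase of the demands/guarantees/failures framing, not the skill's verbatim output):

def save_to_disk(self):
    """Persist the cache to <cache_dir>/cache.json.

    Demands: self._store is JSON-serializable; cache_dir is writable.
    Guarantees: none -- every exception is swallowed, so the caller
        cannot distinguish success from failure.
    Silenced errors: PermissionError, OSError (e.g. disk full),
        TypeError from non-serializable values.
    """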

It also found a subtler problem in the corresponding load_from_disk: json.load can return any valid JSON type, so if the cache file contains [1, 2, 3], the load “succeeds” and silently sets the internal store to a list instead of a dict. Every subsequent method call breaks, far from the actual cause.
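A type check on load closes that hole. A minimal sketch, assuming a load_from_disk symmetric to save_to_disk (this fix is mine, not the skill's):

def load_from_disk(self):
    path = os.path.join(self.cache_dir, "cache.json")
    try:
        with open(path) as f:
            data = json.load(f)
    except (OSError, json.JSONDecodeError):
        return
    # json.load can return any JSON type; accept only the shape we wrote.
    if isinstance(data, dict):
        self._store = data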

try-except

The try-except skill audits exception handling with a five-point checklist. The first question, “right mechanism?”, is the most useful. External state operations (filesystem, network) justify try/except. Local value checks should use conditionals.

For the same save_to_disk function, it splits the broad catch into targeted handlers:

def save_to_disk(self):
    path = os.path.join(self.cache_dir, "cache.json")
    try:
        os.makedirs(self.cache_dir, exist_ok=True)
    except OSError as e:
        logging.warning("Failed to create cache directory %s: %s", self.cache_dir, e)
        return
    try:
        with open(path, "w") as f:
            json.dump(self._store, f)
    except OSError as e:
        logging.warning("Failed to write cache file %s: %s", path, e)
    # Note: TypeError from json.dump is NOT caught -- it indicates
    # non-serializable data in the cache, which is a bug.

os.path.join, a pure string operation, moves out of the try block. The TypeError from json.dump stays uncaught: it’s a bug, not a recoverable error.

Zero false positives across all blocks examined.

tighten-types and hypothesis-tests

The tighten-types skill works through six categories of type improvements, ordered by impact. Given a class like:

class WeatherCache:
    def __init__(self, ttl=DEFAULT_TTL, cache_dir=CACHE_DIR):
        self.ttl = ttl
        self.cache_dir = cache_dir
        self._store = {}
        self._hits = 0

It adds class-level attribute annotations, parameterizes bare dict to dict[str, CacheEntry], and spots the repeated {data, timestamp, city} dict shape across six locations as a TypedDict candidate. The checklist ensures nothing gets skipped.
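The result looks roughly like this (the CacheEntry field types are inferred from the flagged dict shape and may not match the skill's actual output):

from typing import TypedDict

class CacheEntry(TypedDict):
    data: dict
    timestamp: float
    city: str

class WeatherCache:
    # Class-level annotations make the attribute types part of the interface.
    ttl: float
    cache_dir: str
    _store: dict[str, CacheEntry]
    _hits: int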

The hypothesis-tests skill generates property-based tests with the Hypothesis library. It builds input strategies that model each function’s valid inputs, then tests properties like roundtrip correctness and idempotence. The “encode constraints, don’t filter” directive (st.integers(min_value=1) instead of st.integers().filter(lambda x: x > 0)) shows real Hypothesis expertise.
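For a concrete sense of the pattern, here is the shape of such a test, written against a plain JSON roundtrip (a sketch, not the skill's actual output):

from hypothesis import given, strategies as st
import json

# Encode constraints in the strategy rather than filtering afterwards:
# st.integers(min_value=1), not st.integers().filter(lambda x: x > 0).
entries = st.dictionaries(keys=st.text(), values=st.integers(min_value=1))

@given(entries)
def test_json_roundtrip(store):
    # Property: serializing then parsing returns the original value.
    assert json.loads(json.dumps(store)) == store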

What makes these work

If you write your own skills, these patterns are worth stealing:

Each skill provides a numbered checklist. LLMs do better with structured classification than with “tell me what’s wrong.”

Categories are ordered by impact. When an LLM loses focus midway through, the important work is already done.

Several include explicit “don’t” rules: “don’t invent invariants,” “avoid current bugs,” “what NOT to test.” Without these, Claude generates plausible but useless findings.

Several enforce a survey phase: read all the code before changing any of it.

A note on security

The skills use .md.txt rather than .md. GitHub renders .md files as HTML, so a malicious skill could hide instructions in markup that never appears in the rendered view. .md.txt displays as raw text, making the full prompt auditable before you install it. Skills run with full filesystem and shell access, so this caution matters.

Getting started

To install a skill, copy it to ~/.claude/commands/ and rename it from .md.txt to .md:

curl -o ~/.claude/commands/pre-mortem.md \
  https://raw.githubusercontent.com/honnibal/claude-skills/main/pre-mortem.md.txt

Then invoke it in Claude Code with /pre-mortem src/mypackage/.

The full collection is at github.com/honnibal/claude-skills. If you only install one, make it pre-mortem.
