# py-spy: Sampling Profiler for Python


py-spy is a sampling profiler for Python programs. It visualizes where a Python process spends its time without restarting the process or modifying its code. py-spy runs in a separate process from the program it profiles and reads stack information out of the target's memory, which keeps overhead low enough that the project documents it as safe to use against production workloads.

## When to Use py-spy

Use py-spy when a Python process is already running and needs to be profiled in place: a web worker under load, a data-pipeline job mid-run, a stuck process that needs a stack dump, or any CPU-bound script whose profile should not be distorted by per-function tracing overhead. py-spy is a practical default for production CPU profiling because it requires no instrumentation and can be attached and detached at will. For line-by-line attribution inside a known hot function, reach for [line_profiler](https://github.com/pyutils/line_profiler); for memory rather than CPU, reach for [memray](https://github.com/bloomberg/memray).

## Key Features

- Out-of-process sampling: runs as a separate process and reads the target interpreter's stacks via OS-level process inspection. The profiled program needs no `import`, no decorator, and no restart.
- Three subcommands: `record` writes a profile to disk, `top` shows a live top-style view, and `dump` prints the current Python call stack for every thread (useful for diagnosing a hung process).
- Multiple output formats: `record` emits flame-graph SVG by default, [speedscope](https://www.speedscope.app/) JSON via `--format speedscope`, or a raw format that can be post-processed. Speedscope output also opens in the [Perfetto UI](https://ui.perfetto.dev/) as a Chrome trace.
- Native extension profiling: the `--native` flag captures C, C++, and Cython frames alongside Python frames. Native support covers Linux (x86_64, ARM, 32-bit ARM) and Windows (x86_64); the project does not claim native support on macOS.
- Subprocess support: `--subprocesses` follows `multiprocessing`, `subprocess.Popen`, and forked workers so a process tree can be profiled as one.
- [GIL](https://pydevtools.com/handbook/explanation/what-is-the-gil.md) and idle filtering: `--gil` restricts samples to threads that hold the GIL; `--idle` also includes threads that are idle, such as those waiting on I/O. Together, these flags narrow a profile to the kind of work being investigated.
- Broad interpreter coverage: supports CPython 2.3 through 2.7 and 3.3 through 3.13 (per the project README as of v0.4.1).
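
The subcommands and flags above can be combined into command lines. The following is a minimal sketch, not part of py-spy itself: the helper names (`record_command`, `top_command`, `dump_command`), the PID `12345`, and the output file names are hypothetical. The flags are the ones named in the feature list, plus py-spy's `--pid` and `--output` options for selecting the target and the output file.

```python
import shlex


def record_command(pid, output, *, fmt="flamegraph", native=False,
                   subprocesses=False, gil=False, idle=False):
    """Return the argv list for `py-spy record` against a running PID."""
    argv = ["py-spy", "record", "--pid", str(pid),
            "--output", output, "--format", fmt]
    # Each optional flag corresponds to one feature from the list above.
    if native:
        argv.append("--native")
    if subprocesses:
        argv.append("--subprocesses")
    if gil:
        argv.append("--gil")
    if idle:
        argv.append("--idle")
    return argv


def top_command(pid):
    """Return the argv list for the live, top-style view."""
    return ["py-spy", "top", "--pid", str(pid)]


def dump_command(pid):
    """Return the argv list that prints the current stack of every thread."""
    return ["py-spy", "dump", "--pid", str(pid)]


if __name__ == "__main__":
    pid = 12345  # hypothetical PID of the process to inspect
    commands = [
        record_command(pid, "profile.svg"),
        record_command(pid, "profile.json", fmt="speedscope", gil=True),
        top_command(pid),
        dump_command(pid),
    ]
    # Print the commands instead of running them, so the sketch has no
    # side effects and works even where py-spy is not installed.
    for argv in commands:
        print(shlex.join(argv))
```

To run one of these for real, pass the list to `subprocess.run(argv, check=True)`. Attaching by PID may need elevated privileges, depending on the platform.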

## Installation

The simplest install uses [uv](https://pydevtools.com/handbook/reference/uv.md) to place py-spy on `PATH` as a standalone tool:

```bash
uv tool install py-spy
```

Other options documented by the project:

```bash
pip install py-spy              # from PyPI
brew install py-spy             # macOS, via Homebrew
cargo install py-spy            # builds from source; needs libunwind-dev on Linux
```

The project also ships in the Arch User Repository (`yay -S py-spy`) and the Alpine edge testing repo.

## Platform and Permission Notes

py-spy runs on Linux, macOS, Windows, and FreeBSD. Attaching to a running process requires OS-level permissions to read another process's memory, which has several practical consequences:

- macOS: the README states that attaching on macOS always requires running as root. The system Python at `/usr/bin/python` cannot be profiled because of System Integrity Protection; a non-system Python (installed via [uv](https://pydevtools.com/handbook/reference/uv.md), Homebrew, or pyenv) is required.
- Linux: profiling a process that py-spy launched itself (`py-spy record -- python script.py`) works without root. Attaching to an existing process by PID usually requires `sudo` unless the `kernel.yama.ptrace_scope` sysctl has been relaxed.
- Docker: the default seccomp profile blocks `process_vm_readv`. Run the container with `--cap-add SYS_PTRACE` to allow py-spy to attach.
- Kubernetes: add `SYS_PTRACE` to the pod's `securityContext.capabilities.add` list.
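
For the Linux case, the `kernel.yama.ptrace_scope` value can be checked before attaching. The sketch below is an illustration rather than anything py-spy ships. The function names and the returned labels are hypothetical, and the interpretation of the values (0 = same-user attach allowed, 1 = descendants only, 2 = admin only, 3 = no attach) follows the Linux kernel's Yama documentation. It assumes the target belongs to the same user and ignores container capability limits.

```python
import os
from pathlib import Path

# Standard location of the Yama sysctl on Linux; absent on other systems.
PTRACE_SCOPE_PATH = Path("/proc/sys/kernel/yama/ptrace_scope")


def read_ptrace_scope(path=PTRACE_SCOPE_PATH):
    """Return the ptrace_scope value as an int, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def attach_advice(scope, is_root):
    """Classify whether attaching to an existing PID is likely to work.

    The labels are hypothetical: "ok", "needs-root", "blocked", "unknown".
    """
    if scope is None:
        return "unknown"      # not Linux, or Yama is not enabled
    if scope == 3:
        return "blocked"      # attach is disabled for everyone, root included
    if is_root or scope == 0:
        return "ok"           # root, or classic same-user ptrace rules
    return "needs-root"       # scope 1 or 2 and not running as root


if __name__ == "__main__":
    # os.geteuid does not exist on Windows, so guard the lookup.
    is_root = hasattr(os, "geteuid") and os.geteuid() == 0
    print(attach_advice(read_ptrace_scope(), is_root))
```

A result of `needs-root` corresponds to the situation described above, where `py-spy record --pid ...` needs `sudo` but `py-spy record -- python script.py` does not.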

## Output Formats

| Format      | Flag                              | Viewer                                                                                   |
| ----------- | --------------------------------- | ---------------------------------------------------------------------------------------- |
| Flame graph | default, or `--format flamegraph` | Any browser (SVG)                                                                        |
| Speedscope  | `--format speedscope`             | [speedscope.app](https://www.speedscope.app/) or [Perfetto UI](https://ui.perfetto.dev/) |
| Raw         | `--format raw`                    | Custom tooling                                                                           |
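
As an illustration of "custom tooling" for the raw format, the sketch below totals samples by the innermost frame of each stack. It rests on an assumption that should be checked against real output: each raw line holds one stack in collapsed form, with frames separated by `;` from outermost to innermost, followed by a space and a sample count. The function name `self_samples` and the sample data are made up for the example.

```python
from collections import Counter


def self_samples(raw_text):
    """Sum sample counts by leaf frame from collapsed-stack text.

    Assumes lines shaped like "outer;inner;leaf 42". Lines that do not
    match this shape are skipped.
    """
    totals = Counter()
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Frame names may contain spaces, so split on the last space only.
        stack, _, count = line.rpartition(" ")
        if not stack or not count.isdigit():
            continue
        leaf = stack.split(";")[-1]
        totals[leaf] += int(count)
    return totals


if __name__ == "__main__":
    # Made-up sample data in the assumed format.
    sample = (
        "main (app.py:10);load (app.py:4) 7\n"
        "main (app.py:10);parse (app.py:6) 3\n"
        "main (app.py:10);load (app.py:4);read (io.py:2) 5\n"
    )
    for frame, count in self_samples(sample).most_common():
        print(f"{count:5d}  {frame}")
```

Grouping by leaf frame gives a rough "self time" ranking. Grouping by any frame in the stack instead would give inclusive time.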

## Pros

- Requires no code changes and no process restart, which makes it practical for production debugging.
- Runs out of process, so the sampler's own overhead is decoupled from the profiled program's event loop or GIL.
- Captures native extension frames on Linux and Windows when the extensions are built with debug symbols.
- `py-spy dump` prints live stacks, which recovers useful state from a hung or looping process where a full profile is overkill.

## Cons

- Sampling profilers cannot attribute exact per-call costs the way a deterministic profiler like [cProfile](https://docs.python.org/3/library/profile.html) can. Very short functions may be under-sampled.
- Attaching to an existing PID usually needs elevated privileges (always on macOS, and on Linux unless ptrace restrictions have been relaxed); sandboxed environments (Docker, Kubernetes, restricted containers) need explicit capability grants.
- macOS does not support native-extension profiling per the project's platform matrix.
- The project is actively maintained but has long gaps between releases; v0.4.1 (July 2025) followed v0.4.0 (November 2024), which itself followed v0.3.14 (September 2022).
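
The under-sampling caveat in the first item above can be made concrete with a little arithmetic. The sketch assumes a sampling rate of 100 samples per second (py-spy's default for `--rate`, as far as this example assumes) and treats sample hits as roughly independent, so a Poisson approximation applies. That statistical model is an assumption of the example, not something the project documents.

```python
import math


def expected_samples(seconds_in_function, rate_hz=100):
    """Expected number of samples that land inside a function."""
    return seconds_in_function * rate_hz


def chance_of_zero_samples(seconds_in_function, rate_hz=100):
    """Poisson estimate of the probability that the function is never sampled."""
    return math.exp(-expected_samples(seconds_in_function, rate_hz))


if __name__ == "__main__":
    # A function that accounts for 2 ms in total during the profiled window.
    total = 0.002
    print(f"expected samples: {expected_samples(total):.2f}")
    print(f"chance of being missed: {chance_of_zero_samples(total):.0%}")
```

Under these assumptions, a function that runs for only a few milliseconds in total is more likely to be missed than sampled. Recording for longer or raising `--rate` increases the expected sample count proportionally.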


## Learn More

- [py-spy GitHub repository](https://github.com/benfred/py-spy)
- [py-spy on PyPI](https://pypi.org/project/py-spy/)
- [Speedscope flame-graph viewer](https://www.speedscope.app/)
- [Perfetto UI for Chrome traces](https://ui.perfetto.dev/)
