py-spy: Sampling Profiler for Python
py-spy is a sampling profiler for Python programs. It visualizes where a Python process spends its time without restarting the process or modifying its code. py-spy runs in a separate process from the program it profiles and reads stack information out of the target’s memory, which keeps overhead low enough that the project documents it as safe to use against production workloads.
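To make the idea of stack sampling concrete, here is a minimal in-process sketch. It is not how py-spy works: py-spy reads the target's memory from a separate process, whereas this toy calls `sys._current_frames()` inside the same interpreter. The names `sample_stacks`, `busy_loop`, and `profile` are hypothetical and exist only for illustration.

```python
import collections
import sys
import threading
import time
import traceback


def sample_stacks():
    """Return {thread_id: [function names, outermost first]} for all threads."""
    stacks = {}
    for thread_id, frame in sys._current_frames().items():
        stacks[thread_id] = [f.name for f in traceback.extract_stack(frame)]
    return stacks


def busy_loop(stop_event):
    """Hypothetical CPU-bound workload that runs until asked to stop."""
    while not stop_event.is_set():
        sum(i * i for i in range(1000))


def profile(duration_s=0.2, interval_s=0.01):
    """Sample the worker thread periodically and count innermost frames."""
    counts = collections.Counter()
    stop = threading.Event()
    worker = threading.Thread(target=busy_loop, args=(stop,))
    worker.start()
    try:
        deadline = time.monotonic() + duration_s
        while time.monotonic() < deadline:
            stack = sample_stacks().get(worker.ident, [])
            if stack:
                counts[stack[-1]] += 1  # innermost frame at this instant
            time.sleep(interval_s)
    finally:
        stop.set()
        worker.join()
    return counts


if __name__ == "__main__":
    for name, hits in profile().most_common():
        print(f"{hits:4d}  {name}")
```

Functions that appear in more samples were on the stack for a larger share of wall-clock time. That is the core estimate a sampling profiler reports, at a cost that depends on the sampling rate rather than on how many function calls the program makes.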
When to Use py-spy
Use py-spy when a Python process is already running and needs to be profiled in place: a web worker under load, a data-pipeline job mid-run, a stuck process that needs a stack dump, or any CPU-bound script whose profile should not be distorted by per-function tracing overhead. py-spy is a practical default for production CPU profiling because it requires no instrumentation and can be attached and detached at will. For line-by-line attribution inside a known hot function, reach for line_profiler; for memory rather than CPU, reach for memray.
Key Features
- Out-of-process sampling: runs as a separate process and reads the target interpreter’s stacks via OS-level process inspection. The profiled program needs no import, no decorator, and no restart.
- Three subcommands: `record` writes a profile to disk, `top` shows a live top-style view, and `dump` prints the current Python call stack for every thread (useful for diagnosing a hung process).
- Multiple output formats: `record` emits flame-graph SVG by default, speedscope JSON via `--format speedscope`, or a raw format that can be post-processed. Speedscope output also opens in the Perfetto UI as a Chrome trace.
- Native extension profiling: the `--native` flag captures C, C++, and Cython frames alongside Python frames. Native support covers Linux (x86_64, ARM, 32-bit ARM) and Windows (x86_64); the project does not claim native support on macOS.
- Subprocess support: `--subprocesses` follows `multiprocessing`, `subprocess.Popen`, and forked workers so a process tree can be profiled as one.
- GIL and idle filtering: `--gil` restricts samples to frames that hold the GIL; `--idle` includes threads that are waiting on I/O. Samples can be filtered to the kind of work being investigated.
- Broad interpreter coverage: supports CPython 2.3 through 2.7 and 3.3 through 3.13 (per the project README as of v0.4.1).
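The subcommands and flags above can be combined from a script. The following is a minimal sketch of a command builder, assuming py-spy is on PATH; the helper names (`build_record_command`, `build_dump_command`, `build_top_command`) and the PID `12345` are hypothetical, and only flags named in this section plus `--pid` and `--output` are used.

```python
import shlex

# Formats named in this document; py-spy may accept others.
KNOWN_FORMATS = {"flamegraph", "speedscope", "raw"}


def build_record_command(pid, output, fmt="flamegraph", native=False,
                         subprocesses=False, gil=False, idle=False):
    """Build an argv list for `py-spy record` against a running PID."""
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unsupported format for this sketch: {fmt!r}")
    argv = ["py-spy", "record", "--pid", str(pid),
            "--output", output, "--format", fmt]
    if native:
        argv.append("--native")        # include C/C++/Cython frames
    if subprocesses:
        argv.append("--subprocesses")  # follow child processes
    if gil:
        argv.append("--gil")           # only samples that hold the GIL
    if idle:
        argv.append("--idle")          # include idle threads
    return argv


def build_dump_command(pid):
    """Build an argv list that prints the current stack of every thread."""
    return ["py-spy", "dump", "--pid", str(pid)]


def build_top_command(pid):
    """Build an argv list for the live top-style view."""
    return ["py-spy", "top", "--pid", str(pid)]


if __name__ == "__main__":
    # Prints: py-spy record --pid 12345 --output profile.svg --format flamegraph
    print(shlex.join(build_record_command(12345, "profile.svg")))
```

Passing the resulting list to `subprocess.run(argv, check=True)` would launch py-spy; returning a list rather than a shell string avoids quoting problems with unusual output paths.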
Installation
The simplest install uses uv to place py-spy on PATH as a standalone tool:
    uv tool install py-spy

Other options documented by the project:

    pip install py-spy     # from PyPI
    brew install py-spy    # macOS, via Homebrew
    cargo install py-spy   # builds from source; needs libunwind-dev on Linux

The project also ships in the Arch User Repository (`yay -S py-spy`) and the Alpine edge testing repo.
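After installing, it can be useful to confirm from Python that the binary is actually reachable, for example in a deployment check. This is a small sketch using only the standard library; the helper names `find_tool` and `tool_version` are hypothetical.

```python
import shutil
import subprocess


def find_tool(name="py-spy"):
    """Return the absolute path of `name` on PATH, or None if it is absent."""
    return shutil.which(name)


def tool_version(name="py-spy"):
    """Return the output of `<name> --version`, or None if the tool is missing."""
    path = find_tool(name)
    if path is None:
        return None
    result = subprocess.run([path, "--version"],
                            capture_output=True, text=True, check=False)
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = tool_version()
    print(version if version else "py-spy not found on PATH")
```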
Platform and Permission Notes
py-spy runs on Linux, macOS, Windows, and FreeBSD. Attaching to a running process requires OS-level permissions to read another process’s memory, which has several practical consequences:
- macOS: the README states that attaching on macOS always requires running as root. The system Python at `/usr/bin/python` cannot be profiled because of System Integrity Protection; a non-system Python (installed via uv, Homebrew, or pyenv) is required.
- Linux: profiling a process that py-spy launched itself (`py-spy record -- python script.py`) works without root. Attaching to an existing process by PID usually requires `sudo` unless the `kernel.yama.ptrace_scope` sysctl has been relaxed.
- Docker: the default seccomp profile blocks `process_vm_readv`. Run the container with `--cap-add SYS_PTRACE` to allow py-spy to attach.
- Kubernetes: add `SYS_PTRACE` to the `securityContext.capabilities.add` list of the container in the pod spec (capabilities are set per container, not at the pod level).
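On Linux, whether attaching by PID needs `sudo` depends on the Yama `ptrace_scope` setting mentioned above. The sketch below reads and interprets that value; the value meanings follow the kernel's Yama documentation, while the helper names `read_ptrace_scope` and `describe_ptrace_scope` are hypothetical.

```python
from pathlib import Path

PTRACE_SCOPE_PATH = Path("/proc/sys/kernel/yama/ptrace_scope")

# Meanings as described in the Linux kernel's Yama documentation.
PTRACE_SCOPE_MEANINGS = {
    0: "classic: a process may attach to other processes of the same user",
    1: "restricted: a process may attach only to its own descendants",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: attaching is disabled until the next reboot",
}


def describe_ptrace_scope(raw_value):
    """Translate the sysctl's text value into a short description."""
    value = int(raw_value.strip())
    return PTRACE_SCOPE_MEANINGS.get(value, f"unknown value {value}")


def read_ptrace_scope(path=PTRACE_SCOPE_PATH):
    """Return the raw sysctl text, or None if Yama is not available."""
    try:
        return path.read_text()
    except OSError:
        return None


if __name__ == "__main__":
    raw = read_ptrace_scope()
    if raw is None:
        print("ptrace_scope not present (non-Linux system or Yama disabled)")
    else:
        print(describe_ptrace_scope(raw))
```

A value of 1 is consistent with the behavior described above: a program launched by py-spy is its descendant and can be profiled without root, while attaching to an unrelated PID needs elevated privileges.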
Output Formats
| Format | Flag | Viewer |
|---|---|---|
| Flame graph | default, or `--format flamegraph` | Any browser (SVG) |
| Speedscope | `--format speedscope` | speedscope.app or Perfetto UI |
| Raw | `--format raw` | Custom tooling |
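As an example of the custom tooling the raw format allows, the sketch below totals samples by innermost frame. It assumes the raw output is collapsed-stack text, with each line holding semicolon-separated frames followed by a space and a sample count; that layout and the sample data in `SAMPLE` are assumptions made for illustration, so check them against a real file before relying on this. The function name `leaf_counts` is hypothetical.

```python
from collections import Counter


def leaf_counts(raw_text):
    """Sum sample counts by innermost frame in collapsed-stack text."""
    counts = Counter()
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The count is the text after the last space; frames may contain spaces.
        stack, _, count = line.rpartition(" ")
        if not stack or not count.isdigit():
            continue  # skip lines that do not match the assumed layout
        leaf = stack.split(";")[-1]
        counts[leaf] += int(count)
    return counts


# Hypothetical sample data, not real py-spy output.
SAMPLE = """\
main (app.py:10);load (app.py:4) 3
main (app.py:10);compute (app.py:7) 12
main (app.py:10);compute (app.py:7);helper (util.py:2) 5
"""

if __name__ == "__main__":
    for frame, hits in leaf_counts(SAMPLE).most_common():
        print(f"{hits:4d}  {frame}")
```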
Pros
- Requires no code changes and no process restart, which makes it practical for production debugging.
- Runs out of process, so the sampler’s own overhead is decoupled from the profiled program’s event loop or GIL.
- Captures native extension frames on Linux and Windows when built with symbols.
- `py-spy dump` prints live stacks, which recovers useful state from a hung or looping process where a full profile is overkill.
Cons
- Sampling profilers cannot attribute exact per-call costs the way a deterministic profiler like cProfile can. Very short functions may be under-sampled.
- Attaching to an existing PID often needs elevated privileges: always root on macOS, and usually `sudo` on Linux unless `kernel.yama.ptrace_scope` has been relaxed. Sandboxed environments (Docker, Kubernetes, restricted containers) need explicit capability grants.
- macOS does not support native-extension profiling per the project’s platform matrix.
- The project is actively maintained but has long gaps between releases; v0.4.1 (July 2025) followed v0.4.0 (November 2024), which itself followed v0.3.14 (September 2022).
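The first limitation in this list is easiest to see by contrast with a deterministic profiler. The sketch below uses the standard library's cProfile to obtain exact call counts for a very short function, which a sampling profiler can only estimate. The functions `tiny`, `workload`, and `call_counts` are hypothetical, and the code relies on the `stats` attribute of `pstats.Stats`, which is widely used but only lightly documented.

```python
import cProfile
import pstats


def tiny(x):
    """A very short function that a sampler may rarely catch on the stack."""
    return x + 1


def workload(n=1000):
    """Call tiny() exactly n times."""
    total = 0
    for _ in range(n):
        total = tiny(total)
    return total


def call_counts(func, *args):
    """Run func under cProfile and return {function name: total calls}."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    # stats.stats maps (filename, lineno, funcname) to
    # (primitive calls, total calls, total time, cumulative time, callers).
    # Keying by name alone is a simplification for this sketch.
    return {
        name: total_calls
        for (_file, _line, name), (_prim, total_calls, _tt, _ct, _callers)
        in stats.stats.items()
    }


if __name__ == "__main__":
    counts = call_counts(workload, 1000)
    print(counts["tiny"], counts["workload"])
```

The trade-off is that cProfile hooks every call and return, so it adds the per-function tracing overhead that py-spy avoids.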