Getting Started with Python Using Codex
Codex CLI is OpenAI’s terminal-based coding agent. It reads your repository, edits files, runs commands, and iterates on errors, all from your terminal. Describe what you want in plain English, and Codex decides which files to read, which commands to run, and what code to write.
What separates Codex from other terminal agents is how it sandboxes every command the model runs. Codex isolates execution with OS-level enforcement (Seatbelt on macOS, Landlock and seccomp on Linux), not just application-layer checks. This means you can grant Codex more autonomy without worrying that a bad command will reach beyond your project directory or make network calls you didn’t authorize.
This guide covers how to use Codex for Python development specifically. Each section is self-contained, so skip to whatever is relevant.
Why Codex for Python development
Codex handles the gap between “I want to build something” and “I have a working project.” Tell it what you want, and it creates the project structure, installs packages, writes code, and runs it. When something fails, it reads the traceback and proposes a fix.
Two features shape how Codex fits into Python work:
Graduated autonomy. Python projects involve frequent iteration: install a package, run a script, read the error, adjust. Codex’s three autonomy modes let you match the level of oversight to the task. Use suggest mode when exploring unfamiliar code. Switch to full-auto mode for repetitive tasks like adding tests across a module.
Scripted automation with codex exec. Unlike interactive-only tools, Codex can run non-interactively. Pipe a prompt through codex exec and capture the result in stdout. This makes Codex useful in CI pipelines, pre-commit hooks, and batch scripts.
Installation
Install Codex CLI with npm:
npm install -g @openai/codex
On macOS, Homebrew is also available:
brew install --cask codex
Upgrade to the latest version with:
npm install -g @openai/codex@latest
On Linux, the sandbox requires bubblewrap. Install it before running Codex:
sudo apt install bubblewrap # Ubuntu/Debian
Important
Change to your project directory before launching. Codex reads the files in your current working directory to build context. Launching from your home directory means it will not know about your project.
cd my-project
codex
To end a session, type /exit or press Ctrl+C.
Authentication and access
The first time you run codex, it walks you through authentication. Two options:
- ChatGPT account sign-in. If you have a ChatGPT Plus, Pro, Business, Edu, or Enterprise subscription, sign in with your ChatGPT account. Codex generates and configures the API key automatically.
- OpenAI API key. Set OPENAI_API_KEY as an environment variable or enter it when prompted. This option bills per token at standard API rates.
Credentials are stored in ~/.codex/auth.json by default. You can switch to your OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service) by setting auth_storage = "keyring" in ~/.codex/config.toml.
Costs and usage limits
Codex is bundled with ChatGPT subscriptions, not sold separately.
| Plan | Monthly cost | GPT-5.4 local messages (per 5-hour window) |
|---|---|---|
| Plus | $20 | 20-100 |
| Pro (5x) | $100 | 100-500 |
| Pro (20x) | $200 | 400-2,000 |
| Business | Pay-as-you-go | Credit-based |
Usage limits run on a rolling 5-hour window. The window starts with your first message and resets 5 hours later. If you hit the limit, switch to a lighter model with /model or wait for the window to reset.
API key users pay per token. GPT-5.4 costs $1.25 per million input tokens and $10.00 per million output tokens. The lighter codex-mini-latest model costs less for simpler tasks.
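As a rough illustration of the per-token arithmetic, the sketch below estimates the cost of one API-billed session from the GPT-5.4 rates quoted above. The `estimate_cost_usd` helper and the sample token counts are hypothetical, and the rates are copied from this guide, so check current pricing before relying on the result.

```python
# Rates quoted in this guide for GPT-5.4, in USD per million tokens.
INPUT_USD_PER_MILLION = 1.25
OUTPUT_USD_PER_MILLION = 10.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a session (hypothetical helper, not part of Codex)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Example with made-up numbers: 200,000 input tokens and 20,000 output tokens.
print(f"${estimate_cost_usd(200_000, 20_000):.2f}")
```

Output tokens dominate the bill at these rates, which is one reason a lighter model can pay off for tasks that generate a lot of boilerplate.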
Autonomy modes
Codex provides three levels of autonomy, each combining a sandbox policy with an approval policy. This is the feature that most affects your daily workflow.
Suggest mode (default)
Codex proposes every action and waits for approval. File edits, new files, shell commands: nothing happens until you confirm. This is the right mode when you’re exploring unfamiliar code or working in a repository where mistakes are expensive.
Auto-edit mode
Codex creates and modifies files without asking, but still prompts before running shell commands. This is a good default for active development where you trust the file edits but want to review commands like uv add or uv run pytest before they execute.
Full-auto mode
Codex runs everything without prompting: file edits, shell commands, the full loop. Commands execute inside an OS-level sandbox that blocks network access and restricts filesystem writes to your project directory. Use this for well-defined tasks where you want Codex to iterate without interruption.
Switch modes at launch or mid-session:
# Launch in full-auto mode
codex --full-auto "Add type hints to all functions in src/"
# Switch mid-session
/mode full-auto
Warning
Full-auto mode still runs commands on your machine, just inside a sandbox. Review what Codex produced after it finishes. The sandbox prevents damage outside your project, but Codex can still overwrite files within it.
How the sandbox protects your system
Codex’s sandbox is not a container or a VM. It uses the same kernel-level enforcement that your operating system uses to isolate applications:
- macOS: Seatbelt (the same framework that sandboxes App Store apps)
- Linux/WSL2: Landlock, seccomp, and bubblewrap
- Windows: Codex’s native Windows sandbox, or the Linux sandbox when running inside WSL2
In the default workspace-write mode, the sandbox allows Codex to read your entire filesystem but only write within your project directory. Network access is blocked. The read-only mode goes further and blocks writes as well, and danger-full-access removes all restrictions (use this only when you understand the risk).
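The three policies can be pictured as a simple write-permission rule. The Python sketch below is only a simplified mental model: Codex enforces these rules at the kernel level, not in Python, the `is_write_allowed` helper is hypothetical, and treating read-only as "no writes at all" is an assumption.

```python
from pathlib import Path


def is_write_allowed(target: str, workspace: str, mode: str = "workspace-write") -> bool:
    """Model the sandbox write rule (illustration only, not Codex's own code)."""
    if mode == "danger-full-access":
        return True  # no restrictions at all
    if mode == "read-only":
        return False  # assumption: no writes anywhere
    if mode == "workspace-write":
        # Resolve ".." segments and symlinks before comparing the paths.
        target_path = Path(target).resolve()
        workspace_path = Path(workspace).resolve()
        return target_path == workspace_path or workspace_path in target_path.parents
    raise ValueError(f"unknown sandbox mode: {mode}")
```

Resolving the path first matters: a target such as `my-project/../other/file.txt` starts with the workspace prefix as a string but points outside it.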
You can fine-tune sandbox behavior in ~/.codex/config.toml:
[sandbox]
sandbox_mode = "workspace-write"
Setting up Python with Codex
Python does not need to be installed before starting. Ask Codex to handle it:
Install uv and create a new Python project called my-app
Codex will install uv, which handles three things that trip up new Python developers:
Python versions. Python releases new versions regularly, and different projects may need different versions. uv downloads and manages Python interpreters so you don’t need to install Python from a website or wrestle with system package managers.
Virtual environments. A virtual environment keeps each project’s packages separate. Without one, installing a package for Project A can break Project B. When working with uv projects, commands like uv run and uv sync create and manage virtual environments automatically.
Packages. Python’s strength is its library ecosystem. uv installs packages faster than pip and tracks what your project depends on in a structured file. For a deeper look, see the Complete Guide to uv.
Tip
Activate your virtual environment before launching Codex so it doesn’t spend tokens probing what to activate. If you use uv, run uv sync first so the project environment already exists and uv run can use it without manual activation. Export any environment variables you expect to use ahead of time.
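To confirm that an environment is active before launching Codex, a quick check like the following can help. It relies on the standard-library behavior that `sys.prefix` differs from `sys.base_prefix` inside a virtual environment; the `in_virtualenv` name is only illustrative.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    status = "active" if in_virtualenv() else "not active"
    print(f"Virtual environment {status}: {sys.prefix}")
```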
How to give good instructions
Concrete instructions produce better results than abstract ones.
Instead of “Make this code better,” try “This function is slow when the input list has more than 10,000 items. Find a faster approach.”
Instead of “Build me a web app,” try “Create a FastAPI app with one endpoint, /greet, that takes a name query parameter and returns a JSON greeting.”
- Include the goal alongside the task. “I’m building a CLI that processes CSV files” gives Codex context to make better decisions than “write a function that reads a CSV.”
- Mention constraints. “Use only the standard library” or “this needs to run on Python 3.10” prevents wasted iteration.
- Describe what’s wrong instead of starting over. If Codex produces output that’s close but not right, name the gap: “The output format is correct but the dates should be ISO 8601.”
The build-run-fix cycle
Codex automates the write-run-fix loop. It writes code, executes it, reads the traceback when something fails, edits the code, and reruns. You watch the cycle happen in your terminal.
In suggest mode, you approve each step. In full-auto mode, Codex runs the entire cycle without stopping. The right mode depends on how much you trust the task to converge.
When to let it iterate:
- The errors are getting shorter or changing (progress is happening).
- It’s working through import errors, typos, or missing dependencies.
When to step in:
- It’s trying the same fix repeatedly.
- The approach seems wrong, not just the details. Say “stop” and redirect: “Try a different approach. Use X library instead.”
AGENTS.md: standing instructions
An AGENTS.md file in your project root gives Codex standing instructions it reads at the start of every session. This is the Codex equivalent of Claude Code’s CLAUDE.md.
An example for a Python project:
# Project conventions
- Python >= 3.11
- Use uv for all package management (never pip)
- Use src/ layout (uv init --package)
- Run tests with: `uv run pytest`
- Run linting with: `uv run ruff check .`
- Format code with: `uv run ruff format .`
- Type check with: `uv run ty check src/`
## Code style
- Prefer httpx over requests for HTTP calls
- Use pathlib for file paths, not os.path
Codex builds an instruction chain in layers: first it checks ~/.codex/ for AGENTS.override.md or AGENTS.md (global scope). Then, from the project root down to your current directory, it checks each directory for AGENTS.override.md, then AGENTS.md, then any configured fallback filenames. Later files override earlier ones, so directory-specific instructions take precedence over project-wide ones.
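The lookup order described above can be sketched in Python. This is a reading of the paragraph, not Codex's actual implementation: the function names are hypothetical, and the sketch assumes that at most one file is picked per directory (the first name that exists).

```python
from pathlib import Path
from typing import List, Optional, Sequence

# Checked in this order inside each directory; names come from the text above.
PRIMARY_NAMES = ("AGENTS.override.md", "AGENTS.md")


def first_match(directory: Path, names: Sequence[str]) -> Optional[Path]:
    """Return the first file from `names` that exists in `directory`, if any."""
    for name in names:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


def instruction_chain(
    codex_home: Path,
    project_root: Path,
    cwd: Path,
    fallback_names: Sequence[str] = (),
) -> List[Path]:
    """List instruction files from lowest to highest precedence."""
    chain: List[Path] = []

    # Global scope: only the two primary names are checked in ~/.codex/.
    global_file = first_match(codex_home, PRIMARY_NAMES)
    if global_file is not None:
        chain.append(global_file)

    # Project scope: every directory from the project root down to cwd.
    directories = [project_root]
    for part in cwd.relative_to(project_root).parts:
        directories.append(directories[-1] / part)

    names = tuple(PRIMARY_NAMES) + tuple(fallback_names)
    for directory in directories:
        found = first_match(directory, names)
        if found is not None:
            chain.append(found)  # assumption: at most one file per directory
    return chain
```

Because the list runs from global to most specific, the last entry is the one whose instructions win when two files disagree.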
If you work with both Codex and Claude Code, configure Codex to read CLAUDE.md as a fallback:
project_doc_fallback_filenames = ["CLAUDE.md"]
Keep AGENTS.md under 150 lines and factual. Wrap commands in backticks so Codex can copy-paste them directly. Commit it to version control so every collaborator gets the same behavior.
For a comprehensive set of project scaffolding instructions you can paste into AGENTS.md or into a Codex session, see the Modern Python Project Setup Guide for AI Assistants.
Working with packages
When you need to make HTTP requests, parse dates, build a web server, or analyze data, there’s almost certainly a Python package for it.
Package names are not required upfront. Describe what you need:
I need to make HTTP requests to a REST API. Add the right package.
Codex will choose an appropriate library (like httpx or requests), install it with uv add (after you approve in suggest mode), and use it in the code it writes. The dependency gets recorded in pyproject.toml so it’s tracked with the project.
Testing
Automated tests verify your code does what you expect. They run fast and catch mistakes before users do.
Ask Codex to write tests alongside your code:
Write tests for the parse_csv function in src/my_app/parser.py
Or ask it to add tests after the fact:
Add tests for the existing code in src/my_app/
Codex will create test files, write test cases, and run them with pytest. A passing test suite looks like:
===== 5 passed in 0.12s =====
A failing test shows exactly what went wrong: what the code produced versus what was expected. Codex reads this output and can fix the code or the test depending on which is wrong.
A productive pattern is test-driven development: ask Codex to write a failing test for the behavior you want, then ask it to make the test pass:
Write a test that verifies parse_csv returns an empty list for an empty file. Then make the test pass.
For a guided introduction, see the tutorial Setting up testing with pytest and uv.
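For a sense of what the test-driven prompt above might produce, here is a minimal sketch. The `parse_csv` body is a hypothetical implementation (the guide does not define one), and the test and function sit in one file only to keep the example self-contained; in a real project the test would live under tests/ and import from src/my_app/parser.py.

```python
import csv
import tempfile
from pathlib import Path


def parse_csv(path):
    """Hypothetical parse_csv: return each CSV row as a dict keyed by header."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def test_parse_csv_returns_empty_list_for_empty_file():
    # A temporary directory keeps the test independent of the repository contents.
    with tempfile.TemporaryDirectory() as tmp:
        empty_file = Path(tmp) / "empty.csv"
        empty_file.write_text("", encoding="utf-8")
        assert parse_csv(empty_file) == []
```

pytest collects any function whose name starts with `test_`, so `uv run pytest` would pick this up without extra configuration.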
Linting and formatting
A linter scans code for common mistakes, style violations, and potential bugs. A formatter rewrites code to follow consistent style rules. Ruff handles both.
Add ruff to this project and configure it for linting and formatting
Once configured:
Run ruff on the project and fix any issues
Codex also supports the /review command, which analyzes diffs against a base branch and surfaces findings without modifying your working tree. This is useful for reviewing your own changes before committing.
For a full walkthrough, see the tutorial Set up Ruff for formatting and checking your code or the Complete Guide to Ruff.
Git integration
Codex understands git. It can read diffs, stage files, write commit messages, and create pull requests:
Commit these changes with a descriptive message
Show me what changed since the last commit
In suggest mode, Codex prompts before running any git command that modifies history or pushes to a remote. In full-auto mode, git commands run without prompting but are still scoped by the sandbox (network access is blocked, so pushes will fail unless you switch to danger-full-access).
Scripting with codex exec
The codex exec subcommand runs Codex without the interactive UI. Output goes to stdout, making it useful in scripts, CI pipelines, and automation:
codex exec "List all Python files that import requests but don't handle timeouts"
Pipe longer prompts through stdin:
cat <<'EOF' | codex exec -
Read all test files in tests/.
Check that every public function in src/my_app/ has at least one test.
List any untested functions.
EOF
Where interactive sessions are good for exploration, codex exec turns Codex into a scriptable tool you can embed in existing workflows.
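The same stdin pattern can be driven from a Python script. The sketch below is one possible wrapper, not an official API: `run_codex_exec` is a hypothetical name, it assumes `codex` is on your PATH, and it assumes a failed run reports a non-zero exit status. The `runner` parameter exists only so the function can be exercised without invoking Codex.

```python
import subprocess


def run_codex_exec(prompt: str, runner=subprocess.run) -> str:
    """Send `prompt` to `codex exec -` on stdin and return what it writes to stdout."""
    completed = runner(
        ["codex", "exec", "-"],  # "-" reads the prompt from stdin, as in the heredoc above
        input=prompt,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

A batch script could then call `run_codex_exec("List any untested functions in src/my_app/.")` and write the returned text to a report file.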
Slash commands and navigation
Codex has built-in commands that start with /. The most useful for daily work:
- /help shows all available commands.
- /model switches between available models (GPT-5.4, GPT-5.3-Codex, codex-mini).
- /mode switches autonomy modes mid-session.
- /review runs a code review against a branch or uncommitted changes.
- /theme previews and saves terminal color schemes.
- /clear resets the conversation and starts fresh.
- /exit ends the session.
Other navigation shortcuts:
- Type @ in the composer to fuzzy-search files and insert paths into your prompt.
- Prefix input with ! to run a shell command directly (output becomes part of the conversation context).
- Press Ctrl+G to open your $EDITOR for drafting longer prompts.
Context window and session management
Every model has a finite context window, and longer sessions fill it up. When earlier details get summarized away, quality can degrade.
Watch for these signs:
- Codex re-reads files it already examined.
- It suggests changes you already rejected.
- Responses become less relevant to your current task.
When this happens, start a fresh session with /clear or open a new terminal. Short, focused sessions produce better results than marathon ones.
Codex supports session resumption with codex resume, which restores transcript history, plan context, and prior approvals. This lets you pick up where you left off after a break without repeating context.
Choose a model
Use /model to switch models mid-session, or pass --model at launch.
| Model | Best for |
|---|---|
| GPT-5.4 | Default. Strong reasoning, best code quality. |
| GPT-5.3-Codex | Fast tasks where speed matters more than depth. |
| codex-mini-latest | Cost-sensitive work and simpler tasks. |
For most Python development, GPT-5.4 is the right default. Switch to a lighter model for tasks like renaming variables, formatting, or generating boilerplate.
When Codex gets it wrong
Codex will write broken code, choose the wrong library, overcomplicate a solution, or misunderstand what you asked. This is expected.
Common mistakes to watch for:
- Using pip install instead of uv add. Without an AGENTS.md that says otherwise, Codex may default to pip. Add the instruction to your AGENTS.md.
- Adding unnecessary dependencies. Codex sometimes pulls in a library for something the standard library handles. If you see an unfamiliar package, ask why it’s needed.
- Hallucinating library APIs. Codex may call functions or use parameters that don’t exist in the library version you’re using. If code fails with AttributeError or TypeError on a library call, this is likely the cause.
- Overengineering. Vague prompts lead to complex solutions. Be specific about what you want, and say “simplify this” when the result is more than the task requires.
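When you suspect a hallucinated API, you can check the installed library directly instead of guessing. The sketch below uses the standard-library `inspect` module; `accepts_parameter` is a hypothetical helper, and it only sees explicitly named parameters, so functions that forward `**kwargs` may accept more than it reports. Some C-implemented callables expose no signature, in which case `inspect.signature` raises an error.

```python
import importlib
import inspect


def accepts_parameter(module_name: str, function_name: str, parameter: str) -> bool:
    """Check that module.function exists and names `parameter` in its signature."""
    module = importlib.import_module(module_name)
    function = getattr(module, function_name, None)
    if function is None:
        return False  # calling it would raise AttributeError
    # Only explicitly named parameters are considered here.
    return parameter in inspect.signature(function).parameters


# json.dumps really has an "indent" parameter but no "pretty" parameter.
print(accepts_parameter("json", "dumps", "indent"))
print(accepts_parameter("json", "dumps", "pretty"))
```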
How to redirect:
- “Simplify this.” If the solution is more complex than it needs to be, say so directly.
- “Try without that library.” If it pulled in a dependency you don’t want, tell it to use the standard library or a specific alternative.
- “That’s not what I meant. I want X, not Y.” Restate the requirement. Being specific about the gap between what you got and what you wanted is faster than starting over.
In suggest mode, review the diff before approving changes. In full-auto mode, review the results after Codex finishes. Your judgment is part of the process.
Learn more
Handbook guides:
- Create your first Python project
- How to create a new Python project with Codex
- Modern Python Project Setup Guide for AI Assistants
- Complete Guide to uv
- Complete Guide to Ruff
- Complete Guide to Claude Code
- Enough Git to Supervise Your AI Coding Agent
Official documentation: