Getting Started with Python Using Codex

Codex CLI is OpenAI’s terminal-based coding agent. It reads your repository, edits files, runs commands, and iterates on errors, all from your terminal. Describe what you want in plain English, and Codex decides which files to read, which commands to run, and what code to write.

What separates Codex from other terminal agents is how it sandboxes every command the model runs. Codex isolates execution with OS-level enforcement (Seatbelt on macOS, Landlock and seccomp on Linux), not just application-layer checks. This means you can grant Codex more autonomy without worrying that a bad command will reach beyond your project directory or make network calls you didn’t authorize.

This guide covers how to use Codex for Python development specifically. Each section is self-contained, so skip to whatever is relevant.

Why Codex for Python development

Codex handles the gap between “I want to build something” and “I have a working project.” Tell it what you want, and it creates the project structure, installs packages, writes code, and runs it. When something fails, it reads the traceback and proposes a fix.

Two features shape how Codex fits into Python work:

Graduated autonomy. Python projects involve frequent iteration: install a package, run a script, read the error, adjust. Codex’s three autonomy modes let you match the level of oversight to the task. Use suggest mode when exploring unfamiliar code. Switch to full-auto mode for repetitive tasks like adding tests across a module.

Scripted automation with codex exec. Unlike interactive-only tools, Codex can run non-interactively. Pipe a prompt through codex exec and capture the result in stdout. This makes Codex useful in CI pipelines, pre-commit hooks, and batch scripts.

Installation

Install Codex CLI with npm:

npm install -g @openai/codex

On macOS, Homebrew is also available:

brew install --cask codex

Upgrade to the latest version with:

npm install -g @openai/codex@latest

On Linux, the sandbox requires bubblewrap. Install it before running Codex:

sudo apt install bubblewrap   # Ubuntu/Debian

Important

Change to your project directory before launching. Codex reads the files in your current working directory to build context. Launching from your home directory means it will not know about your project.

cd my-project
codex

To end a session, type /exit or press Ctrl+C.

Authentication and access

The first time you run codex, it walks you through authentication. Two options:

  • ChatGPT account sign-in. If you have a ChatGPT Plus, Pro, Business, Edu, or Enterprise subscription, sign in with your ChatGPT account. Codex generates and configures the API key automatically.
  • OpenAI API key. Set OPENAI_API_KEY as an environment variable or enter it when prompted. This option bills per token at standard API rates.

Credentials are stored in ~/.codex/auth.json by default. You can switch to your OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service) by setting auth_storage = "keyring" in ~/.codex/config.toml.

Costs and usage limits

Codex is bundled with ChatGPT subscriptions, not sold separately.

| Plan      | Monthly cost  | GPT-5.4 local messages (per 5-hour window) |
| Plus      | $20           | 20-100                                     |
| Pro (5x)  | $100          | 100-500                                    |
| Pro (20x) | $200          | 400-2,000                                  |
| Business  | Pay-as-you-go | Credit-based                               |

Usage limits run on a rolling 5-hour window. The window starts with your first message and resets 5 hours later. If you hit the limit, switch to a lighter model with /model or wait for the window to reset.
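
The window arithmetic is simple enough to sketch. This is a minimal illustration that assumes the window behaves exactly as described above; `window_reset` is a hypothetical helper, not part of Codex.

```python
from datetime import datetime, timedelta

# Window length as stated in the text above.
WINDOW = timedelta(hours=5)


def window_reset(first_message_at: datetime) -> datetime:
    """Return when a window that opened at first_message_at resets."""
    return first_message_at + WINDOW


# A first message at 09:30 opens a window that resets at 14:30.
print(window_reset(datetime(2025, 1, 6, 9, 30)))  # 2025-01-06 14:30:00
```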

API key users pay per token. GPT-5.4 costs $1.25 per million input tokens and $10.00 per million output tokens. The lighter codex-mini-latest model costs less for simpler tasks.
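
To get a feel for per-token billing, the rates quoted above can be turned into a rough estimate. This is a sketch only: `estimate_cost` is a hypothetical helper, the token counts are invented for illustration, and real invoices may include factors not described here.

```python
# Rates quoted above for GPT-5.4, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 1.25
OUTPUT_USD_PER_MILLION = 10.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of a request from its token counts."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Illustrative numbers: 200,000 input tokens and 15,000 output tokens.
print(f"${estimate_cost(200_000, 15_000):.2f}")  # $0.40
```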

Autonomy modes

Codex provides three levels of autonomy, each combining a sandbox policy with an approval policy. This is the feature that most affects your daily workflow.

Suggest mode (default)

Codex proposes every action and waits for approval. File edits, new files, shell commands: nothing happens until you confirm. This is the right mode when you’re exploring unfamiliar code or working in a repository where mistakes are expensive.

Auto-edit mode

Codex creates and modifies files without asking, but still prompts before running shell commands. This is a good default for active development where you trust the file edits but want to review commands like uv add or uv run pytest before they execute.

Full-auto mode

Codex runs everything without prompting: file edits, shell commands, the full loop. Commands execute inside an OS-level sandbox that blocks network access and restricts filesystem writes to your project directory. Use this for well-defined tasks where you want Codex to iterate without interruption.

Switch modes at launch or mid-session:

# Launch in full-auto mode
codex --full-auto "Add type hints to all functions in src/"

# Switch mid-session
/mode full-auto

Warning

Full-auto mode still runs commands on your machine, just inside a sandbox. Review what Codex produced after it finishes. The sandbox prevents damage outside your project, but Codex can still overwrite files within it.

How the sandbox protects your system

Codex’s sandbox is not a container or a VM. It uses the same kernel-level enforcement that your operating system uses to isolate applications:

  • macOS: Seatbelt (the same framework that sandboxes App Store apps)
  • Linux/WSL2: Landlock, seccomp, and bubblewrap
  • Windows: Codex’s native Windows sandbox, or the Linux sandbox when running inside WSL2

In the default workspace-write mode, the sandbox lets Codex read your entire filesystem but write only within your project directory, and network access is blocked. The read-only mode blocks writes as well, and danger-full-access removes all restrictions (use it only when you understand the risk).
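
The workspace-write rule can be restated as a small path check. This sketch only illustrates the policy; it is not how Codex enforces it (enforcement happens in the kernel, as described above), and `write_allowed` and the example paths are hypothetical.

```python
from pathlib import Path


def write_allowed(target: Path, workspace: Path) -> bool:
    """Return True if target resolves to a location inside workspace."""
    # resolve() normalizes ".." segments and symlinks before comparing,
    # so a path cannot escape the workspace by climbing upward.
    return target.resolve().is_relative_to(workspace.resolve())


workspace = Path("/tmp/codex-demo/project")  # hypothetical project directory
print(write_allowed(workspace / "src" / "app.py", workspace))  # True
print(write_allowed(workspace / ".." / "notes.txt", workspace))  # False
```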

You can fine-tune sandbox behavior in ~/.codex/config.toml:

[sandbox]
sandbox_mode = "workspace-write"

Setting up Python with Codex

Python does not need to be installed before starting. Ask Codex to handle it:

Install uv and create a new Python project called my-app

Codex will install uv, which handles three things that trip up new Python developers:

Python versions. Python releases new versions regularly, and different projects may need different versions. uv downloads and manages Python interpreters so you don’t need to install Python from a website or wrestle with system package managers.

Virtual environments. A virtual environment keeps each project’s packages separate. Without one, installing a package for Project A can break Project B. When working with uv projects, commands like uv run and uv sync create and manage virtual environments automatically.

Packages. Python’s strength is its library ecosystem. uv installs packages faster than pip and tracks what your project depends on in a structured file. For a deeper look, see the Complete Guide to uv.

Tip

Activate your virtual environment before launching Codex so it doesn't spend tokens probing what to activate. If you use uv, run uv sync first: it creates and populates .venv, and uv run then uses that environment without manual activation. Export any environment variables you expect to use ahead of time.
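
If you are unsure which interpreter a session will pick up, a quick standard-library check shows whether the current one belongs to a virtual environment. `in_virtualenv` is a hypothetical helper name; the comparison of `sys.prefix` and `sys.base_prefix` is standard Python behavior.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

Running it with uv run python inside a uv project should report the project's own environment.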

How to give good instructions

Concrete instructions produce better results than abstract ones.

Instead of “Make this code better,” try “This function is slow when the input list has more than 10,000 items. Find a faster approach.”

Instead of “Build me a web app,” try “Create a FastAPI app with one endpoint, /greet, that takes a name query parameter and returns a JSON greeting.”

  • Include the goal alongside the task. “I’m building a CLI that processes CSV files” gives Codex context to make better decisions than “write a function that reads a CSV.”
  • Mention constraints. “Use only the standard library” or “this needs to run on Python 3.10” prevents wasted iteration.
  • Describe what’s wrong instead of starting over. If Codex produces output that’s close but not right, name the gap: “The output format is correct but the dates should be ISO 8601.”
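
To see why the concrete version of the first prompt helps, here is the kind of change it might lead to. Both functions are hypothetical examples, not output from Codex: the first scans a list on every membership test, the second uses a dict so each lookup takes constant time on average.

```python
def dedupe_slow(items: list[str]) -> list[str]:
    """Remove duplicates, keeping first occurrences. Quadratic in len(items)."""
    result: list[str] = []
    for item in items:
        if item not in result:  # scans the whole result list each time
            result.append(item)
    return result


def dedupe_fast(items: list[str]) -> list[str]:
    """Same result in linear time; requires hashable items."""
    # dict preserves insertion order, so first occurrences stay in place.
    return list(dict.fromkeys(items))


print(dedupe_fast(["b", "a", "b", "c", "a"]))  # ['b', 'a', 'c']
```

Naming the input size in the prompt is what points Codex at the algorithmic problem instead of cosmetic cleanups.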

The build-run-fix cycle

Codex automates the write-run-fix loop. It writes code, executes it, reads the traceback when something fails, edits the code, and reruns. You watch the cycle happen in your terminal.

In suggest mode, you approve each step. In full-auto mode, Codex runs the entire cycle without stopping. The right mode depends on how much you trust the task to converge.

When to let it iterate:

  • The errors are getting shorter or changing (progress is happening).
  • It’s working through import errors, typos, or missing dependencies.

When to step in:

  • It’s trying the same fix repeatedly.
  • The approach seems wrong, not just the details. Say “stop” and redirect: “Try a different approach. Use X library instead.”

AGENTS.md: standing instructions

An AGENTS.md file in your project root gives Codex standing instructions it reads at the start of every session. This is the Codex equivalent of Claude Code’s CLAUDE.md.

An example for a Python project:

AGENTS.md
# Project conventions

- Python >= 3.11
- Use uv for all package management (never pip)
- Use src/ layout (uv init --package)
- Run tests with: `uv run pytest`
- Run linting with: `uv run ruff check .`
- Format code with: `uv run ruff format .`
- Type check with: `uv run ty check src/`

## Code style

- Prefer httpx over requests for HTTP calls
- Use pathlib for file paths, not os.path

Codex builds an instruction chain in layers: first it checks ~/.codex/ for AGENTS.override.md or AGENTS.md (global scope). Then, from the project root down to your current directory, it checks each directory for AGENTS.override.md, then AGENTS.md, then any configured fallback filenames. Later files override earlier ones, so directory-specific instructions take precedence over project-wide ones.
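
The lookup order described above can be sketched in a few lines. This is an illustration of the description, not Codex's implementation: `instruction_chain` and `first_match` are hypothetical names, and the sketch assumes at most one file is taken per directory and that fallback filenames apply only to project directories.

```python
from __future__ import annotations

from pathlib import Path

PRIMARY_NAMES = ("AGENTS.override.md", "AGENTS.md")


def first_match(directory: Path, fallbacks: tuple[str, ...] = ()) -> Path | None:
    """Return the first instruction file found in directory, in lookup order."""
    for name in PRIMARY_NAMES + fallbacks:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


def instruction_chain(
    codex_home: Path,
    project_root: Path,
    cwd: Path,
    fallbacks: tuple[str, ...] = (),
) -> list[Path]:
    """List instruction files from lowest to highest precedence."""
    # Global scope first (assumed to use the two primary names only).
    chain = [first_match(codex_home)]
    # Then every directory from the project root down to cwd.
    # relative_to raises ValueError if cwd is not inside project_root.
    directory = project_root
    chain.append(first_match(directory, fallbacks))
    for part in cwd.relative_to(project_root).parts:
        directory = directory / part
        chain.append(first_match(directory, fallbacks))
    return [path for path in chain if path is not None]
```

Calling `instruction_chain(Path.home() / ".codex", root, Path.cwd(), ("CLAUDE.md",))` with your own project root would list the files in the order they are layered, with the last entry taking precedence.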

If you work with both Codex and Claude Code, configure Codex to read CLAUDE.md as a fallback:

~/.codex/config.toml
project_doc_fallback_filenames = ["CLAUDE.md"]

Keep AGENTS.md under 150 lines and factual. Wrap commands in backticks so Codex can copy-paste them directly. Commit it to version control so every collaborator gets the same behavior.

For a comprehensive set of project scaffolding instructions you can paste into AGENTS.md or into a Codex session, see the Modern Python Project Setup Guide for AI Assistants.

Working with packages

When you need to make HTTP requests, parse dates, build a web server, or analyze data, there’s almost certainly a Python package for it.

Package names are not required upfront. Describe what you need:

I need to make HTTP requests to a REST API. Add the right package.

Codex will choose an appropriate library (like httpx or requests), install it with uv add (after you approve in suggest mode), and use it in the code it writes. The dependency gets recorded in pyproject.toml so it’s tracked with the project.

Testing

Automated tests verify your code does what you expect. They run fast and catch mistakes before users do.

Ask Codex to write tests alongside your code:

Write tests for the parse_csv function in src/my_app/parser.py

Or ask it to add tests after the fact:

Add tests for the existing code in src/my_app/

Codex will create test files, write test cases, and run them with pytest. A passing test suite looks like:

===== 5 passed in 0.12s =====

A failing test shows exactly what went wrong: what the code produced versus what was expected. Codex reads this output and can fix the code or the test depending on which is wrong.

A productive pattern is test-driven development: ask Codex to write a failing test for the behavior you want, then ask it to make the test pass:

Write a test that verifies parse_csv returns an empty list for an empty file. Then make the test pass.
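
The result of that prompt might look something like the sketch below. The signature of `parse_csv` is an assumption (the text above names only the function and its file), and the test relies on pytest's `tmp_path` fixture.

```python
import csv
from pathlib import Path


def parse_csv(path: Path) -> list[dict[str, str]]:
    """Read a CSV file with a header row into a list of row dicts."""
    with path.open(newline="", encoding="utf-8") as handle:
        # DictReader yields nothing for an empty file, so the result is [].
        return list(csv.DictReader(handle))


def test_parse_csv_returns_empty_list_for_empty_file(tmp_path: Path) -> None:
    # tmp_path is a per-test temporary directory provided by pytest.
    empty = tmp_path / "empty.csv"
    empty.write_text("", encoding="utf-8")
    assert parse_csv(empty) == []
```

In a real project the function and the test would live in separate files, for example src/my_app/parser.py and a file under tests/.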

For a guided introduction, see the tutorial Setting up testing with pytest and uv.

Linting and formatting

A linter scans code for common mistakes, style violations, and potential bugs. A formatter rewrites code to follow consistent style rules. Ruff handles both.

Add ruff to this project and configure it for linting and formatting

Once configured:

Run ruff on the project and fix any issues

Codex also supports the /review command, which analyzes diffs against a base branch and surfaces findings without modifying your working tree. This is useful for reviewing your own changes before committing.

For a full walkthrough, see the tutorial Set up Ruff for formatting and checking your code or the Complete Guide to Ruff.

Git integration

Codex understands git. It can read diffs, stage files, write commit messages, and create pull requests:

Commit these changes with a descriptive message
Show me what changed since the last commit

In suggest mode, Codex prompts before running any git command that modifies history or pushes to a remote. In full-auto mode, git commands run without prompting but are still scoped by the sandbox (network access is blocked, so pushes will fail unless you switch to danger-full-access).

Scripting with codex exec

The codex exec subcommand runs Codex without the interactive UI. Output goes to stdout, making it useful in scripts, CI pipelines, and automation:

codex exec "List all Python files that import requests but don't handle timeouts"

Pipe longer prompts through stdin:

cat <<'EOF' | codex exec -
Read all test files in tests/.
Check that every public function in src/my_app/ has at least one test.
List any untested functions.
EOF

Where interactive sessions are good for exploration, codex exec turns Codex into a scriptable tool you can embed in existing workflows.
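
From a Python script, the same stdin form can be driven with `subprocess`. This is a sketch under assumptions: `run_codex_exec` is a hypothetical wrapper, it uses only the `codex exec -` invocation shown above, and it expects the codex binary to be on your PATH.

```python
import subprocess


def run_codex_exec(prompt: str, runner=subprocess.run) -> str:
    """Send prompt to `codex exec -` on stdin and return what it prints to stdout."""
    completed = runner(
        ["codex", "exec", "-"],
        input=prompt,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# Example call (requires Codex to be installed and authenticated):
# report = run_codex_exec("List any untested functions in src/my_app/.")
```

The `runner` parameter exists so the wrapper can be tested with a stand-in instead of invoking Codex.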

Slash commands and navigation

Codex has built-in commands that start with /. The most useful for daily work:

  • /help shows all available commands.
  • /model switches between available models (GPT-5.4, GPT-5.3-Codex, codex-mini-latest).
  • /mode switches autonomy modes mid-session.
  • /review runs a code review against a branch or uncommitted changes.
  • /theme previews and saves terminal color schemes.
  • /clear resets the conversation and starts fresh.
  • /exit ends the session.

Other navigation shortcuts:

  • Type @ in the composer to fuzzy-search files and insert paths into your prompt.
  • Prefix input with ! to run a shell command directly (output becomes part of the conversation context).
  • Press Ctrl+G to open your $EDITOR for drafting longer prompts.

Context window and session management

Every model has a finite context window, and longer sessions fill it up. When earlier details get summarized away, quality can degrade.

Watch for these signs:

  • Codex re-reads files it already examined.
  • It suggests changes you already rejected.
  • Responses become less relevant to your current task.

When this happens, start a fresh session with /clear or open a new terminal. Short, focused sessions produce better results than marathon ones.

Codex supports session resumption with codex resume, which restores transcript history, plan context, and prior approvals. This lets you pick up where you left off after a break without repeating context.

Choose a model

Use /model to switch models mid-session, or pass --model at launch.

| Model             | Best for                                        |
| GPT-5.4           | Default. Strong reasoning, best code quality.   |
| GPT-5.3-Codex     | Fast tasks where speed matters more than depth. |
| codex-mini-latest | Cost-sensitive work and simpler tasks.          |

For most Python development, GPT-5.4 is the right default. Switch to a lighter model for tasks like renaming variables, formatting, or generating boilerplate.

When Codex gets it wrong

Codex will write broken code, choose the wrong library, overcomplicate a solution, or misunderstand what you asked. This is expected.

Common mistakes to watch for:

  • Using pip install instead of uv add. Without an AGENTS.md that says otherwise, Codex may default to pip. Add the instruction to your AGENTS.md.
  • Adding unnecessary dependencies. Codex sometimes pulls in a library for something the standard library handles. If you see an unfamiliar package, ask why it’s needed.
  • Hallucinating library APIs. Codex may call functions or use parameters that don’t exist in the library version you’re using. If code fails with AttributeError or TypeError on a library call, this is likely the cause.
  • Overengineering. Vague prompts lead to complex solutions. Be specific about what you want, and say “simplify this” when the result is more than the task requires.
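
When you suspect a hallucinated API, a quick check against what is actually installed can settle it before another round of fixes. The helpers below are hypothetical and use only the standard library; the examples probe the built-in `json` module so they run anywhere.

```python
from __future__ import annotations

import importlib
from importlib import metadata


def api_exists(module_name: str, attribute: str) -> bool:
    """Return True if module_name imports and exposes the dotted attribute path."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    for part in attribute.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


def installed_version(distribution: str) -> str | None:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


print(api_exists("json", "dumps"))  # True
print(api_exists("json", "dump_fast"))  # False
```

Pasting the installed version into your next prompt gives Codex a concrete constraint to work from.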

How to redirect:

  • “Simplify this.” If the solution is more complex than it needs to be, say so directly.
  • “Try without that library.” If it pulled in a dependency you don’t want, tell it to use the standard library or a specific alternative.
  • “That’s not what I meant. I want X, not Y.” Restate the requirement. Being specific about the gap between what you got and what you wanted is faster than starting over.

In suggest mode, review the diff before approving changes. In full-auto mode, review the results after Codex finishes. Your judgment is part of the process.
