Set up a data science project with uv

This tutorial sets up a data analysis project with uv so that every dependency is pinned, notebooks run in the right environment, and a collaborator can reproduce your setup with a single command.

Prerequisites

Install uv following the installation guide. No separate Python install is required.

Create the project

$ uv init weather_analysis
Initialized project `weather-analysis` at `/path/to/weather_analysis`
$ cd weather_analysis

This creates a project directory with a pyproject.toml, a main.py, and a README.md. The pyproject.toml stores all project metadata and dependencies:

[project]
name = "weather-analysis"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = []

Note

The requires-python value depends on which Python interpreter uv finds on your system. You may see a different version bound.

Add data science dependencies

$ uv add pandas matplotlib
Using CPython 3.13.5 interpreter
Creating virtual environment at: .venv
Resolved 14 packages in 160ms
Prepared 12 packages in 1.26s
Installed 12 packages in 139ms
 + contourpy==1.3.3
 + cycler==0.12.1
 + fonttools==4.62.1
 + kiwisolver==1.5.0
 + matplotlib==3.10.9
 + numpy==2.4.4
 + packaging==26.2
 + pandas==3.0.2
 + pillow==12.2.0
 + pyparsing==3.3.2
 + python-dateutil==2.9.0.post0
 + six==1.17.0

The exact versions and timings will differ on your machine. uv resolves compatible versions, installs them into an isolated virtual environment, and writes a uv.lock file that pins every transitive dependency. The lockfile guarantees that anyone cloning this project gets the same package versions.

Notice the new .venv/ directory and the new uv.lock file. The venv is where pandas, matplotlib, and their dependencies live; you never need to source .venv/bin/activate, because every command in the rest of this tutorial runs through uv run, which uses that venv automatically. The lockfile records the exact version of every package uv resolved.

Note

If uv add prints error: No `pyproject.toml` found in current directory or any parent directory, you ran it outside the project. cd weather_analysis and try again.

The pyproject.toml now lists the direct dependencies:

dependencies = [
    "matplotlib>=3.10.9",
    "pandas>=3.0.2",
]

The exact version bounds will reflect whichever releases are current when you run the command.

Tip

Commit both pyproject.toml and uv.lock to version control. The lockfile pins exact versions of every transitive dependency, so collaborators get identical environments with uv sync.

Tip

Some scientific packages, such as certain CUDA toolkits and domain-specific Fortran libraries, are easier to install through conda channels than PyPI. If your project depends on packages like these, consider pixi or conda instead.

Add Jupyter as a dev dependency

Jupyter is a tool for interactive exploration, not a runtime dependency of the analysis code. Keep it in a dev group:

$ uv add --dev jupyter
Resolved 110 packages in 332ms
Prepared 94 packages in 2.58s
Installed 94 packages in 506ms
 + ...
 + jupyter==1.1.1
 + jupyterlab==4.5.7
 + notebook==7.5.6
 + ...

The full output lists every transitive dependency, around 94 packages total. Jupyter pulls in IPython, jupyter-server, jupyterlab, notebook, and a long list of supporting packages.

This adds Jupyter under the [dependency-groups] section in pyproject.toml:

[dependency-groups]
dev = [
    "jupyter>=1.1.1",
]

Notice the new [dependency-groups] table. Unlike dependencies = [...] in [project], packages here only install when you ask for them. When deploying the analysis as a script or scheduled job, exclude dev dependencies with uv sync --no-dev to get a leaner environment.

Launch Jupyter Lab from the project directory:

uv run jupyter lab

Jupyter Lab starts a local server on http://localhost:8888/lab and tries to open a browser tab. uv ensures the notebook kernel uses the project’s virtual environment, so every import resolves against your pinned dependencies. Press CONTROL-C in the terminal twice to stop the server. See How to Run a Jupyter Notebook with uv for more options.

Set up the project layout

Create directories for data and notebooks:

mkdir -p data notebooks

Create sample data

Add a sample CSV at data/weather.csv:

date,city,temp_high,temp_low,precipitation_mm,humidity_pct
2025-01-01,Portland,8,2,12.5,82
2025-01-02,Portland,7,1,0.0,65
2025-01-03,Portland,9,3,8.3,78
2025-01-04,Portland,6,-1,15.2,88
2025-01-05,Portland,10,4,0.0,60
2025-01-01,Phoenix,18,5,0.0,25
2025-01-02,Phoenix,20,7,0.0,22
2025-01-03,Phoenix,22,8,0.0,20
2025-01-04,Phoenix,19,6,2.1,35
2025-01-05,Phoenix,21,7,0.0,23

Write the analysis script

Replace the contents of main.py with:

import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
from pathlib import Path

matplotlib.use("Agg")


def load_weather_data(path):
    df = pd.read_csv(path, parse_dates=["date"])
    df["temp_range"] = df["temp_high"] - df["temp_low"]
    return df


def summarize_by_city(df):
    return df.groupby("city").agg(
        avg_high=("temp_high", "mean"),
        avg_low=("temp_low", "mean"),
        total_precip=("precipitation_mm", "sum"),
        avg_humidity=("humidity_pct", "mean"),
    ).round(1)


def plot_temperature_comparison(df, output_path):
    fig, ax = plt.subplots(figsize=(8, 4))
    for city, group in df.groupby("city"):
        ax.plot(group["date"], group["temp_high"], marker="o", label=f"{city} high")
        ax.plot(group["date"], group["temp_low"], marker="s", label=f"{city} low",
                linestyle="--", alpha=0.6)
    ax.set_ylabel("Temperature (°C)")
    ax.set_title("Daily Temperatures by City")
    ax.legend()
    fig.tight_layout()
    fig.savefig(output_path, dpi=150)
    print(f"Chart saved to {output_path}")
    plt.close(fig)


def main():
    data_dir = Path(__file__).parent / "data"
    df = load_weather_data(data_dir / "weather.csv")

    summary = summarize_by_city(df)
    print("Weather Summary by City:")
    print(summary)
    print()

    plot_temperature_comparison(df, data_dir / "temperatures.png")


if __name__ == "__main__":
    main()
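The agg call in summarize_by_city uses pandas named aggregation: each keyword argument names an output column and maps it to a (source column, function) pair. A minimal standalone sketch of the same pattern, using Portland's precipitation values from the sample CSV:

```python
import pandas as pd

# Toy frame holding just Portland's precipitation column from the sample data.
df = pd.DataFrame({
    "city": ["Portland"] * 5,
    "precipitation_mm": [12.5, 0.0, 8.3, 15.2, 0.0],
})

# Named aggregation: output column = (input column, aggregation function).
summary = df.groupby("city").agg(total_precip=("precipitation_mm", "sum"))
print(summary.loc["Portland", "total_precip"])  # 36.0
```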

Run the analysis

uv run main.py

Expected output:

Weather Summary by City:
         avg_high  avg_low  total_precip  avg_humidity
city
Phoenix      20.0      6.6           2.1          25.0
Portland      8.0      1.8          36.0          74.6

Chart saved to data/temperatures.png

Explore in a notebook

With Jupyter running (uv run jupyter lab), create a new notebook in the notebooks/ directory. Every cell can import from the same environment:

import pandas as pd
from pathlib import Path

data_dir = Path("..") / "data"
df = pd.read_csv(data_dir / "weather.csv", parse_dates=["date"])
df.describe()

No kernel configuration is needed. Because you launched Jupyter with uv run, the notebook kernel uses the project’s virtual environment automatically.
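If you ever want to confirm which environment a notebook cell is using, a quick check works in any Python: inside a virtual environment, sys.prefix differs from sys.base_prefix, and when launched through uv run it points into the project's .venv.

```python
import sys

# True inside any virtual environment (including the project's .venv).
in_venv = sys.prefix != sys.base_prefix
print(in_venv)
print(sys.prefix)  # ends in .venv when the kernel was launched via `uv run`
```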

Add testing

Data analysis code benefits from tests as much as any other software. Add pytest as a dev dependency:

$ uv add --dev pytest
Resolved 113 packages in 117ms
Prepared 3 packages in 33ms
Installed 3 packages in 16ms
 + iniconfig==2.3.0
 + pluggy==1.6.0
 + pytest==9.0.3

Only three new packages install: pytest reuses packaging and other shared dependencies already pulled in by Jupyter and matplotlib.

Create a test file at test_main.py:

import pandas as pd
from main import load_weather_data, summarize_by_city
from pathlib import Path


def test_load_weather_data():
    path = Path(__file__).parent / "data" / "weather.csv"
    df = load_weather_data(path)
    assert "temp_range" in df.columns
    assert len(df) == 10


def test_summarize_by_city():
    df = pd.DataFrame({
        "city": ["A", "A", "B"],
        "temp_high": [20, 22, 10],
        "temp_low": [10, 12, 5],
        "precipitation_mm": [0.0, 5.0, 10.0],
        "humidity_pct": [50, 60, 70],
    })
    summary = summarize_by_city(df)
    assert summary.loc["A", "avg_high"] == 21.0
    assert summary.loc["B", "total_precip"] == 10.0

Run the tests:

$ uv run pytest
======================== test session starts ========================
platform linux -- Python 3.13.5, pytest-9.0.3, pluggy-1.6.0
rootdir: /path/to/weather_analysis
collected 2 items

test_main.py ..                                               [100%]

========================= 2 passed in 0.42s =========================

If you see collected 0 items, pytest could not find the test file: confirm that test_main.py sits in the project root next to main.py. See the pytest tutorial for more on testing Python projects.
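As coverage grows, pytest's parametrize decorator lets one test body run over a table of cases instead of duplicating assertions. A sketch using a hypothetical temp_range helper that mirrors the temp_range column load_weather_data computes (the expected values come from rows of the sample CSV above):

```python
import pytest

# Hypothetical helper mirroring the temp_range column added in
# load_weather_data: daily high minus daily low.
def temp_range(high, low):
    return high - low

@pytest.mark.parametrize(
    ("high", "low", "expected"),
    [
        (8, 2, 6),    # Portland, 2025-01-01
        (6, -1, 7),   # Portland, 2025-01-04
        (22, 8, 14),  # Phoenix, 2025-01-03
    ],
)
def test_temp_range(high, low, expected):
    assert temp_range(high, low) == expected
```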

Pin the Python version

Data science teams often need a consistent Python version across all contributors. Pin the version for this project:

$ uv python pin 3.13
Pinned `.python-version` to `3.13`

If 3.13 was not already installed, you will see a download line first:

Downloading cpython-3.13.5-linux-x86_64-gnu (33.8MiB)
Updated `.python-version` from `3.12` -> `3.13`

Notice the new .python-version file in the project root. When anyone runs uv sync or uv run in this directory, uv installs and uses Python 3.13, even if their system has a different version.

Note

If uv python pin reports that the version is incompatible with the project's `requires-python`, the lower bound in pyproject.toml is higher than the version you tried to pin. Edit pyproject.toml to lower requires-python (for example, to >=3.13) first. See How to Change the Python Version of a uv Project for the full workflow.

After pinning, uv sync rebuilds the venv against the new interpreter:

$ uv sync
Using CPython 3.13.5
Removed virtual environment at: .venv
Creating virtual environment at: .venv
Resolved 113 packages in 1ms
Prepared 13 packages in 1.50s
Installed 109 packages in 735ms

If your existing venv was on a different Python version, you will see a Removed virtual environment at: .venv line first: uv tears down the old venv and rebuilds against 3.13. Either way, your dependencies and uv.lock are untouched; only the interpreter changes.

Final project structure

    • .python-version
    • pyproject.toml
    • uv.lock
    • main.py
    • test_main.py
    • README.md
    • data/
      • weather.csv
      • temperatures.png
    • notebooks/

Reproduce the environment

Anyone cloning this project can recreate the exact environment with one command:

$ uv sync
Using CPython 3.13.5
Creating virtual environment at: .venv
Resolved 113 packages in 1ms
Installed 109 packages in 735ms

This reads the lockfile and installs the pinned versions of every dependency. No manual version matching, no stale requirements files.
