
Create your first Python project with uv

This Python uv tutorial walks you through building a text analysis tool that counts words, measures sentence length, and reports word frequency. No prior Python experience required; uv handles the Python install, project scaffolding, and dependency management for you.

Prerequisites

Before we begin, make sure you have uv installed on your system. You can install it following the directions from the uv documentation.

Tip

You don't need Python installed on your computer to follow this tutorial; uv downloads and manages an interpreter for you.

Creating a New Project

Let’s create a project called “text_analyzer” that will analyze text statistics like word frequency, sentence length, and readability scores:

$ uv init text_analyzer
Initialized project `text-analyzer` at `/path/to/text_analyzer`
$ cd text_analyzer

uv prints a single confirmation line. Note that it normalizes the project name to text-analyzer (hyphen instead of underscore) in the package metadata. If uv reports that the project name is not valid, the name contains characters that aren't allowed in a Python package name; if the target directory already contains a project, uv refuses to initialize it again. Pick a fresh name or remove the existing directory first.

Notice the new files uv created in the project: pyproject.toml, main.py, README.md, a .python-version file pinning the interpreter, and a .gitignore. The directory is also a git repository ready for its first commit.

Look at the generated pyproject.toml, which stores the project configuration:

[project]
name = "text-analyzer"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = []

Adding Dependencies

If you see error: No pyproject.toml found in current directory or any parent directory, you ran the next commands outside the project. cd into text_analyzer first.

Our text analyzer will need some packages for data processing and analysis. Add pandas first:

$ uv add pandas
Using CPython 3.14.4
Creating virtual environment at: .venv
Resolved 6 packages in 221ms
Downloading numpy (5.0MiB)
Downloading pandas (9.5MiB)
 Downloaded numpy
 Downloaded pandas
Prepared 2 packages in 497ms
Installed 4 packages in 25ms
 + numpy==2.4.4
 + pandas==3.0.2
 + python-dateutil==2.9.0.post0
 + six==1.17.0

The exact Python version, package versions, and timings will differ on your machine. Notice the new .venv/ directory and uv.lock file in the project. The virtual environment holds the project’s Python interpreter and installed packages; the lockfile pins exact versions so anyone else can reproduce the environment with one command.

Add nltk for natural language processing:

$ uv add nltk
Resolved 12 packages in 249ms
Downloading nltk (1.5MiB)
 Downloaded nltk
Prepared 4 packages in 194ms
Installed 5 packages in 10ms
 + click==8.3.3
 + joblib==1.5.3
 + nltk==3.9.4
 + regex==2026.4.4
 + tqdm==4.67.3

Each uv add updates pyproject.toml, refreshes uv.lock, and installs the package into .venv/. The pyproject.toml now includes these dependencies:

[project]
name = "text-analyzer"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "nltk>=3.9.4",
    "pandas>=3.0.2",
]

Creating the Project Structure

Let’s create a directory for our sample data:

mkdir data

Let’s create a sample text file to analyze. Create data/sample.txt with this content:

The quick brown fox jumps over the lazy dog. This pangram contains every letter of the English alphabet at least once. Pangrams are useful for testing fonts, keyboards, and printers. The five boxing wizards jump quickly! How vexingly quick daft zebras jump.
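
If you prefer the shell, a heredoc creates the directory and file in one step from the project root (macOS/Linux; on Windows, just paste the text into data/sample.txt in your editor):

```shell
mkdir -p data
cat > data/sample.txt <<'EOF'
The quick brown fox jumps over the lazy dog. This pangram contains every letter of the English alphabet at least once. Pangrams are useful for testing fonts, keyboards, and printers. The five boxing wizards jump quickly! How vexingly quick daft zebras jump.
EOF
```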

Now let’s replace the contents of main.py with our analysis code:

import pandas as pd
import nltk
from collections import Counter
from pathlib import Path

class TextAnalyzer:
    """A class for analyzing text statistics."""

    def __init__(self):
        # Fetch the NLTK tokenizer data on first use; later runs find it
        # cached under ~/nltk_data and skip the download entirely.
        try:
            nltk.data.find('tokenizers/punkt_tab')
        except LookupError:
            nltk.download('punkt_tab')

    def read_text(self, file_path):
        """Read text from a file."""
        return Path(file_path).read_text()

    def analyze_text(self, text):
        """Analyze text and return statistics."""
        # Tokenize text into sentences and words
        sentences = nltk.sent_tokenize(text)
        words = nltk.word_tokenize(text.lower())

        # Calculate basic statistics
        word_count = len(words)
        sentence_count = len(sentences)
        avg_sentence_length = word_count / sentence_count

        # Calculate word frequencies
        word_freq = Counter(words)
        most_common = word_freq.most_common(5)

        # Create statistics dictionary
        stats = {
            "Total Words": word_count,
            "Total Sentences": sentence_count,
            "Average Sentence Length": round(avg_sentence_length, 2),
            "Unique Words": len(word_freq),
        }

        # Create word frequency DataFrame
        freq_df = pd.DataFrame(most_common, columns=['Word', 'Frequency'])

        return stats, freq_df

def main():
    # Initialize analyzer
    analyzer = TextAnalyzer()

    # Read and analyze sample text
    file_path = Path(__file__).parent / "data" / "sample.txt"
    text = analyzer.read_text(file_path)

    # Get analysis results
    stats, word_freq = analyzer.analyze_text(text)

    # Print results
    print("\nText Statistics:")
    for metric, value in stats.items():
        print(f"{metric}: {value}")

    print("\nMost Common Words:")
    print(word_freq.to_string(index=False))

if __name__ == "__main__":
    main()

The TextAnalyzer class reads text from a file, tokenizes it into sentences and words using NLTK, then computes statistics like word count and average sentence length. It also uses Counter to find the most common words and returns the results as both a dictionary and a pandas DataFrame.
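
To see the core idea without NLTK, here is a dependency-free sketch of the same statistics using only the standard library. It's a rough approximation: it splits sentences on terminal punctuation and counts only alphabetic words, so its word total differs from the NLTK version above, which also counts punctuation tokens:

```python
import re
from collections import Counter

text = (
    "The quick brown fox jumps over the lazy dog. This pangram contains "
    "every letter of the English alphabet at least once. Pangrams are useful "
    "for testing fonts, keyboards, and printers. The five boxing wizards "
    "jump quickly! How vexingly quick daft zebras jump."
)

# Naive sentence split on ., !, ?; drop the empty trailing chunk
sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
# Alphabetic words only, lowercased
words = re.findall(r"[a-z]+", text.lower())

print("Sentences:", len(sentences))  # Sentences: 5
print("Words:", len(words))
print("Top words:", Counter(words).most_common(3))
```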

Running the Project

To run your script:

uv run main.py

uv run resolves any pending changes in pyproject.toml, makes sure .venv/ is up to date, and then runs the command against the project’s interpreter. If you bypass uv run and call python main.py directly, you’ll likely see ModuleNotFoundError: No module named 'pandas' because your system Python isn’t using the project’s venv.
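
If you're ever unsure which interpreter a script is running under, a quick standard-library check tells you whether you're inside a virtual environment (in a venv, sys.prefix points at the environment while sys.base_prefix still points at the base installation):

```python
import sys

def in_virtualenv() -> bool:
    # In a venv, sys.prefix differs from sys.base_prefix;
    # under the system interpreter the two are equal.
    return sys.prefix != sys.base_prefix

print("Interpreter:", sys.executable)
print("In a virtual environment:", in_virtualenv())
```

Save it under any name you like (check_env.py here is just a placeholder): uv run check_env.py reports True for the project's .venv, while running it with a bare system interpreter typically reports False.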

The first run also downloads the punkt_tab tokenizer NLTK needs. Expect output like this:

[nltk_data] Downloading package punkt_tab to /Users/you/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.

Text Statistics:
Total Words: 49
Total Sentences: 5
Average Sentence Length: 9.8
Unique Words: 40

Most Common Words:
 Word  Frequency
  the          4
    .          4
quick          2
    ,          2
 jump          2

The [nltk_data] lines disappear on subsequent runs because the tokenizer is cached under ~/nltk_data/.

Adding Development Dependencies

Let’s add some development tools for testing and code quality. Add pytest first:

$ uv add --dev pytest
Resolved 17 packages in 102ms
Installed 5 packages in 12ms
 + iniconfig==2.3.0
 + packaging==26.2
 + pluggy==1.6.0
 + pygments==2.20.0
 + pytest==9.0.3

Then add Ruff:

$ uv add --dev ruff
Resolved 18 packages in 178ms
Installed 1 package in 3ms
 + ruff==0.15.12

Notice that --dev lands these in a separate [dependency-groups] table instead of the main dependencies list. They get installed in .venv/ like any other package, but uv sync --no-dev will skip them, which matters when you build a slim Docker image or deploy to production.

Both tools now sit in the dev dependency group in pyproject.toml:

[dependency-groups]
dev = [
    "pytest>=9.0.3",
    "ruff>=0.15.12",
]

Using Development Tools

Use the dev tools through uv run so they pick up the project’s venv automatically. If you call ruff directly without uv run, your shell either reports command not found: ruff or runs a different Ruff installed elsewhere on your machine.

Format the code with Ruff:

$ uv run ruff format .
1 file reformatted

Then run the linter with automatic fixes:

$ uv run ruff check --fix .
Found 1 error (1 fixed, 0 remaining).

The remaining error count drops to zero and Ruff exits cleanly. Open main.py and notice that the import block has been reordered (standard-library imports first, third-party imports next, each group sorted alphabetically). That’s Ruff’s I001 rule auto-fixing the import order.
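
For reference, the layout the rule enforces looks like the sketch below, which uses standard-library stand-ins so it runs without the project's dependencies; in main.py the second group would hold import nltk and import pandas as pd:

```python
# Import layout enforced by Ruff's isort rules (I001): standard library
# first, then third-party, straight imports before from-imports within a
# group, everything alphabetized, with a blank line between groups.

# -- standard library --
import json
import re
from collections import Counter
from pathlib import Path

# -- third-party would follow here, e.g.:
# import nltk
# import pandas as pd

print(sorted(["pandas", "nltk"]))  # ['nltk', 'pandas'] -- alphabetical within a group
```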
