Create your first Python project
This tutorial walks you through creating your first Python project, even if you have zero Python experience. You don’t even need Python installed on your computer. We’ll build a text analysis tool that processes sample text data.
Prerequisites
Before we begin, make sure you have uv installed on your system. You can install it by following the directions in the uv documentation.
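For example, on macOS and Linux you can use the standalone installer (check the uv documentation for Windows and other installation methods):
# Install uv (macOS and Linux)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Verify the installation
uv --version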
Tip
You do not need Python installed on your computer to follow this tutorial; uv can download and manage Python versions for you.
Creating a New Project
Let’s create a project called “text_analyzer” that will analyze text statistics like word frequency, sentence length, and readability scores:
uv init text_analyzer
cd text_analyzer
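Inside text_analyzer you’ll find a handful of starter files. The exact set can vary slightly between uv versions, but it typically includes:
.python-version
README.md
hello.py
pyproject.toml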
This command creates a new directory with some starter files. Let’s look at the generated pyproject.toml, which stores our project configuration:
[project]
name = "text-analyzer"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = []
Adding Dependencies
Our text analyzer will need some packages for data processing and analysis. Let’s add them using uv:
# Add pandas for data analysis and statistics
uv add pandas
# Add nltk for natural language processing
uv add nltk
Each time we run uv add, it updates our project configuration, creates or updates the lockfile (uv.lock), and installs the package in our project’s virtual environment. Our pyproject.toml now includes these dependencies:
[project]
name = "text-analyzer"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
"nltk>=3.9.1",
"pandas>=2.2.3",
]
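The exact resolved versions, including transitive dependencies such as numpy (which pandas depends on), are pinned in uv.lock. You can inspect the full resolved tree at any time:
# Show the resolved dependency tree
uv tree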
Creating the Project Structure
Let’s create a proper project structure with our source code and sample data:
# Remove the default hello.py
rm hello.py
# Create our project structure
mkdir -p src/text_analyzer/data
touch src/text_analyzer/__init__.py
touch src/text_analyzer/main.py
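At this point the project layout should look roughly like this (uv.lock and the .venv virtual environment appeared when we ran uv add):
text_analyzer/
├── .python-version
├── .venv/
├── README.md
├── pyproject.toml
├── uv.lock
└── src/
    └── text_analyzer/
        ├── __init__.py
        ├── main.py
        └── data/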
Let’s create a sample text file to analyze. Create src/text_analyzer/data/sample.txt with this content:
The quick brown fox jumps over the lazy dog. This pangram contains every letter of the English alphabet at least once. Pangrams are useful for testing fonts, keyboards, and printers. The five boxing wizards jump quickly! How vexingly quick daft zebras jump.
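If you prefer to create the file from the shell, a heredoc produces the same content:
cat > src/text_analyzer/data/sample.txt <<'EOF'
The quick brown fox jumps over the lazy dog. This pangram contains every letter of the English alphabet at least once. Pangrams are useful for testing fonts, keyboards, and printers. The five boxing wizards jump quickly! How vexingly quick daft zebras jump.
EOF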
Now let’s create our analysis code in src/text_analyzer/main.py:
import pandas as pd
import nltk
from collections import Counter
from pathlib import Path


class TextAnalyzer:
    """A class for analyzing text statistics."""

    def __init__(self):
        # Download the required NLTK tokenizer data (only needed once;
        # newer NLTK releases use 'punkt_tab', older ones 'punkt')
        nltk.download('punkt', quiet=True)
        nltk.download('punkt_tab', quiet=True)

    def read_text(self, file_path):
        """Read text from a file."""
        return Path(file_path).read_text()

    def analyze_text(self, text):
        """Analyze text and return statistics."""
        # Tokenize text into sentences and words
        sentences = nltk.sent_tokenize(text)
        words = nltk.word_tokenize(text.lower())

        # Calculate basic statistics
        word_count = len(words)
        sentence_count = len(sentences)
        avg_sentence_length = word_count / sentence_count

        # Calculate word frequencies
        word_freq = Counter(words)
        most_common = word_freq.most_common(5)

        # Create statistics dictionary
        stats = {
            "Total Words": word_count,
            "Total Sentences": sentence_count,
            "Average Sentence Length": round(avg_sentence_length, 2),
            "Unique Words": len(word_freq),
        }

        # Create word frequency DataFrame
        freq_df = pd.DataFrame(most_common, columns=['Word', 'Frequency'])

        return stats, freq_df


def main():
    # Initialize analyzer
    analyzer = TextAnalyzer()

    # Read and analyze sample text
    file_path = Path(__file__).parent / "data" / "sample.txt"
    text = analyzer.read_text(file_path)

    # Get analysis results
    stats, word_freq = analyzer.analyze_text(text)

    # Print results
    print("\nText Statistics:")
    for metric, value in stats.items():
        print(f"{metric}: {value}")

    print("\nMost Common Words:")
    print(word_freq.to_string(index=False))


if __name__ == "__main__":
    main()
This code creates a TextAnalyzer class that:
- Reads text from a file
- Calculates basic statistics like word count and sentence length
- Finds the most common words
- Returns the results in both dictionary and DataFrame formats
Running the Project
To run your script:
uv run src/text_analyzer/main.py
You should see output showing statistics about our sample text and the most common words, for example:
Text Statistics:
Total Words: 49
Total Sentences: 5
Average Sentence Length: 9.8
Unique Words: 40
Most Common Words:
Word Frequency
the 4
. 4
quick 2
, 2
jump 2
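Note that nltk.word_tokenize treats punctuation marks as tokens, which is why '.' and ',' appear among the most common words. If you’d like to analyze your own files rather than the bundled sample, one option is to extend main() to accept an optional command-line argument. Here is a sketch; the sys.argv handling is our own addition, not part of the script above:
import sys

def main():
    analyzer = TextAnalyzer()

    # Use a path from the command line if one was given,
    # otherwise fall back to the bundled sample text
    default_path = Path(__file__).parent / "data" / "sample.txt"
    file_path = Path(sys.argv[1]) if len(sys.argv) > 1 else default_path

    text = analyzer.read_text(file_path)
    stats, word_freq = analyzer.analyze_text(text)

    print("\nText Statistics:")
    for metric, value in stats.items():
        print(f"{metric}: {value}")

    print("\nMost Common Words:")
    print(word_freq.to_string(index=False))
With that change you could run, for example, uv run src/text_analyzer/main.py notes.txt.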
Adding Development Dependencies
Let’s add some development tools for testing and code quality:
# Add pytest for testing
uv add --dev pytest
# Add ruff for linting and formatting
uv add --dev ruff
These will be added to a development dependencies group in pyproject.toml:
[tool.uv]
dev-dependencies = [
"pytest>=8.3.4",
"ruff>=0.8.4",
]
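With pytest available, we can write a first test. Create a tests directory containing tests/test_analyzer.py; the file name is our own choice, and because the project isn’t installed as a package, the sys.path line below puts src on the import path manually:
import sys
from pathlib import Path

# Make the src/ layout importable without installing the project
sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "src"))

from text_analyzer.main import TextAnalyzer


def test_analyze_text_counts_words_and_sentences():
    analyzer = TextAnalyzer()
    stats, freq_df = analyzer.analyze_text("Hello world. Hello again.")

    assert stats["Total Sentences"] == 2
    # word_tokenize counts the two periods as tokens too
    assert stats["Total Words"] == 6
    # "hello" appears twice and tops the frequency table
    assert freq_df.iloc[0]["Word"] == "hello"
    assert freq_df.iloc[0]["Frequency"] == 2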
Using Development Tools
Now we can use our development tools through uv:
# Format code with ruff
uv run ruff format .
# Run linting with automated fixes
uv run ruff check --fix .
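We can run the test suite the same way:
# Run tests with pytest
uv run pytest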