# Build a Python library with a C extension

Python handles strings well. C handles them faster. In this tutorial, you'll write three string functions in C, compile them into a shared library, and call them from Python using [cffi](https://cffi.readthedocs.io/). The result is a normal Python package that anyone can import.

## Prerequisites

You need two things installed:

- [uv](https://pydevtools.com/handbook/reference/uv.md) ([installation guide](https://docs.astral.sh/uv/getting-started/installation/))
- A C compiler

Most macOS and Linux systems already have a C compiler. Check by running:

```bash
cc --version
```

A working install prints a version banner like `Apple clang version 17.0.0` (macOS) or `gcc (Ubuntu 13.x) 13.x ...` (Linux). If you see `cc: command not found` or `xcrun: error: invalid active developer path`, you need to install one:

- macOS: Run `xcode-select --install`
- Ubuntu/Debian: Run `sudo apt install build-essential`
- Fedora: Run `sudo dnf install gcc`
- Windows: Install the "Desktop development with C++" workload from [Visual Studio Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/). This provides the `cl` compiler. After installation, run commands from the "Developer Command Prompt for VS" or "x64 Native Tools Command Prompt."

## Creating the Project

Create a new library project with uv:

```console
$ uv init string_utils --lib
Initialized project `string-utils` at `/path/to/string_utils`
$ cd string_utils
```

If you see `error: Project is already initialized in ...`, you ran `uv init` inside an existing uv project. Move up a directory first.

The `--lib` flag tells uv to scaffold a `src/string_utils/` package layout instead of a flat `main.py` script. That layout is what makes `import string_utils` work later.

This generates a `src/string_utils/` directory with an `__init__.py` file and a [pyproject.toml](https://pydevtools.com/handbook/reference/pyproject.toml.md) that looks like this:

```toml {filename="pyproject.toml"}
[project]
name = "string-utils"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"
dependencies = []
```

Add cffi as a dependency:

```console
$ uv add cffi
Using CPython 3.14.4
Creating virtual environment at: .venv
Resolved 3 packages in 163ms
Installed 3 packages in 3ms
 + cffi==2.0.0
 + pycparser==3.0
 + string-utils==0.1.0 (from file:///path/to/string_utils)
```

If you see an error about no `pyproject.toml` being found in the current directory or any parent, you ran `uv add` outside the `string_utils` directory. Run `cd string_utils` and try again.

Notice the new `.venv/` directory and `uv.lock` file. The `.venv/` is the project's isolated Python environment with cffi installed; `uv.lock` pins the exact versions of cffi and its transitive dependency `pycparser` so future installs are reproducible.

## Writing the C Code

Create a file called `string_ops.c` inside `src/string_utils/`. This file contains three functions: one counts words, one reverses a string, and one checks for palindromes.

```c {filename="src/string_utils/string_ops.c"}
#include <string.h>
#include <ctype.h>
#include <stdlib.h>

int word_count(const char *text) {
    int count = 0;
    int in_word = 0;

    while (*text) {
        if (isspace((unsigned char)*text)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
        text++;
    }
    return count;
}

char *reverse(const char *text) {
    int len = strlen(text);
    char *result = (char *)malloc(len + 1);
    if (!result) return NULL;

    for (int i = 0; i < len; i++) {
        result[i] = text[len - 1 - i];
    }
    result[len] = '\0';
    return result;
}

int is_palindrome(const char *text) {
    int len = strlen(text);
    /* Build a lowercase, letters-only copy */
    char *cleaned = (char *)malloc(len + 1);
    if (!cleaned) return 0;

    int j = 0;
    for (int i = 0; i < len; i++) {
        if (isalpha((unsigned char)text[i])) {
            cleaned[j++] = tolower((unsigned char)text[i]);
        }
    }
    cleaned[j] = '\0';

    int left = 0;
    int right = j - 1;
    while (left < right) {
        if (cleaned[left] != cleaned[right]) {
            free(cleaned);
            return 0;
        }
        left++;
        right--;
    }
    free(cleaned);
    return 1;
}
```

Each function takes a C string (`const char *`) as input. `word_count` walks the string character by character, counting transitions from whitespace to non-whitespace. `reverse` allocates a new string and fills it back-to-front. `is_palindrome` strips non-letter characters, lowercases the remainder, and checks whether it reads the same forwards and backwards.
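When checking the C code, it can help to keep a pure-Python model of the same logic to compare against. The sketch below mirrors the three functions on bytes, the way the C code sees its input. The `py_`-prefixed names and `C_WHITESPACE` are hypothetical helpers for illustration, not part of the package.

```python
"""Pure-Python models of the three C functions, operating on bytes."""

# The six bytes that C's isspace() accepts in the default "C" locale.
C_WHITESPACE = b" \t\n\v\f\r"


def py_word_count(data: bytes) -> int:
    """Count whitespace-to-non-whitespace transitions, like word_count()."""
    count = 0
    in_word = False
    for byte in data:
        if byte in C_WHITESPACE:
            in_word = False
        elif not in_word:
            in_word = True
            count += 1
    return count


def py_reverse(data: bytes) -> bytes:
    """Reverse byte by byte, like reverse()."""
    return data[::-1]


def py_is_palindrome(data: bytes) -> bool:
    """Keep ASCII letters only, lowercase them, and compare with the reverse."""
    cleaned = bytes(b for b in data if bytes([b]).isalpha()).lower()
    return cleaned == cleaned[::-1]


print(py_word_count(b"  spaced   out  "))               # 2
print(py_reverse(b"Python"))                            # b'nohtyP'
print(py_is_palindrome(b"A man a plan a canal Panama"))  # True
```

Once the C library is built, the same inputs can be fed to both versions and the results compared.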

> [!NOTE]
> The `reverse` function returns a pointer to newly allocated memory. The caller (our Python wrapper) is responsible for freeing it. cffi makes this straightforward.

## Compiling the C Code

Before Python can call these functions, the C source needs to be compiled into a shared library. Create a build script at the project root:

```python {filename="build_c.py"}
"""Compile string_ops.c into a shared library."""

import os
import subprocess
import sys


def main():
    src_dir = os.path.join(os.path.dirname(__file__), "src", "string_utils")
    c_file = os.path.join(src_dir, "string_ops.c")

    if sys.platform == "win32":
        lib_name = "string_ops.dll"
        cmd = ["cl", "/LD", "/O2", c_file,
               f"/Fe{os.path.join(src_dir, lib_name)}"]
    elif sys.platform == "darwin":
        lib_name = "string_ops.dylib"
        cmd = ["cc", "-shared", "-fPIC", "-O2", "-std=c99",
               "-o", os.path.join(src_dir, lib_name), c_file]
    else:
        lib_name = "string_ops.so"
        cmd = ["cc", "-shared", "-fPIC", "-O2", "-std=c99",
               "-o", os.path.join(src_dir, lib_name), c_file]

    print(f"Compiling {c_file} -> {lib_name}")
    subprocess.check_call(cmd)
    print("Done.")


if __name__ == "__main__":
    main()
```

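The `-shared` flag tells the compiler to produce a shared library instead of an executable. The `-fPIC` flag generates position-independent code, which shared libraries require on Linux and macOS. On Windows, `/LD` is the `cl` equivalent of `-shared`. Note that `cl` only exports functions that are marked `__declspec(dllexport)` or listed through `/EXPORT` or a `.def` file, so the C source above needs one of those before the DLL exposes the three functions.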

Run the build script:

```bash
uv run python build_c.py
```

You should see output like:

```console
$ uv run python build_c.py
Compiling src/string_utils/string_ops.c -> string_ops.so
Done.
```

If the compiler complains with `fatal error: 'stdlib.h' file not found` or `'string.h' not found`, your toolchain headers aren't installed. On macOS, run `xcode-select --install`; on Linux, install the `build-essential` (Debian/Ubuntu) or `gcc` (Fedora) package.
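A related failure is the compiler being missing from `PATH` altogether, in which case `subprocess.check_call` raises `FileNotFoundError`. One way to give a clearer message is a small preflight check, sketched below. `find_compiler` and `require_compiler` are hypothetical helper names, and the `which` parameter exists only so the lookup can be swapped out in tests.

```python
import shutil
import sys
from typing import Callable, Optional


def find_compiler(
    platform: str = sys.platform,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the compiler build_c.py would run, or None if absent."""
    name = "cl" if platform == "win32" else "cc"
    return which(name)


def require_compiler() -> str:
    """Exit with a readable hint instead of a subprocess traceback."""
    path = find_compiler()
    if path is None:
        raise SystemExit("No C compiler on PATH; see the Prerequisites section.")
    return path
```

Calling `require_compiler()` at the top of `main()` in `build_c.py` would stop the build early with that hint.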

Notice the new `string_ops.so` (or `string_ops.dylib` on macOS, `string_ops.dll` on Windows) sitting next to `string_ops.c` in `src/string_utils/`. That binary is what cffi will load at runtime.
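The build script chooses the file name per platform, and any code that loads the library has to make the same choice. A minimal sketch of keeping that mapping in one place follows. `shared_lib_name` is a hypothetical helper, and treating every platform other than Windows and macOS as `.so` is an assumption carried over from the build script.

```python
import sys

# Platform-specific suffixes; anything not listed falls back to ".so".
_SUFFIXES = {"win32": ".dll", "darwin": ".dylib"}


def shared_lib_name(stem: str = "string_ops", platform: str = sys.platform) -> str:
    """Return the shared library file name for the given platform."""
    return stem + _SUFFIXES.get(platform, ".so")


print(shared_lib_name(platform="linux"))   # string_ops.so
print(shared_lib_name(platform="darwin"))  # string_ops.dylib
print(shared_lib_name(platform="win32"))   # string_ops.dll
```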

## Writing the Python Wrapper

Replace the contents of `src/string_utils/__init__.py` with a wrapper that uses cffi to load the shared library and expose the C functions as Python functions:

```python {filename="src/string_utils/__init__.py"}
"""String utilities implemented in C, loaded via cffi."""

import os
import sys

import cffi

ffi = cffi.FFI()

# Declare the C function signatures
ffi.cdef("""
    int word_count(const char *text);
    char *reverse(const char *text);
    int is_palindrome(const char *text);
    void free(void *ptr);
""")

# Load the compiled shared library
_dir = os.path.dirname(__file__)
if sys.platform == "win32":
    _lib_path = os.path.join(_dir, "string_ops.dll")
elif sys.platform == "darwin":
    _lib_path = os.path.join(_dir, "string_ops.dylib")
else:
    _lib_path = os.path.join(_dir, "string_ops.so")

_lib = ffi.dlopen(_lib_path)


def word_count(text: str) -> int:
    """Count the number of words in a string."""
    return _lib.word_count(text.encode("utf-8"))


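def reverse(text: str) -> str:
    """Reverse a string."""
    result_ptr = _lib.reverse(text.encode("utf-8"))
    if result_ptr == ffi.NULL:  # malloc failed inside the C function
        raise MemoryError("reverse() could not allocate memory")
    try:
        return ffi.string(result_ptr).decode("utf-8")
    finally:
        # Free the C-allocated memory, even if decoding raises
        _lib.free(result_ptr)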


def is_palindrome(text: str) -> bool:
    """Check whether a string is a palindrome (ignoring case and non-letters)."""
    return bool(_lib.is_palindrome(text.encode("utf-8")))
```

This wrapper does three things:

1. `ffi.cdef()` tells cffi what C functions exist and what their signatures look like
2. `ffi.dlopen()` loads the compiled shared library into memory
3. Each Python function encodes the input string to bytes, calls the C function, and converts the result back to a Python type

The `reverse` wrapper also frees the memory that the C function allocated, preventing a memory leak.

## Testing the Library

Try the library interactively:

```bash
uv run python -c "
import string_utils
print(string_utils.word_count('hello world'))
print(string_utils.reverse('Python'))
print(string_utils.is_palindrome('racecar'))
"
```

Expected output:

```
2
nohtyP
True
```

If you see `OSError: cannot load library '.../string_ops.so'`, the shared library most likely doesn't exist yet because the build step was skipped. Run `uv run python build_c.py` first.
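If you'd like the package itself to point at the fix, one option is to check for the file before loading it. This is only a sketch: `check_library` is a hypothetical helper, and the message text is an assumption.

```python
import os


def check_library(lib_path: str) -> str:
    """Return lib_path unchanged, or raise an error that names the build command."""
    if not os.path.isfile(lib_path):
        raise FileNotFoundError(
            f"{lib_path} not found. Build it first with: uv run python build_c.py"
        )
    return lib_path
```

In the wrapper, this would wrap the path passed to cffi: `_lib = ffi.dlopen(check_library(_lib_path))`.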

For repeatable tests, add [pytest](https://pydevtools.com/handbook/reference/pytest.md) as a development dependency:

```console
$ uv add --dev pytest
Resolved 9 packages in 93ms
Installed 6 packages in 7ms
 + iniconfig==2.3.0
 + packaging==26.2
 + pluggy==1.6.0
 + pygments==2.20.0
 + pytest==9.0.3
```

Create a test file at the project root:

```python {filename="test_string_utils.py"}
import string_utils


def test_word_count():
    assert string_utils.word_count("hello world") == 2
    assert string_utils.word_count("one") == 1
    assert string_utils.word_count("  spaced   out  ") == 2
    assert string_utils.word_count("") == 0


def test_reverse():
    assert string_utils.reverse("hello") == "olleh"
    assert string_utils.reverse("Python") == "nohtyP"
    assert string_utils.reverse("") == ""


def test_is_palindrome():
    assert string_utils.is_palindrome("racecar") is True
    assert string_utils.is_palindrome("A man a plan a canal Panama") is True
    assert string_utils.is_palindrome("hello") is False
    assert string_utils.is_palindrome("Was it a car or a cat I saw") is True
```

Run the tests:

```bash
uv run pytest test_string_utils.py -v
```

```console
$ uv run pytest test_string_utils.py -v
============================= test session starts ==============================
collected 3 items

test_string_utils.py::test_word_count PASSED                             [ 33%]
test_string_utils.py::test_reverse PASSED                                [ 66%]
test_string_utils.py::test_is_palindrome PASSED                          [100%]

============================== 3 passed in 0.05s ===============================
```

## Project Structure

Your finished project looks like this:

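{{< filetree/container >}}
  {{< filetree/folder name="string_utils" >}}
    {{< filetree/folder name="src" >}}
      {{< filetree/folder name="string_utils" >}}
        {{< filetree/file name="__init__.py" >}}
        {{< filetree/file name="string_ops.c" >}}
        {{< filetree/file name="string_ops.so" >}}
      {{< /filetree/folder >}}
    {{< /filetree/folder >}}
    {{< filetree/file name="build_c.py" >}}
    {{< filetree/file name="pyproject.toml" >}}
    {{< filetree/file name="test_string_utils.py" >}}
    {{< filetree/file name="uv.lock" >}}
  {{< /filetree/folder >}}
{{< /filetree/container >}}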

The `.so` file (or `.dylib` on macOS, `.dll` on Windows) is the compiled shared library. The `.c` file ships with your project so anyone can recompile it for their platform.
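Because the binary and its source live side by side, it is easy for the two to drift apart after an edit to the C file. A small staleness check, sketched here with a hypothetical `needs_rebuild` helper, compares modification times to decide whether to recompile.

```python
import os


def needs_rebuild(source: str, library: str) -> bool:
    """Return True if the library is missing or older than the C source."""
    if not os.path.exists(library):
        return True
    return os.path.getmtime(library) < os.path.getmtime(source)
```

A build script could call `needs_rebuild(c_file, lib_path)` and skip the compiler when it returns `False`.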

## How cffi Connects C and Python

The connection between C and Python happens through cffi's ABI mode. Here's the sequence:

1. `ffi.cdef()` parses C declarations so cffi knows each function's argument types and return type
2. `ffi.dlopen()` loads the compiled shared library (`.so`, `.dylib`, or `.dll`) into the Python process
3. When Python calls `_lib.word_count(b"hello world")`, cffi converts the Python bytes to a C `const char *`, calls the C function, and converts the C `int` result back to a Python `int`

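Strings need special handling because C strings are null-terminated byte arrays while Python strings are Unicode objects. The wrapper encodes Python strings to UTF-8 bytes before passing them to C, and decodes the results back to Python strings. The C functions see bytes, not characters, so `reverse` is only reliable for ASCII input: reversing the bytes of a multi-byte UTF-8 character produces a sequence that fails to decode.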
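The gap between bytes and characters can be seen without any C code. The snippet below is a plain-Python illustration that mimics what the C side receives. `is_valid_utf8` is a hypothetical helper.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Report whether data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


text = "café"
encoded = text.encode("utf-8")        # b'caf\xc3\xa9': 'é' occupies two bytes

# Reversing bytes, as the C reverse() does, swaps the two bytes of 'é'.
reversed_bytes = encoded[::-1]        # b'\xa9\xc3fac'
print(is_valid_utf8(reversed_bytes))  # False

# Pure-ASCII input survives a byte-wise reverse.
print(is_valid_utf8("Python".encode("utf-8")[::-1]))  # True

# A C string ends at the first NUL byte, so C code would see only b'ab' here.
seen_by_c = "ab\0cd".encode("utf-8").split(b"\0", 1)[0]
print(seen_by_c)                      # b'ab'
```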

## Next Steps

This tutorial used cffi's ABI mode (`ffi.dlopen`), which loads a pre-compiled shared library at runtime. cffi also supports an API mode (`ffi.set_source`) that compiles the C code into an extension module with a C compiler, typically when the package is built. API mode produces faster function calls because cffi generates a compiled C wrapper that calls each function directly, rather than going through libffi on every call as ABI mode does.
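As a rough idea of what API mode could look like for this project, here is a sketch of a builder function. The function name `make_ffibuilder`, the module name `_string_ops`, and the relative source path are all assumptions. The sketch has not been tuned for packaging, and it assumes it is run from the project root with cffi and a C compiler available.

```python
"""Hypothetical API-mode build helper for string_ops.c (a sketch, not the tutorial's method)."""


def make_ffibuilder(c_source: str = "src/string_utils/string_ops.c"):
    # Imported inside the function so this sketch can be loaded on a machine
    # without cffi; a real build script would normally import at the top.
    from cffi import FFI

    ffibuilder = FFI()

    # The same declarations the ABI-mode wrapper passes to ffi.cdef().
    ffibuilder.cdef("""
        int word_count(const char *text);
        char *reverse(const char *text);
        int is_palindrome(const char *text);
        void free(void *ptr);
    """)

    # The second argument is C code compiled into the extension module.
    # There is no string_ops.h, so the prototypes are repeated here.
    ffibuilder.set_source(
        "_string_ops",  # assumed name of the generated extension module
        """
        #include <stdlib.h>
        int word_count(const char *text);
        char *reverse(const char *text);
        int is_palindrome(const char *text);
        """,
        sources=[c_source],
    )
    return ffibuilder
```

Calling `make_ffibuilder().compile(verbose=True)` would invoke the C compiler and produce an importable `_string_ops` module. The wrapper would then use `from _string_ops import ffi, lib` in place of `ffi.dlopen`.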

For distributing your package, consider:

- Running `build_c.py` as a custom build step in your [build backend](https://pydevtools.com/handbook/explanation/what-is-a-build-backend.md) configuration, so the library is compiled whenever the package is built
- [Publishing to PyPI](https://pydevtools.com/handbook/tutorial/publishing-your-first-python-package-to-pypi.md) with platform-specific [wheels](https://pydevtools.com/handbook/reference/wheel.md) that include the compiled library

## Learn More

- [Complete example project on GitHub](https://github.com/python-developer-tooling-handbook/python-c-extension-example)
- [cffi documentation](https://cffi.readthedocs.io/en/stable/)
- [cffi ABI vs API mode](https://cffi.readthedocs.io/en/stable/overview.html#abi-versus-api)
- [uv documentation](https://docs.astral.sh/uv/)
- [Python C extensions overview (CPython docs)](https://docs.python.org/3/extending/extending.html)
- [setuptools reference](https://pydevtools.com/handbook/reference/setuptools.md)
