Build a Python library with a C extension
Python handles strings well. C handles them faster. In this tutorial, you’ll write three string functions in C, compile them into a shared library, and call them from Python using cffi. The result is a normal Python package that anyone can import.
Prerequisites
You need two things installed:
- uv (installation guide)
- A C compiler
Most macOS and Linux systems already have a C compiler. Check by running:
cc --version
If that fails, install one:
- macOS: Run xcode-select --install
- Ubuntu/Debian: Run sudo apt install build-essential
- Fedora: Run sudo dnf install gcc
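The same check can be done from Python, which may be handy if you want a build script to fail early with a clear message. This is a minimal sketch: the helper name find_c_compiler and the candidate list are illustrative assumptions, not part of the tutorial's project.

```python
import shutil


def find_c_compiler(candidates=("cc", "gcc", "clang")):
    """Return the path of the first candidate found on PATH, or None.

    The candidate names are an assumption; adjust them for your system.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    compiler = find_c_compiler()
    print(compiler or "No C compiler found on PATH")
```

shutil.which only tells you that an executable with that name exists on PATH; it does not prove the compiler works, so cc --version remains the more direct test.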
Creating the Project
Create a new library project with uv:
uv init string_utils --lib
cd string_utils
This generates a src/string_utils/ directory with an __init__.py file and a pyproject.toml that looks like this:
[project]
name = "string-utils"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"
dependencies = []
Add cffi as a dependency:
uv add cffi
Writing the C Code
Create a file called string_ops.c inside src/string_utils/. This file contains three functions: one counts words, one reverses a string, and one checks for palindromes.
#include <string.h>
#include <ctype.h>
#include <stdlib.h>

/* Count words by counting transitions from whitespace to non-whitespace. */
int word_count(const char *text) {
    int count = 0;
    int in_word = 0;
    while (*text) {
        if (isspace((unsigned char)*text)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
        text++;
    }
    return count;
}

/* Return a newly allocated, byte-reversed copy of text. The caller frees it. */
char *reverse(const char *text) {
    size_t len = strlen(text);
    char *result = (char *)malloc(len + 1);
    if (!result) return NULL;
    for (size_t i = 0; i < len; i++) {
        result[i] = text[len - 1 - i];
    }
    result[len] = '\0';
    return result;
}

/* Return 1 if text is a palindrome, ignoring case and non-letters. */
int is_palindrome(const char *text) {
    size_t len = strlen(text);
    /* Build a lowercase, letters-only copy */
    char *cleaned = (char *)malloc(len + 1);
    if (!cleaned) return 0;
    size_t j = 0;
    for (size_t i = 0; i < len; i++) {
        if (isalpha((unsigned char)text[i])) {
            cleaned[j++] = (char)tolower((unsigned char)text[i]);
        }
    }
    cleaned[j] = '\0';
    size_t left = 0;
    size_t right = j; /* one past the last cleaned character */
    while (left + 1 < right) {
        if (cleaned[left] != cleaned[right - 1]) {
            free(cleaned);
            return 0;
        }
        left++;
        right--;
    }
    free(cleaned);
    return 1;
}
Each function takes a C string (const char *) as input. word_count walks the string character by character, counting transitions from whitespace to non-whitespace. reverse allocates a new string and fills it back-to-front. is_palindrome strips non-letter characters, lowercases the remainder, and checks whether it reads the same forwards and backwards.
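To make the three algorithms easier to follow, here is a rough pure-Python sketch of the same logic. The *_ref names are hypothetical and not part of the package; the sketch assumes plain ASCII input, where it should agree with the C versions, and it can serve as a reference when testing them.

```python
# Whitespace characters recognised by C's isspace() in the default "C" locale.
C_WHITESPACE = " \t\n\v\f\r"


def word_count_ref(text: str) -> int:
    """Count transitions from whitespace to non-whitespace."""
    count = 0
    in_word = False
    for ch in text:
        if ch in C_WHITESPACE:
            in_word = False
        elif not in_word:
            in_word = True
            count += 1
    return count


def reverse_ref(text: str) -> str:
    """Build a new string filled back-to-front."""
    return "".join(text[len(text) - 1 - i] for i in range(len(text)))


def is_palindrome_ref(text: str) -> bool:
    """Keep ASCII letters only, lowercase them, then compare from both ends."""
    cleaned = [ch.lower() for ch in text if ch.isascii() and ch.isalpha()]
    left, right = 0, len(cleaned) - 1
    while left < right:
        if cleaned[left] != cleaned[right]:
            return False
        left += 1
        right -= 1
    return True
```

The Python versions are slower but much easier to step through in a debugger, which is the main reason to keep something like them around.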
Note
The reverse function returns a pointer to newly allocated memory. The caller (our Python wrapper) is responsible for freeing it, which it does by calling the C free function through cffi.
Compiling the C Code
Before Python can call these functions, the C source needs to be compiled into a shared library. Create a build script called build_c.py at the project root:
"""Compile string_ops.c into a shared library."""
import os
import subprocess
import sys


def main():
    src_dir = os.path.join(os.path.dirname(__file__), "src", "string_utils")
    c_file = os.path.join(src_dir, "string_ops.c")
    # Pick the library name and compiler command for the current platform
    if sys.platform == "win32":
        lib_name = "string_ops.dll"
        cmd = ["cl", "/LD", "/O2", c_file,
               f"/Fe{os.path.join(src_dir, lib_name)}"]
    elif sys.platform == "darwin":
        lib_name = "string_ops.dylib"
        cmd = ["cc", "-shared", "-fPIC", "-O2", "-std=c99",
               "-o", os.path.join(src_dir, lib_name), c_file]
    else:
        lib_name = "string_ops.so"
        cmd = ["cc", "-shared", "-fPIC", "-O2", "-std=c99",
               "-o", os.path.join(src_dir, lib_name), c_file]
    print(f"Compiling {c_file} -> {lib_name}")
    subprocess.check_call(cmd)  # raises CalledProcessError if compilation fails
    print("Done.")


if __name__ == "__main__":
    main()
The -shared flag tells the compiler to produce a shared library instead of an executable. The -fPIC flag generates position-independent code, which lets the library be loaded at any address; Linux requires it for shared libraries, and on macOS it is already the default, so the flag is harmless there. The Windows branch assumes the MSVC compiler (cl) is on your PATH. MSVC exports no functions from a DLL unless they are marked, for example with __declspec(dllexport), so the C source would need that change on Windows.
Run the build script:
uv run python build_c.py
You should see output like this (the exact path and library extension depend on your system):
$ uv run python build_c.py
Compiling src/string_utils/string_ops.c -> string_ops.so
Done.
The compiled library now sits next to the C source file in src/string_utils/.
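The build script picks a file name per platform, and any code that later loads the library has to pick the same one. One way to keep the two in sync is a small shared helper. This is only a sketch: shared_lib_name and the _EXTENSIONS table are hypothetical names, not something the tutorial's files define.

```python
import sys

# Platform-specific shared-library extensions; anything else falls back to ".so".
_EXTENSIONS = {"win32": ".dll", "darwin": ".dylib"}


def shared_lib_name(stem: str = "string_ops", platform: str = sys.platform) -> str:
    """Return the shared-library file name for the given platform."""
    return stem + _EXTENSIONS.get(platform, ".so")


if __name__ == "__main__":
    print(shared_lib_name())
```

Passing the platform as a parameter, rather than reading sys.platform inside the function, makes the mapping easy to test for platforms you are not currently running on.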
Writing the Python Wrapper
Replace the contents of src/string_utils/__init__.py with a wrapper that uses cffi to load the shared library and expose the C functions as Python functions:
"""String utilities implemented in C, loaded via cffi."""
import os
import sys

import cffi

ffi = cffi.FFI()

# Declare the C function signatures
ffi.cdef("""
    int word_count(const char *text);
    char *reverse(const char *text);
    int is_palindrome(const char *text);
    void free(void *ptr);
""")

# Load the compiled shared library
_dir = os.path.dirname(__file__)
if sys.platform == "win32":
    _lib_path = os.path.join(_dir, "string_ops.dll")
elif sys.platform == "darwin":
    _lib_path = os.path.join(_dir, "string_ops.dylib")
else:
    _lib_path = os.path.join(_dir, "string_ops.so")
_lib = ffi.dlopen(_lib_path)


def word_count(text: str) -> int:
    """Count the number of words in a string."""
    return _lib.word_count(text.encode("utf-8"))


def reverse(text: str) -> str:
    """Reverse a string."""
    result_ptr = _lib.reverse(text.encode("utf-8"))
    if result_ptr == ffi.NULL:  # malloc failed inside the C function
        raise MemoryError("reverse: allocation failed")
    try:
        return ffi.string(result_ptr).decode("utf-8")
    finally:
        _lib.free(result_ptr)  # free the C-allocated memory, even if decoding fails


def is_palindrome(text: str) -> bool:
    """Check whether a string is a palindrome (ignoring case and non-letters)."""
    return bool(_lib.is_palindrome(text.encode("utf-8")))
This wrapper does three things:
- ffi.cdef() tells cffi what C functions exist and what their signatures look like
- ffi.dlopen() loads the compiled shared library into memory
- Each Python function encodes the input string to bytes, calls the C function, and converts the result back to a Python type
The reverse wrapper also frees the memory that the C function allocated, preventing a memory leak.
Testing the Library
Try the library interactively:
uv run python -c "
import string_utils
print(string_utils.word_count('hello world'))
print(string_utils.reverse('Python'))
print(string_utils.is_palindrome('racecar'))
"
Expected output:
2
nohtyP
True
For repeatable tests, add pytest as a development dependency:
uv add --dev pytest
Create a test file called test_string_utils.py at the project root:
import string_utils


def test_word_count():
    assert string_utils.word_count("hello world") == 2
    assert string_utils.word_count("one") == 1
    assert string_utils.word_count(" spaced out ") == 2
    assert string_utils.word_count("") == 0


def test_reverse():
    assert string_utils.reverse("hello") == "olleh"
    assert string_utils.reverse("Python") == "nohtyP"
    assert string_utils.reverse("") == ""


def test_is_palindrome():
    assert string_utils.is_palindrome("racecar") is True
    assert string_utils.is_palindrome("A man a plan a canal Panama") is True
    assert string_utils.is_palindrome("hello") is False
    assert string_utils.is_palindrome("Was it a car or a cat I saw") is True
Run the tests:
uv run pytest test_string_utils.py -v
$ uv run pytest test_string_utils.py -v
============================= test session starts ==============================
collected 3 items
test_string_utils.py::test_word_count PASSED [ 33%]
test_string_utils.py::test_reverse PASSED [ 66%]
test_string_utils.py::test_is_palindrome PASSED [100%]
============================== 3 passed in 0.05s ===============================
Project Structure
Your finished project looks like this:
- string_utils/
  - pyproject.toml
  - build_c.py
  - test_string_utils.py
  - uv.lock
  - README.md
  - src/
    - string_utils/
      - __init__.py
      - string_ops.c
      - string_ops.so
The .so file (or .dylib on macOS, .dll on Windows) is the compiled shared library. The .c file ships with your project so anyone can recompile it for their platform.
How cffi Connects C and Python
The connection between C and Python happens through cffi’s ABI mode. Here’s the sequence:
- ffi.cdef() parses C declarations so cffi knows each function's argument types and return type
- ffi.dlopen() loads the compiled shared library (.so, .dylib, or .dll) into the Python process
- When Python calls _lib.word_count(b"hello world"), cffi converts the Python bytes to a C const char *, calls the C function, and converts the C int result back to a Python int
Strings need special handling because C strings are null-terminated byte arrays while Python strings are Unicode objects. The wrapper encodes Python strings to UTF-8 bytes before passing them to C, and decodes the results back to Python strings.
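One consequence of working on bytes is worth seeing in isolation. The C reverse function reverses bytes, not characters, so it is only safe for ASCII input: reversing the bytes of a multi-byte UTF-8 character yields invalid UTF-8, and the decode step fails. The sketch below mimics that behavior in pure Python; reverse_bytes and reverse_code_points are illustrative names, not part of the package.

```python
def reverse_bytes(text: str) -> str:
    """Mimic the C function: reverse the UTF-8 bytes, then decode them."""
    return text.encode("utf-8")[::-1].decode("utf-8")


def reverse_code_points(text: str) -> str:
    """Reverse Unicode code points instead of bytes."""
    return text[::-1]


if __name__ == "__main__":
    print(reverse_bytes("Python"))  # ASCII only: one byte per character
    try:
        reverse_bytes("h\u00e9llo")  # "é" takes two bytes in UTF-8
    except UnicodeDecodeError as exc:
        print("byte-wise reversal broke the encoding:", exc.reason)
    print(reverse_code_points("h\u00e9llo"))
```

If the library needs to handle non-ASCII text, one option is to reverse on the Python side, or to make the C code aware of UTF-8 sequence boundaries. Either choice is a design decision beyond what this tutorial covers.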
Next Steps
This tutorial used cffi's ABI mode (ffi.dlopen), which loads a pre-compiled shared library at runtime. cffi also supports an API mode (ffi.set_source) that compiles the C code when the package is built. The API mode produces faster function calls because cffi generates a compiled extension module that calls the C functions directly, instead of routing every call through libffi as ABI mode does.
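As a rough idea of what API mode could look like for this project, here is a minimal sketch. The module name string_utils._string_ops, the helper make_ffibuilder, and the source path are assumptions made for illustration; only cffi.FFI, cdef, and set_source are cffi's own calls.

```python
# Prototypes for the three functions defined in string_ops.c.
PROTOTYPES = """
int word_count(const char *text);
char *reverse(const char *text);
int is_palindrome(const char *text);
"""

# What cffi should expose to Python: the prototypes plus free().
CDEF = PROTOTYPES + "void free(void *ptr);\n"


def make_ffibuilder(c_source="src/string_utils/string_ops.c",
                    module_name="string_utils._string_ops"):
    """Return a cffi builder configured for API mode (paths are assumptions)."""
    import cffi  # imported lazily so this file can be read without cffi installed

    ffibuilder = cffi.FFI()
    ffibuilder.cdef(CDEF)
    ffibuilder.set_source(
        module_name,
        "#include <stdlib.h>\n" + PROTOTYPES,  # C text compiled into the wrapper
        sources=[c_source],                    # extra C files to compile and link
    )
    return ffibuilder
```

Calling compile() on the returned builder would invoke the C compiler and produce an extension module, after which the wrapper would import ffi and lib from that module instead of calling ffi.dlopen. How the relative paths resolve depends on where the build runs, and on Python 3.12 or later the compile step may need setuptools installed, so treat this as a starting point and check the cffi documentation.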
For distributing your package, consider:
- Adding build_c.py as a pre-install step in your build backend configuration