What is a Python module?
A module is the basic unit of code Python can import. Any .py file is a module. So is any compiled extension, which ships as a .so file on Linux and macOS or a .pyd file on Windows. Anything that implements Python’s import protocol counts too, but the file cases cover what most developers ever see.
The PyPA glossary gives this concept three separate entries (Module, Pure Module, Extension Module), which is a hint that the vocabulary confuses people. This page walks through each flavor, and then untangles the bit that trips up almost every beginner: how modules relate to import packages and to the things you pip install.
The smallest possible module is one file
Create a file named greet.py with a single function:
def hello(name):
    print(f"hi {name}")
From a Python session in the same directory, import it and call the function:
$ python -c "import greet; greet.hello('world')"
hi world
That file is a module. There is no registration step, no manifest, no metadata. The filename (minus the .py) becomes the name you import.
Every module has a __name__
When Python loads a module, it sets a __name__ attribute on it. For imported modules, __name__ is the dotted import path. For the greet.py file loaded via import greet, that’s the string "greet". For json:
$ python -c "import json; print(json.__name__)"
json
The one exception is the module Python runs directly as a script. In that case __name__ is set to the string "__main__" instead. The if __name__ == "__main__": idiom at the bottom of many Python files uses this to run code only when the file is invoked as a script, not when it’s imported from elsewhere:
def hello(name):
    print(f"hi {name}")

if __name__ == "__main__":
    hello("world")
Run python greet.py and it prints hi world. Run import greet from another file and the function definition gets loaded, but the hello("world") call at the bottom is skipped. That split lets the same file serve as both a library and a command-line entry point.
Pure modules vs extension modules
The PyPA glossary splits modules into two kinds based on how they’re implemented.
A pure module is written in Python and distributed as a .py source file. It’s portable across platforms because CPython (or PyPy, or any other interpreter) reads the source at import time. The standard library’s json package is pure Python:
$ python -c "import json; print(json.__file__)"
/opt/homebrew/Cellar/python@3.14/3.14.3_1/Frameworks/Python.framework/Versions/3.14/lib/python3.14/json/__init__.py
An extension module is compiled native code, usually written in C, C++, or Rust. It lives inside a wheel as a .so file on Linux and macOS or a .pyd file on Windows, and Python’s dynamic loader pulls it in at import time. Extension modules exist to reach C APIs (OpenSSL, SQLite, CUDA) and to run tight numeric loops without the interpreter overhead.
NumPy is the example most developers have touched without realizing. Its core array engine is an extension module:
$ uv run --with numpy python -c "from numpy._core import _multiarray_umath as m; print(m.__file__)"
/opt/homebrew/lib/python3.13/site-packages/numpy/_core/_multiarray_umath.cpython-313-darwin.so
The .cpython-313-darwin.so suffix encodes the Python version (3.13) and platform (macOS) the binary was built for, which is why wheels are platform-specific and source distributions exist as a fallback. A pure module built against Python 3.13 will keep working on Python 3.14 without changes. An extension module compiled for 3.13 usually will not, because CPython’s C API evolves between versions and the binary ABI goes with it.
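One quick way to see the two flavors from code is to ask the import machinery which filename suffixes it treats as pure source versus compiled extensions. The exact extension suffixes vary by platform and Python version, so the output below is environment-specific:

```python
import importlib.machinery

# Suffixes the import system accepts for pure source modules.
print(importlib.machinery.SOURCE_SUFFIXES)

# Suffixes it accepts for compiled extension modules; these encode
# the interpreter version and platform, e.g. '.cpython-313-darwin.so'.
print(importlib.machinery.EXTENSION_SUFFIXES)
```

The version-and-platform tag baked into the extension suffix is exactly why a compiled module built for one Python rarely loads on another.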
A third flavor is worth naming to avoid surprise: built-in modules are compiled directly into the Python interpreter rather than loaded from a file. sys, builtins, _thread, and a handful of others fall into this group. They have no __file__ attribute because there is no separate file to point at:
$ python -c "import sys; print(sys.__name__); print(hasattr(sys, '__file__'))"
sys
False
The list at sys.builtin_module_names enumerates every built-in module baked into the running interpreter.
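The three flavors can be told apart from code by inspecting __file__. The helper below is a rough sketch, not a standard API (module_kind is a made-up name):

```python
import json
import sys

def module_kind(mod):
    """Rough classification based on the module's __file__ attribute."""
    f = getattr(mod, "__file__", None)
    if f is None:
        return "built-in (or frozen)"   # compiled into the interpreter
    if f.endswith(".py"):
        return "pure Python"            # source file read at import time
    return "extension (compiled)"       # .so / .pyd loaded by the dynamic loader

print(module_kind(sys))   # built-in (or frozen)
print(module_kind(json))  # pure Python
```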
Modules vs import packages
Here is where the vocabulary gets slippery. A single file is a module. A directory containing an __init__.py file is an import package. An import package is itself a module (Python runs __init__.py when you import the directory, and the result gets a __name__), but it also contains other modules inside it.
A small project laid out as an import package looks like this:
- mypkg/
  - __init__.py
  - helpers.py
  - core/
    - __init__.py
    - engine.py
Every node in that tree is a module in Python’s eyes:
- mypkg is a module (an import package) whose code is __init__.py.
- mypkg.helpers is a submodule sourced from helpers.py.
- mypkg.core is a subpackage, itself a module, whose code is its own __init__.py.
- mypkg.core.engine is a submodule of mypkg.core.
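This is easy to verify directly. The sketch below recreates the mypkg layout with empty files in a temporary directory (mypkg is the throwaway name from the tree, not a real distribution) and imports each node:

```python
import importlib
import os
import sys
import tempfile

# Recreate the mypkg/ tree from above with empty files.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "mypkg", "core"))
for rel in ("mypkg/__init__.py", "mypkg/helpers.py",
            "mypkg/core/__init__.py", "mypkg/core/engine.py"):
    open(os.path.join(root, *rel.split("/")), "w").close()

sys.path.insert(0, root)
for name in ("mypkg", "mypkg.helpers", "mypkg.core", "mypkg.core.engine"):
    # Every node imports as a module and carries its dotted __name__.
    print(importlib.import_module(name).__name__)
```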
“Import package” is the technical term for the directory form. Plain “package” is ambiguous because it also refers to the zip file you download from PyPI, which is a different thing entirely. See What is a Python package? for that side of the story.
Namespace packages skip the __init__.py
A directory without an __init__.py can still be an importable package under the namespace package rules defined in PEP 420. Python treats it as an empty package whose contents can be spread across multiple directories on sys.path, which lets independent distributions contribute submodules to a shared top-level name. Most application code should stick with regular packages (an explicit __init__.py). Namespace packages shine when several distributions need to cooperate under one import root, the way zope.* or google.cloud.* do.
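A minimal sketch of PEP 420 in action: two unrelated directories, neither containing an __init__.py, both contribute submodules to one top-level name (shared, alpha, and beta are made-up names for this demo):

```python
import os
import sys
import tempfile

# Two independent sys.path roots, each with a 'shared/' directory
# that deliberately has no __init__.py.
root_a, root_b = tempfile.mkdtemp(), tempfile.mkdtemp()
for root, filename in ((root_a, "alpha.py"), (root_b, "beta.py")):
    os.makedirs(os.path.join(root, "shared"))
    open(os.path.join(root, "shared", filename), "w").close()

sys.path[:0] = [root_a, root_b]
import shared.alpha
import shared.beta

print(shared.alpha.__name__)        # shared.alpha
print(list(shared.__path__))        # the namespace spans both roots
```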
How Python finds a module at import time
When you write import greet, Python asks its import system to locate a module with that name. The mechanism is the importlib machinery, and the search happens in a defined order:
1. Python first checks sys.modules, the cache of already-imported modules. If greet is there, it's returned immediately and no disk I/O happens.
2. Otherwise, Python walks the entries in sys.path, a list of directories and zip files, and asks each configured finder whether it can locate greet.
3. The first finder that claims the name loads the module into memory.
4. Python runs the module's top-level code, stores the result in sys.modules, and binds the result in the importing namespace.
sys.path starts with the script’s directory (or the current working directory for an interactive session), then the standard library, then the active environment’s site-packages. That ordering explains a classic beginner trap: a local file named json.py in your project will shadow the standard library’s json package, because the script directory is checked before the standard library.
Inspecting sys.path from a running interpreter is the fastest way to debug “why can’t Python find my module?” problems:
$ python -c "import sys; print('\n'.join(sys.path))"
Python’s import system reference documents the full protocol, including finders, loaders, and the hooks for customizing how names get resolved.
The sys.modules cache has a consequence that bites every Python beginner at least once: importing a module a second time does not re-read the file. Edit greet.py in one window while a REPL session in another window already has greet imported, and the REPL will keep using the old version. That is why IPython and Jupyter ship an autoreload extension, and why development servers like Flask's watch the filesystem and restart when files change: each is a workaround for the fact that a plain import never picks up the edit.
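The standard-library way to force a re-read is importlib.reload. A sketch (greet2 is a throwaway module name written to a temp directory):

```python
import importlib
import os
import sys
import tempfile

d = tempfile.mkdtemp()
path = os.path.join(d, "greet2.py")
with open(path, "w") as f:
    f.write("VERSION = 1\n")

sys.path.insert(0, d)
import greet2
print(greet2.VERSION)   # 1

# Edit the file on disk after the import.
with open(path, "w") as f:
    f.write("VERSION = 2  # edited after import\n")

import greet2           # cache hit in sys.modules: still the old code
print(greet2.VERSION)   # 1

importlib.reload(greet2)
print(greet2.VERSION)   # 2: reload re-executes the file in place
```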
A distribution package is not a module
The word “package” does double duty in Python, and it’s the single biggest source of confusion in the ecosystem. When you run pip install requests, you are downloading a distribution package, which is a wheel or sdist archive that bundles one or more modules together with metadata (name, version, dependencies, entry points). When you write import requests, you are loading an import package, which is the directory of modules that the distribution unpacked into your environment’s site-packages.
Distribution and import names often match, but not always. The distribution beautifulsoup4 installs the import package bs4. The distribution Pillow installs PIL. A single distribution can ship several unrelated import packages, and a namespace package can be assembled from many distributions. The sibling page Distribution package vs import package covers the split in depth.
The short version: a wheel is not a module. A wheel is a zipped bundle of modules plus metadata. Modules are what Python loads; wheels are what package managers install.
Finding a module on disk
Every loaded module (except a handful of built-ins like sys and builtins) exposes a __file__ attribute that points at the file Python imported. That makes “where does this code actually live?” a one-line question:
$ python -c "import json; print(json.__file__)"
Standard library modules live under the Python installation's Lib/ directory (the exact path depends on how Python was installed). Third-party modules installed into an environment live under that environment's site-packages/:
$ python -c "import requests; print(requests.__file__)"
/path/to/.venv/lib/python3.13/site-packages/requests/__init__.py
For an extension module, __file__ points at the compiled binary instead of a .py source file, which is the quickest way to tell the two flavors apart at a glance.
Learn More
- What is a Python package? covers distribution packages (wheels, sdists, conda packages).
- Distribution package vs import package goes deeper on the name collision.
- What is a virtual environment? explains the site-packages directory a module search usually ends in.
- Python's import system reference is the authoritative spec for finders, loaders, and sys.path.
- PEP 420: Implicit Namespace Packages is the standard that lets a directory be a package without __init__.py.
- PyPA glossary entries for Module, Pure Module, and Extension Module are the source definitions this page explains.