How to Install bitsandbytes
bitsandbytes provides 4-bit and 8-bit quantization for large language models, cutting GPU memory usage enough to run models that would otherwise not fit. Unlike PyTorch (see Why Installing GPU Python Packages Is So Complicated), bitsandbytes publishes platform-specific wheels to PyPI that bundle precompiled CUDA libraries for multiple toolkit versions. A plain pip install works on Linux, Windows, and macOS without extra index URLs.
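The memory savings are easy to estimate: a parameter stored in fp16 takes 2 bytes, int8 takes 1, and 4-bit takes half a byte. A back-of-the-envelope sketch for a hypothetical 7B-parameter model (illustrative only; real usage adds activations, the KV cache, and small per-block quantization constants):

```python
# Rough weight-only memory estimate at different precisions.
# Illustrative numbers; real memory use is higher due to activations,
# the KV cache, and quantization metadata that bitsandbytes stores.
def weight_memory_gib(n_params: int, bits_per_param: float) -> float:
    """GiB needed to hold the weights alone."""
    return n_params * bits_per_param / 8 / 1024**3

n = 7_000_000_000
for name, bits in [("fp16", 16), ("int8", 8), ("nf4", 4)]:
    print(f"{name}: {weight_memory_gib(n, bits):.1f} GiB")
# fp16: 13.0 GiB
# int8: 6.5 GiB
# nf4: 3.3 GiB
```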
Requirements
- Python >= 3.10
- PyTorch >= 2.3 (see How to Install PyTorch with uv)
- NVIDIA GPU with compute capability 6.0+ for GPU quantization (Pascal or newer)
- NVIDIA driver that supports CUDA 11.8 or later
bitsandbytes also supports CPU-only, AMD ROCm (preview), Intel XPU, and Apple Silicon. GPU quantization requires an NVIDIA GPU.
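A quick way to check the compute-capability requirement before installing. The helper function is ours (not part of any library) and encodes the 6.0 (Pascal) cutoff from the list above; the torch query is guarded so the snippet runs even on machines without a GPU:

```python
# Check whether a GPU meets the compute capability 6.0+ requirement.
def supports_gpu_quantization(major: int, minor: int) -> bool:
    """Pascal (6.0) or newer is required for GPU quantization."""
    return (major, minor) >= (6, 0)

try:
    import torch
    if torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)  # e.g. (8, 6) on an RTX 3090
        ok = supports_gpu_quantization(*cap)
        print(f"Compute capability {cap[0]}.{cap[1]}: {'supported' if ok else 'too old'}")
    else:
        print("No CUDA device visible to PyTorch")
except ImportError:
    print("PyTorch not installed")
```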
Install with pip or uv
Install PyTorch first, then bitsandbytes:
uv pip install torch
uv pip install bitsandbytes

The PyPI wheel ships with precompiled binaries for CUDA 11.8 through 13.0. At runtime, bitsandbytes detects the CUDA version provided by the installed PyTorch and loads the matching binary. No --index-url flag or CUDA version suffix is needed.
Important
Install PyTorch before bitsandbytes. bitsandbytes uses PyTorch’s CUDA runtime to detect the correct backend at import time. If PyTorch is missing or CPU-only, bitsandbytes falls back to CPU mode and quantization functions will raise errors.
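A minimal preflight check along these lines catches the CPU-only fallback at startup instead of mid-run (the helper name is ours, not part of bitsandbytes):

```python
# Preflight: confirm a CUDA-capable PyTorch is present before importing
# bitsandbytes, so a misconfigured environment fails early and loudly.
import importlib.util

def cuda_torch_available() -> bool:
    """True only if torch is installed AND was built with CUDA support."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()

if cuda_torch_available():
    import bitsandbytes as bnb  # GPU backend will load
else:
    print("Install a CUDA build of PyTorch first; bitsandbytes would fall back to CPU")
```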
Add to a uv project
For a uv project, add both packages as dependencies:
uv add torch bitsandbytes

If the project needs a specific CUDA build of PyTorch, configure the PyTorch index in pyproject.toml as described in How to Install PyTorch with uv. bitsandbytes itself resolves from PyPI with no extra configuration.
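For example, a pyproject.toml that pins torch to a CUDA 12.8 index while letting bitsandbytes resolve from PyPI might look like this sketch (the index name and cu128 URL are illustrative; match them to your driver):

```toml
[project]
dependencies = ["torch", "bitsandbytes"]

# Illustrative uv index configuration; see How to Install PyTorch with uv.
[[tool.uv.index]]
name = "pytorch-cu128"
url = "https://download.pytorch.org/whl/cu128"
explicit = true

[tool.uv.sources]
torch = { index = "pytorch-cu128" }
```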
Install with conda
bitsandbytes is available on conda-forge, though the latest version (0.38.0) is older than the current PyPI release (0.49.x). If the project requires recent features like NF4 quantization support for newer GPU architectures, use pip or uv instead.
For projects where conda-forge’s version is acceptable:
pixi add bitsandbytes

See uv vs pixi vs conda for scientific Python for guidance on when conda-based tooling makes sense.
Verify the installation
Run this check to confirm bitsandbytes loaded the CUDA backend:
import bitsandbytes as bnb
import torch
# Should print CUDA when the GPU backend is active
print(bnb.cextension.BNB_BACKEND)
# Quick functional test: create a quantized linear layer
linear = bnb.nn.Linear8bitLt(256, 128, has_fp16_weights=False)
x = torch.randn(1, 256, dtype=torch.float16, device="cuda")
output = linear.to("cuda")(x)
print(f"Output shape: {output.shape}")

If the backend prints CUDA, the GPU path is active. If it prints CPU, PyTorch either lacks CUDA support or no GPU was detected.
Troubleshooting
RuntimeError when calling quantization functions
bitsandbytes imports without error even when the native CUDA library fails to load. The error surfaces later, when quantization code runs. Check that:
- PyTorch was installed with CUDA support (torch.cuda.is_available() returns True)
- The NVIDIA driver is recent enough for the installed CUDA toolkit
Wrong CUDA version detected
bitsandbytes reads the CUDA version from PyTorch, not from the system nvcc. If torch.version.cuda reports a different version than expected, reinstall PyTorch with the correct CUDA backend.
Override the detected version by setting the BNB_CUDA_VERSION environment variable:
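The override value is the CUDA version with the dot removed. A small sketch of the mapping (the helper name is ours, not a bitsandbytes API):

```python
# BNB_CUDA_VERSION is torch.version.cuda with the dot stripped:
# "12.8" -> "128", "11.8" -> "118".
def bnb_cuda_tag(torch_cuda_version: str) -> str:
    major, minor = torch_cuda_version.split(".")
    return f"{major}{minor}"

print(bnb_cuda_tag("12.8"))  # -> 128
```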
BNB_CUDA_VERSION=128 python -c "import bitsandbytes"

libcudart.so not found
This occurs when the CUDA runtime shared library is missing from the environment. PyTorch wheels bundle their own copy of libcudart, so this usually means PyTorch was installed as CPU-only. Reinstall PyTorch with CUDA support.
Related
Handbook articles:
- Why Installing GPU Python Packages Is So Complicated
- How to Install PyTorch with uv
- uv vs pixi vs conda for Scientific Python
External links: