How do I install Hugging Face Transformers with uv?

Run `uv add transformers` to add it to your project, or `uv pip install transformers` for a quick install. For GPU support, configure a PyTorch CUDA index in `pyproject.toml` and install `transformers[torch]`. For quantized models, add `bitsandbytes` as a dependency; `accelerate` already comes with `transformers[torch]`.

How do I install Transformers with GPU support using uv?

Add a `[[tool.uv.index]]` entry pointing at `https://download.pytorch.org/whl/cu128` with `explicit = true`, then route `torch` and `torchvision` to it in `[tool.uv.sources]`. Add `transformers[torch]` to your project dependencies and run `uv sync`. For one-off installs, use `uv pip install "transformers[torch]" --torch-backend=auto` (the quotes keep shells such as zsh from interpreting the brackets).

What are the Transformers extras I can install with uv?

The main extras are `transformers[torch]` (PyTorch backend), `transformers[vision]` (image models), `transformers[audio]` (audio models), and `transformers[sentencepiece]` (tokenizers for multilingual models). `accelerate`, which handles distributed training and quantized loading, comes with the `[torch]` extra; add it explicitly only if you install Transformers without that extra.

How to Install Hugging Face Transformers with uv

by Tim Hopper

The Transformers library from Hugging Face provides access to thousands of pretrained models for text, image, audio, and multimodal tasks. Installing it with uv is straightforward for CPU inference. GPU acceleration additionally requires PyTorch index configuration, the same CUDA routing pattern covered in How to Install PyTorch with uv.

This guide covers three install paths: CPU-only inference, GPU training and inference with CUDA, and quantized model loading with accelerate and bitsandbytes.

Install for CPU inference

For tasks that run on CPU (sentiment analysis, text generation with small models, embeddings), add Transformers to your project:

uv add transformers

This installs Transformers and its core dependencies (huggingface-hub, tokenizers, safetensors) but not PyTorch. Most inference and training features require a deep learning backend, so install PyTorch alongside it:

uv add "transformers[torch]"

The [torch] extra pulls in PyTorch from PyPI, which is enough for CPU inference on every platform. Verify the install:

uv run python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('uv is fast'))"

The first run downloads a default model, then prints a label and confidence score (the exact score can vary):

[{'label': 'POSITIVE', 'score': 0.9789}]
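To confirm which packages uv actually resolved, a short script can read the installed distribution metadata. This is a minimal sketch using only the standard library; the helper name `installed_versions` and the list of packages it checks are illustrative choices, not part of Transformers or uv.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Core dependencies named above, plus torch from the [torch] extra.
    packages = ["transformers", "huggingface-hub", "tokenizers", "safetensors", "torch"]
    for name, ver in installed_versions(packages).items():
        print(f"{name}: {ver or 'not installed'}")
```

Saved under a name of your choosing and run with uv run python, it should report torch as not installed after a plain uv add transformers, and show a version once the [torch] extra is added.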

Install with GPU support

GPU acceleration requires PyTorch built against the right CUDA version. PyPI’s default PyTorch wheels are CPU-only on Windows and macOS. On Linux, PyPI carries CUDA 12.8 wheels as of PyTorch 2.9.1, but your system may need a different CUDA version.

Configure CUDA in pyproject.toml

Add a PyTorch CUDA index and route GPU packages to it. This example uses CUDA 12.8:

pyproject.toml

[project]
name = "my-ml-project"
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    "transformers[torch]",
]

[[tool.uv.index]]
name = "pytorch-cu128"
url = "https://download.pytorch.org/whl/cu128"
explicit = true

[tool.uv.sources]
torch = [
  { index = "pytorch-cu128", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
]
torchvision = [
  { index = "pytorch-cu128", marker = "sys_platform == 'linux' or sys_platform == 'win32'" },
]

Then lock and sync:

uv lock
uv sync

The platform markers restrict CUDA builds to Linux and Windows. macOS falls back to PyPI’s CPU wheels because CUDA builds are not available for macOS. See How to Install PyTorch with uv for the full configuration reference, including multi-backend extras and ROCm support.

Verify GPU access

uv run python -c "import torch; ok = torch.cuda.is_available(); print(ok, torch.cuda.get_device_name(0) if ok else 'no CUDA device')"

This should print True followed by your GPU name. If it prints False, check that your NVIDIA driver is installed (nvidia-smi) and that the PyTorch CUDA version matches your driver’s supported CUDA version.
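For a slightly more detailed report, the same check can be written as a small script. This is a sketch rather than an official diagnostic: the function name `describe_cuda` is made up for illustration, and it relies only on torch.cuda.is_available, torch.cuda.device_count, torch.cuda.get_device_name, and torch.version.cuda.

```python
def describe_cuda():
    """Summarize what PyTorch can see, without raising when no GPU is present."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"

    # torch.version.cuda is None for CPU-only builds.
    lines = [f"torch {torch.__version__}, built for CUDA {torch.version.cuda}"]
    if not torch.cuda.is_available():
        lines.append("CUDA is not available (CPU-only build, or no usable driver)")
        return "\n".join(lines)

    for index in range(torch.cuda.device_count()):
        lines.append(f"device {index}: {torch.cuda.get_device_name(index)}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe_cuda())
```

If the first line reports a CUDA version of None, the environment resolved a CPU-only build, which usually points at the index routing rather than the driver.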

Install GPU support without a project

For one-off experimentation without a pyproject.toml, use uv pip with --torch-backend:

uv venv --python 3.12 --seed --managed-python
source .venv/bin/activate
uv pip install "transformers[torch]" --torch-backend=auto

The --torch-backend=auto flag detects your GPU hardware and selects the matching PyTorch index. Valid values include auto, cpu, cu118, cu126, cu128, cu130, versioned ROCm backends such as rocm6.3, and xpu.

Important

--torch-backend only works with uv pip commands. It does not work with uv lock, uv sync, or uv run. For project-level workflows, configure the PyTorch index in pyproject.toml as shown above.

Install extras for quantization and distributed training

Loading large models (7B+ parameters) on consumer GPUs usually requires quantization, because the half-precision weights alone can exceed the available VRAM. The accelerate library is already a transitive dependency of transformers[torch], so adding bitsandbytes is the only extra step:

uv add bitsandbytes

With these installed, load a quantized model:

quantized_inference.py

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(load_in_4bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=quantization_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

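# device_map="auto" decides placement, so follow the model's device instead of hard-coding "cuda".
inputs = tokenizer("Explain virtual environments in one sentence.", return_tensors="pt").to(model.device)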
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
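The case for 4-bit loading comes down to simple arithmetic: weight memory is roughly the parameter count times the bytes per parameter. The sketch below works that out for a model of about 8 billion parameters. The helper `weight_memory_gb` is hypothetical, 8e9 is a rounded parameter count, and the estimate covers weights only, ignoring activations, the KV cache, and quantization overhead.

```python
def weight_memory_gb(num_parameters, bits_per_parameter):
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_parameters * bits_per_parameter / 8 / 1e9


for label, bits in [("float16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"8B parameters at {label}: ~{weight_memory_gb(8e9, bits):.0f} GB")
```

By this estimate the weights need about 16 GB in float16 and about 4 GB at 4 bits, which is the difference between not fitting and fitting on many consumer cards.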

Note

bitsandbytes is developed primarily for Linux systems with an NVIDIA GPU. Support for Windows and macOS depends on the bitsandbytes release, so check its installation docs before relying on it there.

accelerate enables multi-GPU and distributed training with Transformers’ Trainer class, even without quantization.

Choose the right extras

Transformers ships several optional dependency groups that pull in libraries for specific use cases:

Extra	What it adds	When to use it
`transformers[torch]`	PyTorch	Most NLP, vision, and generative tasks
`transformers[vision]`	Pillow	Image classification, object detection, image generation
`transformers[audio]`	librosa, soundfile	Speech recognition, audio classification
`transformers[sentencepiece]`	sentencepiece	Multilingual models (mBART, XLM-RoBERTa)
`transformers[video]`	av, decord	Video understanding models

Extras can be combined: uv add "transformers[torch,vision]" installs both PyTorch and Pillow.
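To see which of these optional libraries are present in the current environment, you can probe for their import names without importing them. This is a minimal sketch: the mapping is copied from the table above (Pillow is imported as PIL), and `EXTRA_MODULES` and `missing_modules` are illustrative names, not part of Transformers.

```python
from importlib.util import find_spec

# Extra name -> top-level modules it is expected to provide (per the table above).
EXTRA_MODULES = {
    "torch": ["torch"],
    "vision": ["PIL"],
    "audio": ["librosa", "soundfile"],
    "sentencepiece": ["sentencepiece"],
    "video": ["av", "decord"],
}


def missing_modules(extra):
    """Return the modules for one extra that cannot be found in this environment."""
    return [name for name in EXTRA_MODULES[extra] if find_spec(name) is None]


if __name__ == "__main__":
    for extra in EXTRA_MODULES:
        missing = missing_modules(extra)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"transformers[{extra}]: {status}")
```

Because find_spec only locates a module and never executes it, the check is fast and safe to run even when heavy packages such as torch are installed.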

Learn more

How to Install PyTorch with uv covers CUDA index routing, multi-backend extras, and --torch-backend in detail
Why Installing GPU Python Packages Is So Complicated explains why CUDA packages need special index configuration
Transformers installation docs for the official guide
Transformers on PyPI for the full list of optional extras

Last updated on May 19, 2026
