Introduction
In 2026, the Hugging Face Hub is the go-to platform for ML practitioners, hosting over one million models, datasets, and Spaces apps. It streamlines sharing, versioning, and inference for Transformer models through a unified API, native Docker support, Git LFS, and a Python SDK. Why does it matter? Distributed ML workflows demand seamless collaboration: push a fine-tuned Llama 3 to a private repo, host a Gradio Space for interactive demos, or pull massive datasets like LAION-5B without downtime. This tutorial walks you through each step, from secure login to CI/CD automation, with working code you can drop into your production pipelines.
Prerequisites
- Python 3.10+ installed
- Hugging Face account (create one at huggingface.co)
- HF access token (generate in Settings > Access Tokens, 'write' role)
- Git and Git LFS set up (`git lfs install`)
- Libraries: `pip install huggingface_hub transformers datasets accelerate`
- GPU access recommended for inference tests
CLI Installation and Login
```shell
pip install -U "huggingface_hub[cli]"
huggingface-cli login --token hf_XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
huggingface-cli whoami
```
These commands install the official CLI and authenticate your local session with a read/write token. The `--token` flag skips the interactive prompt; replace the placeholder with your real token. `whoami` verifies authentication: a common pitfall is an expired token silently blocking all uploads.
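For scripts and CI jobs, the same authentication can be done from Python. A minimal sketch, assuming the token lives in an environment variable (the variable name `HF_TOKEN` here is a convention, not a requirement) rather than being hardcoded:

```python
import os

from huggingface_hub import login


def get_token(env_var="HF_TOKEN"):
    """Read the access token from an environment variable; returns None if unset."""
    return os.environ.get(env_var)


token = get_token()
if token:
    # Authenticates the current process, like `huggingface-cli login --token ...`
    login(token=token)
```

This keeps tokens out of source control while letting the same script run locally and in CI.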
First Model Upload
Before uploading, prepare a local model. We'll use a concrete example: a sentiment classifier fine-tuned on DistilBERT.
Prepare and Push a Model
```python
from huggingface_hub import HfApi
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load an example model (stand-in for your own fine-tuned checkpoint)
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Save locally
model.save_pretrained("./my-local-model")
tokenizer.save_pretrained("./my-local-model")

# Create the repo (idempotent), then upload with HfApi
api = HfApi()
api.create_repo(repo_id="your-username/my-first-model", repo_type="model", exist_ok=True)
api.upload_folder(
    folder_path="./my-local-model",
    repo_id="your-username/my-first-model",
    repo_type="model",
    commit_message="First push of fine-tuned DistilBERT",
)
print("Model uploaded successfully!")
```
This script loads a pre-trained model, saves it locally, then uploads it via `HfApi.upload_folder()`, which handles Git LFS automatically for large weight files. Pitfall: `upload_folder()` fails if the target repo does not exist yet, so create it first with `create_repo(..., exist_ok=True)`; and always pass `commit_message` so the repo history stays readable.
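After a push, it is worth confirming the files actually landed. A small sketch (the repo id is a placeholder; this call needs network access and read permission on the repo):

```python
from huggingface_hub import HfApi


def uploaded_files(repo_id):
    """List files now present in the model repo, to confirm the push landed."""
    api = HfApi()
    return sorted(api.list_repo_files(repo_id=repo_id, repo_type="model"))


# Example usage (requires authentication):
# print(uploaded_files("your-username/my-first-model"))
```

Expect at least `config.json`, the weight file, and tokenizer files in the listing.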
Advanced Dataset Management
HF datasets support Parquet, streaming, and metadata. Ideal for corpora like Common Crawl.
Create and Push a Dataset
```python
from datasets import Dataset
import pandas as pd

# Example dataset: sentiments
data = {
    "text": ["I love this movie!", "This is awful."],
    "label": [1, 0],
}
df = pd.DataFrame(data)
dataset = Dataset.from_pandas(df)

# Push directly to the Hub (creates the repo if needed)
dataset.push_to_hub("your-username/my-sentiments-dataset", private=False)
print("Dataset published!")
```
Here we build a Hugging Face `Dataset` from a pandas DataFrame and push it with `push_to_hub()`, which converts the data to Parquet, creates the repo if needed, and uploads it in one call. To publish train/test splits, wrap them in a `DatasetDict` before pushing; for finer control over individual files, use `HfApi.upload_folder()` with `repo_type="dataset"`. Always sanity-check with `load_dataset()` after upload.
Pull and Local Inference
```python
from datasets import load_dataset
from transformers import pipeline
from huggingface_hub import snapshot_download

# Download a full repo into a local directory
model_path = snapshot_download(repo_id="your-username/my-first-model", local_dir="./local_model")

# Inference pipeline
classifier = pipeline("sentiment-analysis", model=model_path, tokenizer=model_path)
result = classifier("This tutorial is excellent!")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.9998}]

# Streaming for large datasets
dataset = load_dataset("your-username/my-sentiments-dataset", streaming=True)
for example in dataset["train"]:
    print(example["text"])
```
`snapshot_download()` pulls an entire repo and caches it, so repeated calls skip files that haven't changed; `local_dir` additionally materializes the files at a fixed path. The `pipeline()` auto-loads tokenizer and model. For 100GB+ corpora, `streaming=True` iterates over examples without loading the dataset into memory, avoiding OOM.
Deploy a Gradio Space
HF Spaces host ML apps with zero config (Gradio/Streamlit/Docker).
app.py for Gradio Space
```python
import gradio as gr
from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

def analyze(text):
    result = classifier(text)
    return f"Label: {result[0]['label']}, Score: {result[0]['score']:.4f}"

iface = gr.Interface(
    fn=analyze,
    inputs=gr.Textbox(label="Your text"),
    outputs=gr.Textbox(label="Analysis"),
    title="HF Hub Sentiment Analyzer",
)

if __name__ == "__main__":
    iface.launch()
```
This `app.py` builds a Gradio UI for inference. Push it to a Space repo (type 'Space', SDK 'Gradio') and the Hub builds and serves it automatically; Spaces run on free CPU hardware by default, with GPUs such as the T4 available as paid upgrades. `share=True` is only needed for a temporary public link when running locally, not on a Space. In production, store private tokens as Space secrets, never in code.
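Deployment itself can be scripted. A minimal sketch that creates a Gradio Space (if missing) and pushes `app.py` to it; the repo id is a placeholder and the call requires an authenticated session with write access:

```python
from huggingface_hub import HfApi


def deploy_space(repo_id):
    """Create a Gradio Space if it doesn't exist, then push app.py to trigger a build."""
    api = HfApi()
    api.create_repo(repo_id=repo_id, repo_type="space", space_sdk="gradio", exist_ok=True)
    api.upload_file(
        path_or_fileobj="app.py",
        path_in_repo="app.py",
        repo_id=repo_id,
        repo_type="space",
        commit_message="Deploy sentiment analyzer",
    )


# Example usage (requires authentication):
# deploy_space("your-username/sentiment-analyzer")
```

Each `upload_file()` commit triggers a rebuild, so this doubles as a redeploy command.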
Dockerfile for Custom Space
```dockerfile
FROM python:3.11-slim
WORKDIR /code
COPY ./requirements.txt /code/requirements.txt
RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
COPY ./app.py /code/app.py
# Docker Spaces serve on port 7860
EXPOSE 7860
CMD ["python", "app.py"]
```
For advanced Docker Spaces, this Dockerfile builds a minimal image on slim Python; the app inside must listen on port 7860. Expose `HF_TOKEN` as a Space secret rather than baking it into the image. The image rebuilds automatically on every push; pitfall: forget to `COPY requirements.txt` before the `pip install` step and the build fails.
Versioning and PR Merges
```python
from huggingface_hub import HfApi

api = HfApi()

# Create repo and branch
api.create_repo(repo_id="your-username/my-model-v2", repo_type="model", exist_ok=True)
api.create_branch(repo_id="your-username/my-model-v2", branch="feature/v2")

# Merge a PR programmatically (after pushing to the branch and opening the PR)
# api.merge_pull_request(repo_id="your-username/my-model-v2", discussion_num=1)

# Tag a release
api.create_tag(repo_id="your-username/my-model-v2", tag="v1.0.0")
print("Versioning applied!")
```
`create_branch()` and `create_tag()` leverage the Hub's native Git backend. Pull requests on the Hub are a subtype of discussion, which is why merging takes a discussion number. Pushes can trigger external CI/CD (e.g. GitHub Actions) via webhooks. Pro tip: on large repos, pass `allow_patterns`/`ignore_patterns` to download calls so you fetch only the files you need.
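To find the discussion number to merge, you can enumerate open PRs programmatically. A sketch using `get_repo_discussions()` (requires network access and read permission on the repo):

```python
from huggingface_hub import HfApi


def open_pull_requests(repo_id):
    """Yield (number, title) for open pull requests on a Hub repo."""
    api = HfApi()
    for d in api.get_repo_discussions(repo_id=repo_id):
        # PRs and plain discussions share one listing; filter on is_pull_request
        if d.is_pull_request and d.status == "open":
            yield d.num, d.title


# Example usage (requires authentication for private repos):
# for num, title in open_pull_requests("your-username/my-model-v2"):
#     print(num, title)
```

Feed the yielded `num` into `merge_pull_request()` to close the loop.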
Best Practices
- Always version: use semver tags (v1.0.0) and branches for A/B model testing.
- Secure tokens: store them in HF Secrets or env vars; never hardcode them in scripts.
- Optimize LFS: track only weight files such as `*.bin` and `*.safetensors`; ignore temp checkpoints.
- Test offline: pre-fetch artifacts with `huggingface-cli download` so CI/CD runs without live Hub access.
- Monitor usage: free Spaces go to sleep after a period of inactivity; upgrade to paid hardware for always-on scaling.
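The "ignore temp checkpoints" practice above can be enforced at upload time with pattern filters. A sketch (folder layout and repo id are placeholders; the call needs write access):

```python
from huggingface_hub import HfApi


def push_weights_only(folder, repo_id):
    """Upload only weights and config/tokenizer JSON, skipping intermediate checkpoints."""
    api = HfApi()
    api.upload_folder(
        folder_path=folder,
        repo_id=repo_id,
        repo_type="model",
        allow_patterns=["*.safetensors", "*.bin", "*.json"],
        ignore_patterns=["checkpoint-*/*"],  # Trainer-style intermediate dirs
        commit_message="Weights and configs only",
    )


# Example usage (requires authentication):
# push_weights_only("./my-local-model", "your-username/my-first-model")
```

`allow_patterns` is applied first, then `ignore_patterns` removes matches, so checkpoint JSONs are still excluded.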
Common Errors to Avoid
- Invalid token: '401 Unauthorized' → regenerate a token with 'write' scope and log in again.
- LFS not tracked: large files rejected on git push → `git lfs track "*.safetensors"` and push again.
- OOM on pull: large models crash the loader → download to a plain `local_dir` or use streaming.
- Space down: build logs show missing deps → pin everything with `pip freeze > requirements.txt`.
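When debugging the 401 case, a quick preflight check saves a failed upload. A sketch that probes a token with `whoami()` (makes a network call when invoked):

```python
from huggingface_hub import HfApi


def token_is_valid(token):
    """Return True if the token authenticates against the Hub, False otherwise."""
    try:
        HfApi(token=token).whoami()  # raises on 401/invalid tokens
        return True
    except Exception:
        return False


# Example usage:
# assert token_is_valid("hf_XXXX..."), "Regenerate your token before uploading"
```

Run this at the top of upload scripts so an expired token fails fast with a clear message.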
Next Steps
- Official docs: Hugging Face Hub Docs
- Advanced: Integrate with Ray Serve for scaling Spaces.
- Training: Check our AI courses at Learni on Transformers and MLOps deployment.
- Example repo: Fork this Space.