Running Workloads on Hugging Face Jobs
Overview
Run any workload on fully managed Hugging Face infrastructure. No local setup required—jobs run on cloud CPUs, GPUs, or TPUs and can persist results to the Hugging Face Hub.
Common use cases:
- Data Processing - Transform, filter, or analyze large datasets
- Batch Inference - Run inference on thousands of samples
- Experiments & Benchmarks - Reproducible ML experiments
- Model Training - Fine-tune models (see the model-trainer skill for TRL-specific training)
- Synthetic Data Generation - Generate datasets using LLMs
- Development & Testing - Test code without local GPU setup
- Scheduled Jobs - Automate recurring tasks
For model training specifically: See the model-trainer skill for TRL-based training workflows.
When to Use This Skill
Use this skill when users want to:
- Run Python workloads on cloud infrastructure
- Execute jobs without local GPU/TPU setup
- Process data at scale
- Run batch inference or experiments
- Schedule recurring tasks
- Use GPUs/TPUs for any workload
- Persist results to the Hugging Face Hub
Key Directives
When assisting with jobs:
- ALWAYS use the hf_jobs() MCP tool - Submit jobs using hf_jobs("uv", {...}) or hf_jobs("run", {...}). The script parameter accepts Python code directly. Do NOT save to local files unless the user explicitly requests it. Pass the script content as a string to hf_jobs().
- Always handle authentication - Jobs that interact with the Hub require HF_TOKEN via secrets. See the Token Usage section below.
- Provide job details after submission - After submitting, provide the job ID, monitoring URL, and estimated time, and note that the user can request status checks later.
- Set appropriate timeouts - The default 30-minute timeout may be insufficient for long-running tasks.
Prerequisites Checklist
Before starting any job, verify:
✅ Account & Authentication
- Hugging Face Account with Pro, Team, or Enterprise plan (Jobs require paid plan)
- Authenticated login: Check with hf_whoami()
- HF_TOKEN for Hub Access ⚠️ CRITICAL - Required for any Hub operations (push models/datasets, download private repos, etc.)
- Token must have appropriate permissions (read for downloads, write for uploads)
✅ Token Usage (See Token Usage section for details)
When tokens are required:
- Pushing models/datasets to Hub
- Accessing private repositories
- Using Hub APIs in scripts
- Any authenticated Hub operations
How to provide tokens:
# hf_jobs MCP tool — $HF_TOKEN is auto-replaced with real token:
{"secrets": {"HF_TOKEN": "$HF_TOKEN"}}
# HfApi().run_uv_job() — MUST pass actual token:
from huggingface_hub import get_token
secrets={"HF_TOKEN": get_token()}
⚠️ CRITICAL: The $HF_TOKEN placeholder is ONLY auto-replaced by the hf_jobs MCP tool. When using HfApi().run_uv_job(), you MUST pass the real token via get_token(). Passing the literal string "$HF_TOKEN" results in a 9-character invalid token and 401 errors.
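The placeholder pitfall described above can be caught before a job is submitted. The following is a minimal sketch, not part of huggingface_hub: `is_unresolved_placeholder` and `check_secrets` are hypothetical helper names, and the check assumes that any secret value still starting with `$` was never replaced with a real token.

```python
def is_unresolved_placeholder(value: str) -> bool:
    """Return True if a secret value still looks like an unexpanded $NAME placeholder."""
    return value.startswith("$")


def check_secrets(secrets: dict[str, str]) -> list[str]:
    """Return the names of secrets whose values were never replaced with a real token."""
    return [name for name, value in secrets.items() if is_unresolved_placeholder(value)]


# The literal placeholder is a 9-character string, not a valid token.
placeholder = "$HF_TOKEN"
print(len(placeholder))
print(check_secrets({"HF_TOKEN": placeholder}))
```

A check like this is only useful on the HfApi().run_uv_job() path, where no automatic replacement happens; with the hf_jobs MCP tool the placeholder is the intended value.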
Token Usage Guide
Understanding Tokens
What are HF Tokens?
- Authentication credentials for Hugging Face Hub
- Required for authenticated operations (push, private repos, API access)
- Stored securely on your machine after running hf auth login
Token Types:
- Read Token - Can download models/datasets, read private repos
- Write Token - Can push models/datasets, create repos, modify content
- Organization Token - Can act on behalf of an organization
When Tokens Are Required
Always Required:
- Pushing models/datasets to Hub
- Accessing private repositories
- Creating new repositories
- Modifying existing repositories
- Using Hub APIs programmatically
Not Required:
- Downloading public models/datasets
- Running jobs that don't interact with Hub
- Reading public repository information
How to Provide Tokens to Jobs
Method 1: Automatic Token (Recommended)
hf_jobs("uv", {
"script": "your_script.py",
"secrets": {"HF_TOKEN": "$HF_TOKEN"} # ✅ Automatic replacement
})
How it works:
- $HF_TOKEN is a placeholder that gets replaced with your actual token
- Uses the token from your logged-in session (hf auth login)
- Most secure and convenient method
- Token is encrypted server-side when passed as a secret
Benefits:
- No token exposure in code
- Uses your current login session
- Automatically updated if you re-login
- Works seamlessly with MCP tools
Method 2: Explicit Token (Not Recommended)
hf_jobs("uv", {
"script": "your_script.py",
"secrets": {"HF_TOKEN": "hf_abc123..."} # ⚠️ Hardcoded token
})
When to use:
- Only if automatic token doesn't work
- Testing with a specific token
- Organization tokens (use with caution)
Security concerns:
- Token visible in code/logs
- Must manually update if token rotates
- Risk of token exposure
Method 3: Environment Variable (Less Secure)
hf_jobs("uv", {
"script": "your_script.py",
"env": {"HF_TOKEN": "hf_abc123..."} # ⚠️ Less secure than secrets
})
Difference from secrets:
- env variables are visible in job logs
- secrets are encrypted server-side
- Always prefer secrets for tokens
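One way to keep tokens out of env is to route values by key name when the job config is built. This is a rough sketch under stated assumptions: `split_env_and_secrets` and `SENSITIVE_KEYS` are hypothetical names, and `BATCH_SIZE` is an illustrative non-sensitive setting that does not come from the Jobs API.

```python
# Assumption: extend this set with any other credential names your jobs use.
SENSITIVE_KEYS = {"HF_TOKEN"}


def split_env_and_secrets(values: dict[str, str]) -> dict[str, dict[str, str]]:
    """Route sensitive keys to "secrets" and everything else to "env"."""
    env = {k: v for k, v in values.items() if k not in SENSITIVE_KEYS}
    secrets = {k: v for k, v in values.items() if k in SENSITIVE_KEYS}
    return {"env": env, "secrets": secrets}


config = split_env_and_secrets({"HF_TOKEN": "$HF_TOKEN", "BATCH_SIZE": "32"})
print(config)
```

The resulting "env" and "secrets" dictionaries can then be merged into the job config alongside the script.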
Using Tokens in Scripts
In your Python script, tokens are available as environment variables:
# /// script
# dependencies = ["huggingface-hub"]
# ///
import os
from huggingface_hub import HfApi
# Token is automatically available if passed via secrets
token = os.environ.get("HF_TOKEN")
# Use with Hub API
api = HfApi(token=token)
# Or let huggingface_hub auto-detect
api = HfApi() # Automatically uses HF_TOKEN env var
Best practices:
- Don't hardcode tokens in scripts
- Use os.environ.get("HF_TOKEN") to access
- Let huggingface_hub auto-detect when possible
- Verify token exists before Hub operations
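The last practice, verifying the token before any Hub operation, might look like the sketch below. `require_hf_token` is a hypothetical helper, not a huggingface_hub function; it accepts any mapping so it can be exercised without touching the real environment.

```python
import os
from typing import Mapping


def require_hf_token(environ: Mapping[str, str] = os.environ) -> str:
    """Return HF_TOKEN from the given environment, or fail with a clear message."""
    token = environ.get("HF_TOKEN", "")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; pass it to the job via secrets.")
    return token


# Illustrative call with a fake environment rather than the real os.environ.
print(require_hf_token({"HF_TOKEN": "hf_example"}))
```

Failing early with an explicit message tends to be easier to debug than a 401 raised partway through a long job.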
Token Verification
Check if you're logged in:
from huggingface_hub import whoami
user_info = whoami()  # Returns a dict of account details (e.g. user_info["name"]) if authenticated
Verify token in job:
import os
assert "HF_TOKEN" in os.environ, "HF_TOKEN not found!"
token = os.environ["HF_TOKEN"]
print(f"Token starts with: {token[:7]}...") # Should start with "hf_"
Common Token Issues
Error: 401 Unauthorized
- Cause: Token missing or invalid
- Fix: Add secrets={"HF_TOKEN": "$HF_TOKEN"} to job config
- Verify: Check hf_whoami() works locally
Error: 403 Forbidden
- Cause: Token lacks required permissions
- Fix: Ensure token has write permissions for push operations
- Check: Token type at https://huggingface.co/settings/tokens
Error: Token not found in environment
- Cause: secrets not passed or wrong key name
- Fix: Use secrets={"HF_TOKEN": "$HF_TOKEN"} (not env)
- Verify: Script checks os.environ.get("HF_TOKEN")
Error: Repository access denied
- Cause: Token doesn't have access to private repo
- Fix: Use token from account with access
- Check: Verify repo visibility and your permissions
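The status-code cases above can be condensed into a small lookup for use in job scripts or log triage. This is only a sketch: `TOKEN_ERROR_HINTS` and `diagnose` are hypothetical names, and the hint text paraphrases the causes and fixes listed in this section.

```python
# Causes and fixes taken from the troubleshooting list above.
TOKEN_ERROR_HINTS = {
    401: ("Token missing or invalid", "pass HF_TOKEN via secrets in the job config"),
    403: ("Token lacks required permissions", "use a token with write permissions for push operations"),
}


def diagnose(status_code: int) -> str:
    """Map an HTTP status code to a likely token-related cause and fix."""
    default = ("No token issue listed for this status", "check the job logs")
    cause, fix = TOKEN_ERROR_HINTS.get(status_code, default)
    return f"{status_code}: {cause}. Fix: {fix}."


print(diagnose(401))
print(diagnose(403))
```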
Token Security Best Practices
- Never commit tokens - Use $HF_TOKEN placeholder or environment variables
- Use secrets, not env - Secrets are encrypted server-side
- Rotate tokens regularly - Generate new tokens periodically
- Use minimal permissions - Create tokens with only needed permissions
- Don't share tokens - Each user should use their own token
- Monitor token usage - Check token activity in Hub settings
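When a script needs to confirm that a token is present without exposing it in logs, masking most of the value is one option. The sketch below is an illustration only: `mask_token` is a hypothetical helper, and showing four leading characters is an arbitrary choice.

```python
def mask_token(token: str, visible: int = 4) -> str:
    """Keep the first `visible` characters and replace the rest with asterisks."""
    if len(token) <= visible:
        # Too short to show any prefix safely, so hide everything.
        return "*" * len(token)
    return token[:visible] + "*" * (len(token) - visible)


# Illustrative value, not a real token.
print(mask_token("hf_abcdef"))
```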
Complete Token Example
# Example: Push results to Hub
hf_jobs("uv", {
"script": """
# /// script
# dependencies = ["huggingface-hub", "datasets"]
# ///
import os
from huggingface_hub import HfApi
from datasets import Dataset
# Verify token is available
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"
# Use token for Hub operations