freebird
Multi-provider free-AI router. Sends one task to several configured free providers in parallel, shows each provider's streaming output in its own tmux pane (opened in a new macOS Terminal window), and hands Claude a single summary.json to synthesise into one digested answer.
Trigger rules (important)
- Do not reach for this skill on your own initiative. It costs time and network quota.
- Invoke it only when the user types one of:
  - /freebird-review: code review on a file / directory (code-specialist models)
  - /freebird-extract: vision extraction from an image file (vision models)
  - /freebird-ask: general-purpose question (flagship chat models)
  - /freebird-reason: hard thinking (reasoning-specialist models only: DeepSeek R1, Phi-4-reasoning, QwQ)
  - /freebird-quick: fast consensus on a simple question (small fast models only)
  - /freebird-setup: key wizard
  - /freebird-doctor: health check on configured providers
- If the user talks about "free models" conversationally without a slash command, suggest running one — don't run it yourself.
Task profiles — which model per provider
freebird doesn't just pick any model with the right capability; each task has a hand-curated provider → model map in src/freebird/tasks.py. This means:
- /freebird-review deliberately picks dedicated code models where available (Qwen3-Coder on NVIDIA, Codestral on Mistral, Qwen2.5-Coder on Cloudflare/HF) and reasoning models on providers that have no code specialist (DeepSeek-R1-Distill on Groq/Together, Phi-4-reasoning on GitHub Models).
- /freebird-extract picks the strongest vision model per provider (Llama 3.2 90B Vision on NVIDIA, Pixtral Large on Mistral, Qwen2.5 VL 72B on HF, Llama 4 Scout on Groq).
- /freebird-ask mixes flagship chat models across families so the answers are genuinely independent rather than five Llama 3.3 restatements (Nemotron on NVIDIA, Grok-3 on GH, DeepSeek-Chat on OpenRouter, Mistral Large, Llama 405B on SambaNova, Qwen 72B on HF).
- /freebird-reason uses only reasoning specialists and bumps max_tokens to 8192 to accommodate chain-of-thought.
- /freebird-quick uses only small fast models (8B class) and caps max_tokens at 1024.
To change the per-task picks, edit src/freebird/tasks.py — one dict per task.
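As a rough illustration of what "one dict per task" could look like, the sketch below assumes each task maps a provider key to a model identifier. The names `TASK_MODELS`, `MAX_TOKENS_OVERRIDES`, and `model_for`, the provider keys, and the model ID strings are hypothetical placeholders and not the actual contents of src/freebird/tasks.py; only the provider/model pairings and the max_tokens values come from the descriptions above.

```python
# Hypothetical layout; the real src/freebird/tasks.py may be organised differently.
# Model ID strings are illustrative placeholders, not verified provider model IDs.
TASK_MODELS = {
    "code-review": {
        "nvidia": "qwen3-coder",
        "mistral": "codestral",
        "groq": "deepseek-r1-distill",
    },
    "vision-extract": {
        "nvidia": "llama-3.2-90b-vision",
        "mistral": "pixtral-large",
        "groq": "llama-4-scout",
    },
}

# Per-task token limits mentioned above: reason raises the cap, quick lowers it.
MAX_TOKENS_OVERRIDES = {"reason": 8192, "quick": 1024}


def model_for(task, provider):
    """Return the curated model for a provider, or None if the task has no pick for it."""
    return TASK_MODELS.get(task, {}).get(provider)
```

Under this assumed layout, changing a pick is a one-line edit to the relevant inner dict, and a provider left out of a task's dict is simply skipped for that task.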
CLI reference
Binary: ~/.local/bin/freebird (Python venv at ~/.freebird/venv/).
| Command | What it does |
|---|---|
| freebird setup | Interactive wizard. Auto-imports the NVIDIA key from QS Blueprint .env.blueprint. Walks remaining providers; optionally opens signup/keys pages. |
| freebird setup --non-interactive | Auto-import only, no prompts. |
| freebird doctor | Parallel ping of every configured provider (tiny chat, max_tokens=4). Reports live / rate-limited / auth-rejected / no-key per provider with latency. |
| freebird doctor --json | Machine-readable. |
| freebird providers | Static list of all known providers and their capabilities. |
| freebird query <provider> --model <m> --prompt "..." [--image PATH] --stream | Single-provider streaming query. Used internally by fanout; rarely called by Claude directly. |
| freebird fanout --task code-review --input <path> | Code review fanout (code-specialist models). |
| freebird fanout --task vision-extract --input <image> | Vision fanout (vision models). |
| freebird fanout --task ask --prompt "..." | Generic chat fanout. |
| freebird fanout --task reason --prompt "..." | Reasoning-specialist fanout (R1, Phi-4-reasoning, QwQ). |
| freebird fanout --task quick --prompt "..." | Fast small-model fanout (8B class). |
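
The fanout rows above follow one pattern: file-based tasks take --input and question-based tasks take --prompt. A minimal sketch of a wrapper that builds the matching command line is shown here; `fanout_argv` and `run_fanout` are hypothetical helpers, not part of freebird, and only the task names and flags are taken from the table.

```python
import subprocess

# Task names and flags are taken from the CLI table; the helpers are hypothetical.
INPUT_TASKS = {"code-review", "vision-extract"}  # take --input <path>
PROMPT_TASKS = {"ask", "reason", "quick"}        # take --prompt "..."


def fanout_argv(task, value):
    """Build the argument list for `freebird fanout` for one documented task."""
    if task in INPUT_TASKS:
        flag = "--input"
    elif task in PROMPT_TASKS:
        flag = "--prompt"
    else:
        raise ValueError(f"unknown fanout task: {task!r}")
    return ["freebird", "fanout", "--task", task, flag, value]


def run_fanout(task, value):
    """Run the fanout and return the CompletedProcess with captured text output."""
    return subprocess.run(
        fanout_argv(task, value),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit
    )
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains quotes or spaces.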
Fanout output contract
Every freebird fanout run creates a workspace at /tmp/freebird-YYYYMMDD-HHMMSS/:
- prompt.txt, system.txt: the inputs sent to each provider
- <provider>.md: raw streamed output per provider, with a ===FREEBIRD-META=== footer carrying status + token counts
- <provider>.sh: the script each tmux pane runs
- summary.json: the only file Claude should read after a fanout. It contains an outputs array, each entry with {provider, model, file, status, latency_ms, prompt_tokens, completion_tokens, error_detail, content}
At the end of freebird fanout, the CLI prints:
[fanout] summary: /tmp/freebird-YYYYMMDD-HHMMSS/summary.json
Parse that path, read the JSON, then synthesise — never paste raw provider output to the user.
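
The parse-then-read step can be sketched as follows. This is a minimal example that assumes the summary line appears in the captured CLI output exactly as shown above and that the workspace path contains no whitespace; the function names are hypothetical, and the rule used to decide which entries are usable (non-empty content and empty error_detail) is an assumption, since the possible status values are not listed here.

```python
import json
import re
from pathlib import Path

# Matches the final line printed by `freebird fanout`, as documented above.
SUMMARY_LINE = re.compile(
    r"^\[fanout\] summary: (?P<path>\S+summary\.json)\s*$", re.MULTILINE
)


def find_summary_path(cli_output):
    """Extract the summary.json path from captured fanout output."""
    match = SUMMARY_LINE.search(cli_output)
    if match is None:
        raise ValueError("no '[fanout] summary:' line found in CLI output")
    return Path(match.group("path"))


def load_outputs(summary_path):
    """Read summary.json and return its `outputs` array."""
    data = json.loads(Path(summary_path).read_text(encoding="utf-8"))
    return data["outputs"]


def usable_answers(outputs):
    """Keep (provider, model, content) for entries with content and no error.

    Assumption: an entry is usable when `content` is non-empty and
    `error_detail` is empty or null.
    """
    return [
        (entry["provider"], entry["model"], entry["content"])
        for entry in outputs
        if entry.get("content") and not entry.get("error_detail")
    ]
```

The returned tuples are the material to synthesise from; failed providers are dropped so their error text never leaks into the digested answer.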
Config
- Location: ~/.freebird/config.toml (chmod 600).
- Env vars override TOML: NVIDIA_API_KEY, GROQ_API_KEY, CEREBRAS_API_KEY, OPENROUTER_API_KEY, GITHUB_MODELS_API_KEY, CLOUDFLARE_API_KEY (plus CLOUDFLARE_ACCOUNT_ID), MISTRAL_API_KEY, TOGETHER_API_KEY, SAMBANOVA_API_KEY, HUGGINGFACE_API_KEY.
Providers included
NVIDIA NIM, Groq, Cerebras Cloud, OpenRouter (:free pool only), GitHub Models, Cloudflare Workers AI, Mistral La Plateforme, Together AI (-Free models), SambaNova Cloud, HuggingFace Inference.
Deliberately excluded (user pays a subscription): OpenAI/Codex, Anthropic/Claude, Google Gemini, Moonshot/Kimi.