showdown-claude-skill

You are executing the /showdown skill. The user wants to pit different LLMs against each other with the same prompt.

Prerequisites Check

First, verify CLIProxyAPI is running:

curl -s -o /dev/null -w "%{http_code}" http://localhost:8317/v1/models 2>/dev/null || echo "NOT_RUNNING"

If the output contains NOT_RUNNING (curl could not connect), tell the user:

CLIProxyAPI is not running. Start it with "brew services start cliproxyapi" or "cliproxyapi", then re-run /showdown.

If running, proceed.
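As a rough sketch of how the probe output could be interpreted: curl writes 000 for %{http_code} when it gets no response, so a failed probe typically prints "000NOT_RUNNING" rather than a bare "NOT_RUNNING". The helper names classify_probe and probe_proxy below are hypothetical and not part of the skill's scripts, and treating any HTTP status as "running" is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify the text printed by the curl probe.
# Assumption: any real HTTP status means something answered on the port.
classify_probe() {
  local out="$1"
  case "$out" in
    *NOT_RUNNING*|000|"") echo "not_running" ;;
    [1-5][0-9][0-9])      echo "running" ;;
    *)                    echo "unknown" ;;
  esac
}

# Hypothetical wrapper: run the same probe as above and classify it.
probe_proxy() {
  local out
  out=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8317/v1/models 2>/dev/null || echo "NOT_RUNNING")
  classify_probe "$out"
}
```

Calling probe_proxy prints one word, which makes the "running or not" decision explicit instead of relying on the raw curl output.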

Execute Comparison

Run the showdown script with the user's prompt:

bash ~/.claude/skills/showdown/scripts/showdown.sh "$ARGUMENTS"

The script returns a JSON array with results from each enabled model. Each entry contains:

provider: Human-readable provider name
model: Model ID used
response: The model's response text
duration_seconds: How long the request took
error: Error message if the request failed (null otherwise)
status_code: HTTP status code

Format the Output

Present the results as a structured comparison:

For each model response, use this format:

## <Provider Name> (<model-id>) - <duration>s

<response content>

After all responses, add the Comparison Analysis (FIXED TEMPLATE):

You MUST use this exact template structure for the analysis. Do not deviate.

## Comparison Analysis

### Agreement
- <bullet points where models broadly agree>

### Disagreements

| Topic | Claude | GPT | Gemini |
|-------|--------|-----|--------|
| <point of divergence> | <stance> | <stance> | <stance> |
| ... | ... | ... | ... |

### Style & Approach

| Dimension | Claude | GPT | Gemini |
|-----------|--------|-----|--------|
| Tone | <description> | <description> | <description> |
| Length | <description> | <description> | <description> |
| Structure | <description> | <description> | <description> |
| Use of examples | <description> | <description> | <description> |

### Best Response
**Winner:** <model name>
**Reasoning:** <2-3 sentences explaining why>

### Additional Observations
<optional: topic-specific insights that don't fit the template above>

Error handling:

If a model failed, show: **<Provider>**: Failed - <error message>
If all models failed, suggest checking CLIProxyAPI status and authentication
If only some failed, show successful responses and note failures
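The per-model format and the error rules above can be sketched in shell as follows. This is only an illustration under stated assumptions: render_result and failure_note are hypothetical helpers, the field values are passed in as plain arguments (extracting them from the JSON is left out), and a missing error is assumed to arrive as an empty string or the literal text null.

```shell
#!/usr/bin/env bash
# Hypothetical helper: render_result PROVIDER MODEL DURATION ERROR RESPONSE
# Prints the failure line when ERROR is set, otherwise the response section.
render_result() {
  local provider="$1" model="$2" duration="$3" error="$4" response="$5"
  if [ -n "$error" ] && [ "$error" != "null" ]; then
    printf '**%s**: Failed - %s\n' "$provider" "$error"
  else
    printf '## %s (%s) - %ss\n\n%s\n' "$provider" "$model" "$duration" "$response"
  fi
}

# Hypothetical helper: failure_note FAILED_COUNT TOTAL_COUNT
# Prints nothing when every model succeeded.
failure_note() {
  local failed="$1" total="$2"
  if [ "$total" -gt 0 ] && [ "$failed" -eq "$total" ]; then
    echo "All models failed. Check CLIProxyAPI status and authentication."
  elif [ "$failed" -gt 0 ]; then
    echo "$failed of $total models failed; showing the successful responses."
  fi
}
```

Keeping the two concerns separate mirrors the rules: each entry is rendered on its own, and the all-failed versus some-failed note is decided once from the counts.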

Save Output (Prompt User)

After presenting the comparison, always use AskUserQuestion to ask the user whether they want to save the full output as a markdown file. Include a second question asking whether they want to run judge mode.

If the user says yes to saving, save to ./showdown-output/ in the current working directory:

Create the ./showdown-output/ directory if it doesn't exist (mkdir -p)
Generate filename: showdown-YYYY-MM-DD-HHMMSS.md using the current timestamp
Write a markdown file with this structure:

# Showdown: <short summary of the prompt topic>

**Date:** <YYYY-MM-DD HH:MM TZ>
**Models:** <list of models used>
**Prompt:**

> <the original user prompt, blockquoted>

---

## <Provider 1> (<model-id>) - <duration>s

<full response exactly as returned, preserving all formatting>

---

## <Provider 2> (<model-id>) - <duration>s

<full response exactly as returned, preserving all formatting>

---

## <Provider 3> (<model-id>) - <duration>s

<full response exactly as returned, preserving all formatting>

---

## Comparison Analysis

### Agreement
<bullet points>

### Disagreements
| Topic | Claude | GPT | Gemini |
|-------|--------|-----|--------|
| ... | ... | ... | ... |

### Style & Approach
| Dimension | Claude | GPT | Gemini |
|-----------|--------|-----|--------|
| ... | ... | ... | ... |

### Best Response
**Winner:** <model>
**Reasoning:** <explanation>

### Additional Observations
<if any>

Tell the user the file path after saving.

Important: The saved markdown must contain the complete, unabridged responses from each model AND the full comparison analysis, formatted exactly as presented in the conversation. Do not summarize or truncate.
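The directory and filename steps can be sketched as below. The function name showdown_output_path is hypothetical; the directory and the showdown-YYYY-MM-DD-HHMMSS.md pattern come from the steps above, and the date format string is one way to produce that timestamp.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create ./showdown-output/ if needed and print a
# timestamped file path of the form showdown-YYYY-MM-DD-HHMMSS.md.
showdown_output_path() {
  local dir="./showdown-output"
  mkdir -p "$dir" || return 1
  printf '%s/showdown-%s.md\n' "$dir" "$(date +%Y-%m-%d-%H%M%S)"
}
```

The printed path is what gets reported back to the user once the file has been written.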

If the user also wants to run judge mode, tell them to run /showdown judge.