SSH Server Management
<instructions>Manage remote Linux servers — connect, configure, deploy, administer with safety gates and persistent config.
Robustness Rules (MANDATORY — apply to ALL phases)
Fail-Fast
| Rule | Applies to |
|---|---|
Every Bash call MUST end with && echo "OK ..." || echo "FAILED ..." | ALL scripts |
On FAILED — stop current phase, report error to user, DO NOT retry same command blindly | ALL phases |
SSH commands MUST use -o ConnectTimeout=10 -o BatchMode=yes | ALL SSH calls |
| Max 2 retries per failed operation. After 2nd failure — report and stop | ALL phases |
| If a script exits non-zero — read its stderr, diagnose, fix root cause, then retry ONCE | Scripts |
Loop Protection
| Rule | Limit |
|---|---|
| Phase 2 (Connection Setup) — max 3 key attempts, then ask user | 3 keys |
| Phase 2 → Phase 5 round-trips — if sent back to Phase 2 more than once, stop and report | 1 re-entry |
| Phase 5 (Execute) — max 5 SSH commands per invocation. If task needs more, delegate to ssh-admin agent via Task | 5 commands |
| update-agent mode — max 3 servers per run. If more, process first 3 and report remaining | 3 servers |
| AskUserQuestion — max 3 questions per phase. If skill needs more info, summarize what's missing in one combined question | 3 per phase |
Timeouts
| Operation | Timeout | Action on timeout |
|---|---|---|
| SSH connection test | 10s (ConnectTimeout=10) | Report "Server unreachable", stop |
| server-discover.sh | 30s (timeout 30 bash ...) | Report partial results, continue |
| Any single SSH command | 60s (timeout 60 ssh ...) | Kill, report "Command timed out", ask user |
| Entire skill invocation | Do not exceed 15 SSH calls total | Stop, report progress, suggest continuing manually |
Fallback Strategy
If a script fails and cannot be fixed:
- Report the exact error to user: script name, exit code, stderr
- Attempt the same operation manually (inline Bash) — scripts are helpers, not gatekeepers
- If manual fallback also fails — report both attempts and ask user what to do
- NEVER silently swallow errors or continue with stale/missing data
Manual fallback examples:
| Failed script | Manual alternative |
|---|---|
| detect-mode.sh | Parse $ARGUMENTS yourself — keyword matching is simple |
| ssh-env-check.sh | Run ls ~/.ssh/id_* 2>/dev/null, ssh-add -l, cat ~/.ssh/config |
| server-discover.sh | Run individual commands via SSH: uname -a, docker version, df -h |
| claude-local-ops.sh | Read/write CLAUDE.local.md directly with Read/Edit tools |
Error Reporting (MANDATORY)
On ANY failure — before stopping or asking user — output:
SCRIPT_ERROR: <script-name>
EXIT_CODE: <code>
STDERR: <error message>
PHASE: <current phase>
ACTION: <what was attempted>
FALLBACK: <what will be tried next OR "asking user">
This is non-negotiable. Silent failures are bugs.
Phase 0: Mode Detection (MANDATORY FIRST STEP)
EXECUTE using Bash tool:
bash "${CLAUDE_SKILL_DIR}/scripts/detect-mode.sh" "$ARGUMENTS"
Output format:
ARGS: [arguments received]
MODE: [detected mode]
Use the MODE value and GOTO that mode section below.
Mode Reference
| Keyword in args | MODE |
|---|---|
| setup, new server, add server | setup |
| connect to, ssh to, login | connect |
| configure, config, harden | configure |
| update agent, refresh agent, refresh | update-agent |
| (any other text) | execute |
| (empty, no servers configured) | setup |
| (empty, servers configured) | execute (prompt user) |
Phase 1: Environment & Config Check
Runs for ALL modes before branching.
EXECUTE using Bash tool:
bash "${CLAUDE_SKILL_DIR}/scripts/ssh-env-check.sh" && echo "OK env-check" || echo "FAILED env-check"
STOP if FAILED -- fix SSH environment before continuing.
Parse output key=value pairs. Note available keys and ssh-agent status.
Load Existing Server Config
EXECUTE using Bash tool:
bash "${CLAUDE_SKILL_DIR}/scripts/claude-local-ops.sh" list 2>/dev/null || echo "NO_SERVERS"
Branching logic:
| Condition | Action |
|---|---|
NO_SERVERS AND mode=setup | GOTO Phase 2: Connection Setup |
NO_SERVERS AND mode=execute/connect | GOTO Phase 2 (need server first) |
1 server AND mode=connect/execute | Use as default, GOTO Phase 5 |
Multiple servers AND mode=connect/execute | AskUserQuestion: which server? Then GOTO Phase 5 |
mode=setup (servers exist) | GOTO Phase 2 (adding new server) |
mode=configure | AskUserQuestion: which server? Then GOTO Phase 5 |
mode=update-agent | GOTO Mode: update-agent |
Phase 2: Connection Setup
Step 1: Gather Connection Info
Use AskUserQuestion:
header: "SSH Server Setup"
question: "Provide connection details for the new server."
Collect via follow-up questions if not in $ARGUMENTS:
- Host (IP or hostname) -- REQUIRED
- User (default: deploy) -- REQUIRED
- Port (default: 22) -- optional
- Server name (short alias, e.g., vps-main) -- REQUIRED
Step 2: Key Discovery & Auth
EXECUTE using Bash tool -- try existing keys:
bash "${CLAUDE_SKILL_DIR}/scripts/ssh-env-check.sh" && echo "OK keys" || echo "FAILED keys"
Parse available keys from output. Try connection with each key (ed25519 first, then rsa, then ecdsa):
EXECUTE using Bash tool:
ssh-keyscan -p PORT HOST >> ~/.ssh/known_hosts 2>/dev/null && echo "OK keyscan" || echo "FAILED keyscan"
Replace PORT and HOST with actual values.
EXECUTE using Bash tool -- test key auth:
ssh -o BatchMode=yes -o ConnectTimeout=10 -p PORT USER@HOST echo "OK auth" 2>/dev/null || echo "FAILED auth"
Step 3: If Key Auth Fails
Use AskUserQuestion:
header: "SSH Authentication"
question: "Key authentication failed. Choose auth method:"
options:
- label: "Password login (will set up key auth)"
description: "Connect with password, then install SSH key"
- label: "Specify key path"
description: "Provide path to an existing private key"
- label: "Cancel"
description: "Abort server setup"
If password login:
- Generate dedicated key:
EXECUTE using Bash tool:
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_SERVERNAME -N "" -C "claude@SERVERNAME" && echo "OK keygen" || echo "FAILED keygen"
Replace SERVERNAME with the server name alias.
- Instruct user to copy key manually:
Interactive command required. Run this in your terminal:
! ssh-copy-id -i ~/.ssh/id_ed25519_SERVERNAME.pub -p PORT USER@HOSTThis requires password entry which Claude Code cannot do non-interactively.
- After user confirms, verify:
EXECUTE using Bash tool:
ssh -o BatchMode=yes -o ConnectTimeout=10 -i ~/.ssh/id_ed25519_SERVERNAME -p PORT USER@HOST echo "OK key-auth" 2>/dev/null || echo "FAILED key-auth"
STOP if FAILED -- key auth must work before proceeding.
Step 4: SSH Config Entry
EXECUTE using Bash tool:
grep -q "^Host SERVERNAME$" ~/.ssh/config 2>/dev/null && echo "EXISTS" || echo "NEW"
If NEW, add config entry using Edit/Write to ~/.ssh/config:
Host SERVERNAME
HostName HOST
User USER
Port PORT
IdentityFile ~/.ssh/id_ed25519_SERVERNAME
StrictHostKeyChecking accept-new
Step 5: Final Connection Test
EXECUTE using Bash tool:
ssh -o BatchMode=yes -o ConnectTimeout=10 SERVERNAME echo "OK connection" 2>/dev/null || echo "FAILED connection"
STOP if FAILED -- connection must work before discovery.
Phase 3: Server Discovery
EXECUTE using Bash tool:
bash "${CLAUDE_SKILL_DIR}/scripts/server-discover.sh" "USER@HOST" PORT && echo "OK discovery" || echo "FAILED discovery"
Replace USER@HOST and PORT with actu