Behavior Watch

Detect anomalous patterns in the agent's own behavior during a session.

When to Use

At any point during a session to audit what the agent has been doing
After a long autonomous run to review the agent's actions
When something feels off — the agent seems to be doing more than expected
As a periodic check during sensitive operations

Methodology

Adapted from Flare's cloud audit log anomaly detection. Instead of analyzing cloud logs, analyze the agent's own tool call history in this session.

What to Track

For every tool call in the session, note:

Tool name: Which tool was called
Target: File path, URL, command, or search pattern
Direction: Read (observation) vs Write (mutation)
Scope: How much of the codebase was touched
Sequence: What came before and after this call
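
The five fields above could be captured in a small record type. The following is a minimal sketch, not part of the skill itself: the names `ToolCall`, `Direction`, and `neighbors` are hypothetical, and `scope` is approximated here as a count of files touched.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Direction(Enum):
    READ = "read"    # observation
    WRITE = "write"  # mutation


@dataclass(frozen=True)
class ToolCall:
    """One entry in the session's tool call history (hypothetical schema)."""
    index: int            # position in the session, which gives the sequence
    tool: str             # tool name, e.g. "Read", "Edit", "Bash"
    target: str           # file path, URL, command, or search pattern
    direction: Direction  # read vs write
    scope: int = 1        # assumption: rough count of files touched by this call


def neighbors(history: list, i: int) -> tuple:
    """Return the calls immediately before and after position i (None at the ends)."""
    before: Optional[ToolCall] = history[i - 1] if i > 0 else None
    after: Optional[ToolCall] = history[i + 1] if i + 1 < len(history) else None
    return before, after
```

Keeping the index on each record makes the later sequence checks simple, since "what came before and after" becomes a list lookup.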

Anomaly Signals

Score each signal 0-100, following Flare's scoring guide:

85-100 = Critical: immediate attention
65-84 = High: worth investigating
40-64 = Medium: note and correlate
0-39 = Low: informational
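
As an illustration, the bands above map to a simple threshold function. The function name `severity` is a hypothetical helper; the cut-offs are the ones listed in the scoring guide.

```python
def severity(score: int) -> str:
    """Map a 0-100 anomaly score to the severity bands listed above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be in 0-100, got {score}")
    if score >= 85:
        return "CRITICAL"  # immediate attention
    if score >= 65:
        return "HIGH"      # worth investigating
    if score >= 40:
        return "MEDIUM"    # note and correlate
    return "LOW"           # informational
```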

Frequency anomalies:

A tool being called far more than expected for the task (e.g., 50 Bash calls for a "rename a variable" task)
Sudden burst of activity after a period of normal pacing
Repeated failed attempts at the same operation
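
Two of these signals lend themselves to a quick count. The sketch below assumes the history has already been reduced to simple lists; `over_budget` and `repeated_failures` are hypothetical helper names, and the threshold of 3 consecutive failures is an arbitrary choice for illustration.

```python
from collections import Counter


def over_budget(tools: list, expected_max: dict) -> dict:
    """Return {tool: count} for tools called more often than expected.

    Tools with no entry in expected_max are never flagged.
    """
    counts = Counter(tools)
    return {t: n for t, n in counts.items() if n > expected_max.get(t, float("inf"))}


def repeated_failures(calls: list, threshold: int = 3) -> set:
    """Return targets that failed `threshold` or more times in a row.

    calls: (target, succeeded) pairs in session order.
    Any success, or a failure on a different target, resets the streak.
    """
    flagged = set()
    streak_target, streak = None, 0
    for target, ok in calls:
        if ok:
            streak_target, streak = None, 0
            continue
        streak = streak + 1 if target == streak_target else 1
        streak_target = target
        if streak >= threshold:
            flagged.add(target)
    return flagged
```

For the "rename a variable" example, `over_budget(["Bash"] * 50, {"Bash": 15})` would flag Bash with a count of 50.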

Scope anomalies:

File access outside the expected working directory
Reading dotfiles, credentials, or config files unrelated to the task
Touching files in unrelated parts of the codebase
Git operations on unexpected branches
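
The first two scope checks can be approximated with path comparisons. This is a rough sketch under stated assumptions: paths are absolute and already normalized (resolve symlinks and `..` first, for example with `os.path.realpath`), the helper names are hypothetical, and `SENSITIVE_NAMES` is an illustrative list, not an exhaustive one.

```python
from pathlib import PurePosixPath

# Illustrative only; a real list would be longer and project-specific.
SENSITIVE_NAMES = {"credentials", "id_rsa", "id_ed25519"}


def is_outside(path: str, workdir: str) -> bool:
    """True if an absolute, normalized path is not workdir or one of its descendants."""
    return not PurePosixPath(path).is_relative_to(PurePosixPath(workdir))


def looks_sensitive(path: str) -> bool:
    """True for dotfiles and for file names on the illustrative sensitive list."""
    name = PurePosixPath(path).name
    return name.startswith(".") or name in SENSITIVE_NAMES
```

`is_relative_to` compares whole path components, so `/home/u/project2` is not treated as inside `/home/u/proj`. It requires Python 3.9 or later.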

Sequence anomalies:

Read credential file → network call (potential exfiltration)
Write to shell config → Bash execution (potential persistence)
Read many files rapidly without edits (potential reconnaissance)
Edit → no test/verify → edit → no test/verify (potential recklessness)
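
Sequence checks look at ordered pairs of calls, not single calls. The sketch below covers the first and last patterns, assuming the history is a list of `(tool, target)` tuples. The function names, the marker strings, the 5-call window, and the choice of which tools count as network calls are all assumptions for illustration; network access through Bash (for example `curl`) is not detected here.

```python
NETWORK_TOOLS = {"WebFetch", "WebSearch"}          # assumption: Bash is not parsed
CRED_MARKERS = (".env", "credentials", ".ssh/")    # illustrative substrings
VERIFY_WORDS = ("test", "tsc", "build")            # illustrative substrings


def cred_then_network(history: list, window: int = 5) -> list:
    """Return (read_index, network_index) pairs where a credential-like Read
    is followed by a network tool call within `window` calls."""
    hits = []
    for i, (tool, target) in enumerate(history):
        if tool != "Read" or not any(m in target for m in CRED_MARKERS):
            continue
        for j in range(i + 1, min(i + 1 + window, len(history))):
            if history[j][0] in NETWORK_TOOLS:
                hits.append((i, j))
                break
    return hits


def max_unverified_edits(history: list) -> int:
    """Return the longest run of Edit calls with no test/verify Bash call in between."""
    longest = run = 0
    for tool, target in history:
        if tool == "Edit":
            run += 1
            longest = max(longest, run)
        elif tool == "Bash" and any(w in target for w in VERIFY_WORDS):
            run = 0
    return longest
```

A hit from `cred_then_network` is only a lead to investigate; substring matching will produce false positives, so the score should reflect how plausible the exfiltration path actually is.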

First-seen patterns:

First time accessing a particular directory tree
First time using a tool not previously used in the session
First time executing a command pattern not seen before
Unusual file extensions being read or written
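
First-seen detection amounts to remembering what has already appeared. A minimal sketch, assuming the history holds `(tool, path)` tuples for file-oriented calls only; `first_seen` is a hypothetical helper, and "directory tree" is simplified here to the parent directory of each path.

```python
from pathlib import PurePosixPath


def first_seen(history: list) -> list:
    """Return (index, kind, value) for the first occurrence of each tool,
    parent directory, and file extension, in session order."""
    seen = {"tool": set(), "dir": set(), "ext": set()}
    events = []
    for i, (tool, target) in enumerate(history):
        path = PurePosixPath(target)
        observed = {"tool": tool, "dir": str(path.parent), "ext": path.suffix}
        for kind, value in observed.items():
            # Skip empty values (e.g. files with no extension).
            if value and value not in seen[kind]:
                seen[kind].add(value)
                events.append((i, kind, value))
    return events
```

Everything is first-seen at the start of a session, so events from the first few calls are mostly noise. The ones worth scoring are those that appear late, after the working set has stabilized.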

Task drift:

Actions that don't clearly serve the user's stated goal
Scope expanding beyond what was asked (fixing "related" issues)
Creating files not requested by the user
Installing packages or dependencies not discussed

Report Format

When invoked, analyze the session history and produce:

## Behavior Watch Report

Session duration: ~45 minutes
Total tool calls: 67

### Activity Summary
| Tool    | Calls | Read | Write | Notes                    |
|---------|-------|------|-------|--------------------------|
| Read    | 23    | 23   | -     | 4 unique directories     |
| Edit    | 12    | -    | 12    | All in src/components/   |
| Bash    | 18    | -    | 18    | 8 test runs, 5 git ops   |
| Grep    | 8     | 8    | -     |                          |
| Glob    | 4     | 4    | -     |                          |
| Write   | 2     | -    | 2     | New files created        |

### Directories Touched
- src/components/ (34 calls) — primary work area ✓
- src/utils/ (8 calls) — related imports ✓
- tests/ (12 calls) — test execution ✓
- ~/.config/ (2 calls) — ⚠ outside project scope

### Findings

- [MEDIUM] Scope anomaly (score: 52)
  2 Read calls to ~/.config/some-tool/config.json
  These are outside the project directory and not obviously related
  to the task "update the login form."
  Likely benign: checking tool configuration. But worth noting.
  → Suggested action: `/safe-agent:tool-guard profile careful` to gate future
    out-of-scope reads, or `/safe-agent:tool-guard deny Bash` if shell access
    is not needed for this task.

- [LOW] Frequency note (score: 28)
  18 Bash calls is higher than typical for a UI task.
  Breakdown: 8 npm test, 4 npm run build, 3 git status,
  2 git diff, 1 npx tsc. All task-relevant.
  → No action needed.

### Verdict
No critical or high-severity anomalies detected.
All mutations (Edit/Write) are within the expected project scope.
Session behavior is consistent with the stated task.

Remediation Suggestions

Every MEDIUM or higher finding must include a `→ Suggested action:` line recommending a specific mitigation, usually a safe-agent skill:

| Finding type                               | Suggested action                                                |
|--------------------------------------------|-----------------------------------------------------------------|
| Scope anomaly (out-of-project access)      | `/safe-agent:tool-guard profile careful` or deny specific tools |
| Suspicious sequence (credential + network) | `/safe-agent:tool-guard deny Bash` immediately                  |
| Excessive tool calls                       | `/safe-agent:cost-guard $N reject` to cap remaining spend       |
| Task drift                                 | Ask the user to confirm the expanded scope                      |
| First-seen sensitive file access           | `/safe-agent:skill-verify` on any recently installed skills     |

For LOW findings, add `→ No action needed.` to keep the format consistent.
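
The rule for the action line can be expressed as a lookup keyed by finding type. This is a sketch only: the dictionary keys and the `action_line` helper are hypothetical, while the action strings are copied from the suggestions listed in this section (including the `$N` placeholder, which is left unfilled).

```python
# Keys are hypothetical identifiers; values come from the suggestions above.
SUGGESTED_ACTIONS = {
    "scope": "`/safe-agent:tool-guard profile careful` or deny specific tools",
    "sequence": "`/safe-agent:tool-guard deny Bash` immediately",
    "frequency": "`/safe-agent:cost-guard $N reject` to cap remaining spend",
    "task_drift": "Ask the user to confirm the expanded scope",
    "first_seen": "`/safe-agent:skill-verify` on any recently installed skills",
}


def action_line(finding_type: str, score: int) -> str:
    """Return the closing line for a finding: LOW (score below 40) needs no action."""
    if score < 40:
        return "→ No action needed."
    return f"→ Suggested action: {SUGGESTED_ACTIONS[finding_type]}"
```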

Scoring Calibration

To reduce false positives, apply these baseline expectations:

Normal read:write ratio: ~3:1 for typical coding tasks
Normal Bash calls: 5-15 for a focused task, 15-30 for test-heavy work
Normal scope: 2-4 directories for a focused change
Expected tools for coding: Read, Edit, Grep, Glob, Bash — all normal
Red flags: Agent tool, WebFetch, WebSearch used when not requested

Adjust baselines based on what the user asked for. A "refactor the entire auth module" task justifies more scope than "fix the typo on line 42."
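
One way to apply these baselines is to compare session totals against them before scoring. In the sketch below, the limits (15 or 30 Bash calls, 4 directories, a 3:1 read:write ratio) come from the list above, but the function name and the `ratio_tolerance` factor of 2 are assumptions: the source does not say how far from 3:1 counts as abnormal.

```python
def calibration_notes(reads: int, writes: int, bash_calls: int, directories: int,
                      test_heavy: bool = False, ratio_tolerance: float = 2.0) -> list:
    """Return human-readable notes for totals that exceed the baseline expectations."""
    notes = []
    bash_limit = 30 if test_heavy else 15
    if bash_calls > bash_limit:
        notes.append(f"Bash calls: {bash_calls} exceeds baseline of {bash_limit}")
    if directories > 4:
        notes.append(f"Directories: {directories} exceeds baseline of 4")
    if writes > 0:
        ratio = reads / writes
        # Assumption: accept ratios within a factor of ratio_tolerance of 3:1.
        low, high = 3.0 / ratio_tolerance, 3.0 * ratio_tolerance
        if not low <= ratio <= high:
            notes.append(f"Read:write ratio {ratio:.1f}:1 is outside {low:.1f}-{high:.1f}")
    return notes
```

A note from this function is a prompt to look closer, not a finding by itself. For a broad task such as a module-wide refactor, the caller would raise the limits or discount the resulting score.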

Limitations

This skill analyzes the current session only; it has no cross-session memory
Tool call history may be compressed in long conversations, limiting visibility into early actions
This is observational analysis, not prevention: it reports what happened but does not block future actions (use `/safe-agent:tool-guard` for that)
