Red-Team Eval Authoring
When To Use
- Adding a new red-team plugin or grader.
- Editing attack templates, rubric tags, or plugin metadata.
- Reviewing multimodal or tool-use safety evals for false positives/negatives.
Requirements / Checks
- Confirm the target eval framework and repo layout before editing.
- Prefer deterministic shape checks for templates before adding model-graded rubrics.
- Ask before running networked evals, paid model graders, or large red-team suites.
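As one illustration of the deterministic shape checks mentioned above, the sketch below validates an attack template before any model-graded rubric is involved. It assumes templates are plain dictionaries with `id`, `prompt`, and `tags` fields and `{{name}}` placeholders; that schema, the allowed placeholder names, and the `check_template_shape` helper are all hypothetical and should be adapted to whatever the target eval framework actually uses.

```python
import re

# Hypothetical schema: these field names and types are assumptions for
# illustration, not taken from any specific red-team framework.
REQUIRED_FIELDS = {"id": str, "prompt": str, "tags": list}

# Hypothetical set of placeholder names a template may reference.
ALLOWED_PLACEHOLDERS = {"goal", "persona"}

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER_RE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def check_template_shape(template: dict) -> list[str]:
    """Return a list of shape problems; an empty list means the template passes."""
    problems = []

    # Every required field must be present and have the expected type.
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in template:
            problems.append(f"missing field: {field}")
        elif not isinstance(template[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )

    # The prompt must be non-empty and use only known placeholders.
    prompt = template.get("prompt")
    if isinstance(prompt, str):
        if not prompt.strip():
            problems.append("prompt is empty")
        for name in PLACEHOLDER_RE.findall(prompt):
            if name not in ALLOWED_PLACEHOLDERS:
                problems.append(f"unknown placeholder: {name}")

    # Tags must be a list of strings.
    tags = template.get("tags")
    if isinstance(tags, list) and not all(isinstance(t, str) for t in tags):
        problems.append("tags must all be strings")

    return problems
```

A check like this runs offline, costs nothing, and returns the same result on every run, so it can gate template edits in CI before any networked or paid grader is invoked.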
Workflow
[Description truncated. See the full README on GitHub.]