Published skills
experiment-set-design
Use when designing experiments to test whether a Claude Code skill is effective, or when planning how to validate a new or improved skill
Development #ai #test by jhhuh
blind-skill-assessment
Use when comparing two versions of agent output to determine which is better, or when evaluating whether a skill produces higher quality results than baseline
Development #ai by jhhuh
iterative-skill-refinement
Use when a skill exists but blind assessment shows it underperforms, or when iteratively improving a skill through experiment cycles while avoiding overfitting to fixed benchmarks
Development #ai by jhhuh