Execute a task with automated verification. Classifies the goal, builds the cheapest possible verifier (deterministic test, regex/command check, LLM judge, or a hybrid of these), runs the work, and iterates until the verifier passes or a hard cap is hit. Use when a task has a checkable success criterion and you want confidence the output actually meets it — e.g. "write a function that does X", "dr
Development #llm #test by klittle32
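The loop in the description (build a verifier, run the work, retry until the verifier passes or a cap is hit) can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `run_with_verification`, `regex_verifier`, `Result`, `toy_worker`, and the `max_attempts` parameter are all hypothetical names, and only the regex style of check is shown.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical shapes (assumptions, not taken from the skill itself):
# a verifier maps an output to (passed, feedback); a worker maps
# (goal, previous feedback) to a new output.
Verifier = Callable[[str], Tuple[bool, str]]
Worker = Callable[[str, Optional[str]], str]


@dataclass
class Result:
    output: str
    passed: bool
    attempts: int


def regex_verifier(pattern: str) -> Verifier:
    """Build a cheap deterministic check: pass if the pattern is found."""
    compiled = re.compile(pattern)

    def check(output: str) -> Tuple[bool, str]:
        if compiled.search(output):
            return True, ""
        return False, f"output does not match {pattern!r}"

    return check


def run_with_verification(
    goal: str, worker: Worker, verifier: Verifier, max_attempts: int = 3
) -> Result:
    """Run the worker until the verifier passes or the hard cap is reached."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    feedback: Optional[str] = None
    output = ""
    for attempt in range(1, max_attempts + 1):
        output = worker(goal, feedback)       # do the work
        passed, feedback = verifier(output)   # check it
        if passed:
            return Result(output, True, attempt)
    # Cap hit: return the last output, marked as unverified.
    return Result(output, False, max_attempts)


def toy_worker(goal: str, feedback: Optional[str]) -> str:
    """Stand-in for the real work: fails once, then uses the feedback."""
    return "def add(a, b): return a + b" if feedback else "TODO"


result = run_with_verification(
    "write a function that adds two numbers",
    toy_worker,
    regex_verifier(r"def add\("),
)
print(result.passed, result.attempts)
```

In this sketch, feeding the verifier's failure message back into the next attempt is what separates iterating from blind retrying. A test-based or LLM-judge verifier would fit the same `Verifier` shape.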