Skill

Evaluation Loops

Systematic checks that guide iteration across code, UX, and agent behavior.

Evaluation loops make research claims inspectable by connecting each change to evidence gathered from tests, builds, traces, screenshots, or review notes.
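
One way to picture the link between a change and its evidence is a small record type, sketched below. This is an illustration under stated assumptions, not a prescribed schema: the names `Evidence`, `Change`, `is_supported`, and `unsupported` are hypothetical, and the rule that a change needs at least one piece of evidence, all of it passing, is one possible policy.

```python
from dataclasses import dataclass, field

# Evidence kinds taken from the text above; the record layout is an assumption.
EVIDENCE_KINDS = {"test", "build", "trace", "screenshot", "review_note"}


@dataclass
class Evidence:
    kind: str      # one of EVIDENCE_KINDS
    summary: str   # what was observed
    passed: bool   # whether the observation supports the change

    def __post_init__(self) -> None:
        # Reject kinds outside the known set so typos do not pass silently.
        if self.kind not in EVIDENCE_KINDS:
            raise ValueError(f"unknown evidence kind: {self.kind!r}")


@dataclass
class Change:
    description: str
    evidence: list[Evidence] = field(default_factory=list)


def is_supported(change: Change) -> bool:
    """Hypothetical policy: supported only if evidence exists and all of it passed."""
    return bool(change.evidence) and all(e.passed for e in change.evidence)


def unsupported(changes: list[Change]) -> list[str]:
    """Descriptions of changes that still need evidence or another iteration."""
    return [c.description for c in changes if not is_supported(c)]
```

With records like these, a loop iteration could end by calling `unsupported(changes)` and treating any returned description as work still open, either because no evidence was gathered or because some of it failed.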