promptdojo_

Why most beginner agents die in production — and how to pick one that ships — step 6 of 7

Write score_project(spec) that takes a project spec (dict) and returns a dict with two fields:

  • score: integer 0-100, higher is better
  • verdict: string, one of:
    • "ship it" if score >= 80
    • "narrow the scope" if score >= 60
    • "pick something else" if score >= 40
    • "don't build this" if score < 40

Score the spec on the five wedge signals from step 04. Each signal that PASSES adds 20 points:

  • volume_per_day >= 10: +20 (volume signal)
  • output_schema_defined: +20 (structured outcome signal)
  • has_rubric: +20 (explicit rules signal)
  • users_count >= 1 and users_count <= 10: +20 (single user/team signal — exactly one team, not "everyone")
  • eval_method_defined: +20 (observable success signal)

Two projects run. Expected output:

PR review agent: {'score': 100, 'verdict': 'ship it'}
AI life coach:   {'score': 0, 'verdict': "don't build this"}

full-screen editor opens — close anytime to keep reading.