eval-driven ai development
assertions on ai output, not vibes
Build a small eval runner. The starter has a list of cases (each with name, input, and expected) and a stub classify(text) that returns "yes" or "no" based on whether the text contains "cancel". Write a function run_suite(cases) that:
- Calls classify(case["input"]) for each case.
- Compares the lowercased actual to the lowercased expected.
- Returns a tuple (passed, total).
Then call run_suite(cases) and print the result as passed/total.
Expected output:
3/3
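One possible solution, as a minimal sketch: the exact contents of the starter's cases list aren't shown here, so the three example cases below are hypothetical, and classify is a keyword stub as described above.

```python
# Hypothetical cases; the starter's actual list may differ.
cases = [
    {"name": "cancel request", "input": "Please cancel my order", "expected": "yes"},
    {"name": "status check", "input": "Where is my package?", "expected": "no"},
    {"name": "mixed case", "input": "CANCEL the subscription", "expected": "yes"},
]

def classify(text):
    # Stub: "yes" if the text mentions "cancel" (case-insensitive), else "no".
    return "yes" if "cancel" in text.lower() else "no"

def run_suite(cases):
    # Count cases whose classifier output matches the expected label,
    # comparing both sides lowercased.
    passed = sum(
        1 for case in cases
        if classify(case["input"]).lower() == case["expected"].lower()
    )
    return passed, len(cases)

passed, total = run_suite(cases)
print(f"{passed}/{total}")  # → 3/3
```

Returning a (passed, total) tuple rather than printing inside run_suite keeps the runner reusable: callers can format the result, fail CI when passed < total, or aggregate across suites.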