Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Good point on the green/red dashboard. The opportunity cost angle is worth adding though. A failed run isn't just the wasted tokens and retry cost - it's also the task that didn't get done and the engineering required to diagnose why. On anything time-sensitive, that compounds fast.


Exactly. At the moment it's close enough to be a wash for some cases, or tilts seriously one direction or other for others. I expect improved harnesses means more and more we'll just be able to re-run a couple of times, and fall back to "escalating" to Sonnet or even Opus, but whenever it involves egineering time, that's a big deal.




Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: