
Horrific comparison point. LLM inference is far more expensive per token for a single local user than for batched inference at scale on datacenter GPUs/TPUs.


How is that horrific? If local inference is the more expensive option, then the local figure sets an upper bound on the cost, and that bound turns out to be not very high.
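A quick back-of-envelope makes the upper-bound point concrete. The Python sketch below uses entirely assumed numbers (GPU power draw, single-user decode speed, residential electricity price); it only illustrates the direction of the inequality, not real prices.

    # Back-of-envelope sketch of the upper-bound argument.
    # All numbers below are illustrative assumptions, not measurements.

    gpu_power_w = 350           # assumed draw of one local GPU (W)
    tokens_per_sec = 30         # assumed single-user decode speed (tok/s)
    electricity_usd_kwh = 0.15  # assumed residential electricity price

    # Energy per token (kWh), then cost per million tokens.
    kwh_per_token = (gpu_power_w / 1000) / (tokens_per_sec * 3600)
    usd_per_mtok = kwh_per_token * electricity_usd_kwh * 1_000_000

    print(f"local energy cost: ~${usd_per_mtok:.2f} per million tokens")
    # ~$0.49/Mtok with these numbers. Batched datacenter inference
    # amortizes the same hardware across many concurrent requests, so
    # its marginal cost per token should land at or below this
    # single-user figure.

With these assumptions the single-user energy cost is about $0.49 per million tokens, so whatever the datacenter actually pays per token, it sits at or below a number that is already small.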



