
Yeah, the JSON token counts are super misleading. I run a bunch of Claude agents for automation, and about 85% of input tokens end up being cached reads, which cost 1/10th of the sticker price. So your $200k number is probably closer to $25-30k in real cost, and that's before you factor in that Anthropic's own infra is way cheaper than retail API pricing. The $5k Forbes number was always nonsense, but even the "corrected" estimates in TFA are probably still too high IMO.



I proxy all of my LLM completion subscriptions. In a typical 7-day span:

  model            completions  read        write      cached_read    cache_write
  claude-opus-4-6  11,000       16,900,000  5,840,000  1,312,000,000  66,120,000


17M uncached reads (input) and 6M uncached writes (output), priced per MTok, comes to:

  $5x17+$25x6=$235 for Opus 4.6

  $2x17+$12x6=$106 for Gemini 3 Pro

  $0.60x17+$3.6x6=$31.80 for Qwen3.5 397B-A17B via Huggingface API
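For anyone who wants to plug in their own numbers, here is a minimal sketch of the same arithmetic. The per-MTok prices and the rounded 17M/6M token counts are taken straight from the lines above; the function and variable names are just illustrative, and the prices are not checked against any current price list.

```python
# Prices in USD per million tokens (MTok), as quoted in the comment above:
# (input price, output price). Not verified against any provider's price list.
PRICES = {
    "Opus 4.6": (5.00, 25.00),
    "Gemini 3 Pro": (2.00, 12.00),
    "Qwen3.5 397B-A17B": (0.60, 3.60),
}

# Rounded weekly usage from the comment: 17M uncached input, 6M uncached output.
INPUT_MTOK = 17
OUTPUT_MTOK = 6


def uncached_cost(input_mtok, output_mtok, input_price, output_price):
    """Cost in USD of uncached input and output tokens only (no cache traffic)."""
    return input_mtok * input_price + output_mtok * output_price


costs = {
    name: uncached_cost(INPUT_MTOK, OUTPUT_MTOK, in_price, out_price)
    for name, (in_price, out_price) in PRICES.items()
}

for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f}")
```

This only covers uncached traffic; cache reads and cache writes are billed separately and are not part of these totals.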

You didn't include cache writes, which are $6.25/MTok. At ~66M tokens, that's another ~$400.
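A rough sketch of the full Opus bill with the cache columns included, using the exact token counts from the table upthread. The $6.25/MTok cache-write price is from this comment. The $0.50/MTok cached-read price is an assumption: it is just the "1/10th of sticker" figure mentioned at the top of the thread applied to the $5 input price, not a quoted rate.

```python
MTOK = 1_000_000

# Weekly token counts from the table upthread (claude-opus-4-6 row).
usage = {
    "read": 16_900_000,            # uncached input
    "write": 5_840_000,            # output
    "cached_read": 1_312_000_000,
    "cache_write": 66_120_000,
}

# USD per MTok. read/write/cache_write are the figures quoted in the thread;
# cached_read is ASSUMED to be 1/10th of the $5 input price.
price_per_mtok = {
    "read": 5.00,
    "write": 25.00,
    "cached_read": 0.50,   # assumption, see above
    "cache_write": 6.25,
}

# Cost of each column, then the total across all four.
line_items = {
    kind: tokens / MTOK * price_per_mtok[kind]
    for kind, tokens in usage.items()
}
total = sum(line_items.values())

for kind, cost in line_items.items():
    print(f"{kind:12s} ${cost:,.2f}")
print(f"{'total':12s} ${total:,.2f}")
```

Under those assumptions the cache columns dominate: the uncached input and output together are the smallest part of the bill.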


