
Arguably DRAM-based GPUs/TPUs are quite inefficient for inference compared to SRAM-based Groq/Cerebras. Autoregressive decoding is memory-bandwidth-bound (each generated token has to stream the model weights), so on-chip SRAM's much higher bandwidth pays off. GPUs are highly optimized, but they still lose to architectures that are better suited for inference.
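A back-of-the-envelope roofline makes the bandwidth argument concrete. This is a hedged sketch, not vendor benchmarks: the bandwidth and model-size numbers below are illustrative assumptions, and it ignores batching, KV-cache traffic, and compute limits.

```python
def peak_tokens_per_sec(model_bytes: float, mem_bw_bytes_per_sec: float) -> float:
    # At batch size 1, each decoded token streams all weights once,
    # so peak throughput ~ memory bandwidth / model size.
    return mem_bw_bytes_per_sec / model_bytes

model = 70e9 * 2  # hypothetical 70B-param model at 2 bytes/param (fp16)
hbm = peak_tokens_per_sec(model, 3.35e12)  # ~3.35 TB/s HBM (assumed)
sram = peak_tokens_per_sec(model, 80e12)   # ~80 TB/s aggregate on-chip SRAM (assumed)
print(f"HBM-bound:  ~{hbm:.0f} tok/s")
print(f"SRAM-bound: ~{sram:.0f} tok/s")
```

Under these assumptions the SRAM design is bandwidth-limited at roughly 20x the single-stream decode rate of the DRAM one, which is the gap the comment is pointing at.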

