
The Strix Halo announcement was pretty much exactly what was leaked.

However, one big surprise was that the Halo 395 chip runs Llama 3.1 70B-Q4 2.2x faster than an RTX 4090 24GB. Anyone have any details? The slide says to see AMD endnote SHO-14 for details.

Maybe 70B-Q4 doesn't fit in 24GB?



Last time I played with it, 70B models were much larger than 24GB unless heavily quantized.


It mentioned Q4, but after searching around a bit, it looks like 70B-Q4 needs 35GB or so. So Strix Halo is 2.2x faster than a 4090 when the 4090 is paging to system RAM.

Not so impressive 8-(.
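
For anyone who wants to check the 35GB figure, here is a rough back-of-the-envelope sketch. It assumes exactly 4 bits per weight and counts only the weights; real Q4 formats store extra scale data, and the KV cache adds more on top, so actual usage would be somewhat higher. The function name is just illustrative.

```python
def weights_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 70e9   # 70B parameters
VRAM_GB = 24      # RTX 4090 memory

# Assumption: a flat 4 bits per weight, no quantization overhead, no KV cache.
size_gb = weights_size_gb(N_PARAMS, 4.0)
fits = size_gb <= VRAM_GB

print(f"weights: {size_gb:.1f} GB, fits in {VRAM_GB} GB: {fits}")
```

Even under that optimistic assumption the weights alone are well over 24GB, so some of the model has to live outside the 4090's VRAM.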



