The Strix Halo announcement was pretty much exactly what had been leaked.
However, one big surprise was the claim that the Halo 395 chip runs Llama 3.1 70B-Q4 2.2x faster than an RTX 4090 24GB. Anyone have any details? The slide says to see AMD endnote SHO-14 for details.
The slide mentioned Q4, but after searching around a bit, it looks like 70B-Q4 needs 35GB or so. So Strix Halo is 2.2x faster than a 4090 when the 4090 has to page to system RAM.
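A rough back-of-the-envelope check of that 35GB figure, as a minimal sketch. It assumes a nominal 70 billion parameters at exactly 4 bits per weight and counts weights only; real Q4 formats usually store extra scale data, and the KV cache and runtime overhead come on top, so the actual footprint is likely somewhat larger. The function name is just made up for illustration.

```python
def quantized_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes), weights only."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB


# Assumption: nominal 70B parameters, exactly 4 bits per weight.
weights_gb = quantized_weight_gb(70, 4)
vram_gb = 24  # RTX 4090

print(f"weights: ~{weights_gb:.0f} GB")
print(f"fits in {vram_gb} GB of VRAM: {weights_gb <= vram_gb}")
```

Under those assumptions the weights alone exceed the 4090's 24GB, which fits the guess that the 4090 was spilling to system RAM in that comparison.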
Maybe 70B-Q4 doesn't fit in 24GB?