TL;DR: the premise of the calculation is flawed even if the conclusion is correct.
> Qwen 3.5 397B-A17B is a good comparison point. It's a large MoE model, broadly comparable in architecture size to what Opus 4.6 is likely to be.
I stopped reading here. Frontier models have been rumoured to be in the TRILLIONS of parameters since the days of GPT-4. Besides, with agents, I suspect they're using more specialized models under the hood for certain tasks like exploration and web search.
So while their cost won't be $5,000 or anywhere close, I still think it would be in the hundreds of dollars for heavy users. They may very well be losing money on the top 5-10% of Max users. Their real margin likely comes from business API customers.
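For a sense of scale, here's a back-of-envelope sketch of serving cost per million tokens. Every number in it is an assumption I made up for illustration (GPU rental price, replica size, throughput), not a measured figure for any real deployment:

```python
# Back-of-envelope: serving cost per million output tokens.
# All inputs below are assumptions, not measured figures.

gpu_hourly_cost = 2.50    # assumed $/hour for a rented H100-class GPU
gpus_per_replica = 8      # assumed GPUs needed to hold one model replica
tokens_per_second = 400   # assumed aggregate throughput of that replica

cost_per_hour = gpu_hourly_cost * gpus_per_replica        # $/hour for the replica
tokens_per_hour = tokens_per_second * 3600                # tokens generated per hour
cost_per_million = cost_per_hour / tokens_per_hour * 1e6  # $/1M tokens

print(f"${cost_per_million:.2f} per 1M tokens")  # → $13.89 per 1M tokens
```

At those assumed numbers, a heavy agentic user burning through ~50M tokens a month lands around $700 in raw compute: hundreds of dollars, not thousands. The real figures could easily be a few times higher or lower, but the order of magnitude is the point.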
Here's an interesting bit: OpenAI recently filed a document with the SEC that gave us a peek into its finances. Infrastructure costs stood at just ~30% of all revenue generated. That is a phenomenal improvement; I nearly fell off my chair when I first learned it.