But there's no such thing as compute cost in the abstract. What exactly is compute cost for AI? Does it include:
• Inference used for training? Modern training pipelines aren't just gradient descent, there's a ton of inference used in them too.
• Gradient descent itself?
• The CPUs and disks storing and managing the datasets?
• The web servers?
• The people paid to swap out failed components at the dc?
Let's say you try and define it to mean the same as unit economics - what does it cost you to add an additional customer vs what they bring in. There's still no way to do this calculation. It's like trying to compute the unit economics of a software company. Sure, if you ignore all the R&D costs of building the software in the first place and all the R&D costs of staying competitive with new versions, then the unit economics look amazing, but there's still plenty of loss-making software startups in the world.
Unit economics are a useful heuristic for businesses where there aren't any meaningful base costs required to stay in the game because they let you think about setup costs separately. Manufacturing toys, private education, farming... lots of businesses where your costs are totally dominated by unit economics. AI isn't like that.
Gross margins and cost of revenue are well-defined accounting terms that apply to any type of business.
> Does it include:
> Inference used for training? Modern training pipelines aren't just gradient descent, there's a ton of inference used in them too.
No, because this is training and not inference, just like how R&D costs for a drug aren't part of COGS either.
> Gradient descent itself?
No
> The CPUs and disks storing and managing the datasets?
Yes
> The web servers?
Yes
> The people paid to swap out failed components at the dc?
Yes to the extent they are swapping for inference and not training. If the same employees do both then the accountants will estimate what percent of their time is dedicated to each and adjust their cost accordingly.
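To make that allocation concrete, here's a rough sketch of the time-based split. The function name, the 25% share, and the dollar figure are all made up for illustration; real allocations follow whatever method the company's accountants settle on.

```python
def allocate_shared_cost(total_cost, inference_share):
    """Split a shared cost between inference (cost of revenue) and training (R&D).

    inference_share is the estimated fraction of time spent supporting
    inference, between 0 and 1. This is a hypothetical helper, not a
    standard accounting API.
    """
    if not 0.0 <= inference_share <= 1.0:
        raise ValueError("inference_share must be between 0 and 1")
    to_inference = total_cost * inference_share
    return {
        "cost_of_revenue": to_inference,           # inference-related portion
        "r_and_d": total_cost - to_inference,      # training-related portion
    }


# Hypothetical figures: $1M of data center staff cost, 25% of time on inference.
split = allocate_shared_cost(total_cost=1_000_000, inference_share=0.25)
print(split)
```

The only point is that the same paycheck ends up on two different lines of the income statement, in proportion to the estimated time split.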
We weren't talking about COGS, we were talking about "cost of compute", which isn't an accounting term.
For the rest, anyone can define and apply an accounting metric but that doesn't mean it tells you anything useful. If you look at the unit cost of any typical IP business it's nearly zero. Yet, many companies lose money on making movies, video games, apps and books.
I'm not familiar with accounting, but I suspect a lot of these cloud infrastructure companies don't throw out hardware for a very long time. AWS, for example, sells you their old stuff as white-label compute at a markup, and I think what's behind that is mostly old hardware. As long as Anthropic keeps finding uses for the old GPUs, and provided they don't break, I think they don't have to write off these assets, which means they don't incur costs for using them if they are clever with their books.
The marginal cost of the next token. That can include the power, the operating cost of the facility, repair costs, etc.
The API price should hopefully incorporate the capitalized cost of the hardware, the facility rent, the cost to train the model, the r&d, cost of sales, etc., to make it profitable.
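As a back-of-the-envelope sketch of the difference between those two numbers, something like the following works. Every figure and category name here is invented for illustration, and averaging the variable costs over tokens served is only an approximation of the true marginal cost.

```python
def cost_per_million_tokens(variable_costs, fixed_costs, tokens_served):
    """Return (marginal, fully_loaded) cost per million tokens.

    variable_costs: costs that scale with usage (power, facility ops, repairs).
    fixed_costs: costs incurred regardless of usage (hardware, training, R&D, sales).
    Both are dicts of hypothetical dollar amounts for the same period.
    """
    millions = tokens_served / 1_000_000
    variable_total = sum(variable_costs.values())
    fixed_total = sum(fixed_costs.values())
    marginal = variable_total / millions
    fully_loaded = (variable_total + fixed_total) / millions
    return marginal, fully_loaded


# Made-up numbers for one period in which 1 billion tokens were served.
variable = {"power": 200.0, "facility_ops": 100.0, "repairs": 100.0}
fixed = {"hardware": 800.0, "training": 600.0, "r_and_d": 150.0, "sales": 50.0}

marginal, fully_loaded = cost_per_million_tokens(variable, fixed, tokens_served=1_000_000_000)
print(f"marginal: ${marginal:.2f}/M tokens, fully loaded: ${fully_loaded:.2f}/M tokens")
```

With these invented inputs the fully loaded figure is several times the marginal one, which is the gap the API price has to cover to be profitable.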
Claude Code Max may be able to offer a good price by having a mix of higher- and lower-utilization users and ignoring the fixed costs, treating it as a driver of API sales. But it doesn't make sense to essentially pay people to use it.
Your point that there are more relevant quantities to calculate for checking economic viability is fair, but that doesn't negate the "cost of inference" being an interesting metric in itself.