This is great, but i guess they are feeling the heat from Codex resetting limits...

Analemma_ · 2026-03-14T21:16:34 1773522994

The insanely competitive market for LLMs is great for us, but if I were one of the investors in these companies it wouldn't exactly fill me with confidence that my $500 billion spent on datacenters and Nvidia cards is going to get repaid ten times over like they're claiming. I'm still getting very strong "this is a commodity; margins will be driven inexorably to zero" vibes from these products.

stavros · 2026-03-14T21:15:34 1773522934

I think they're feeling the heat from growing too quickly so they want to incentivize people to spread the load more evenly.

toomuchtodo · 2026-03-14T21:38:35 1773524315

Very much like electric utility time of day pricing, using economic incentives to shift demand to trough periods.

Perhaps an opportunity for them to improve workload scheduling orchestration, like submitting a job to a distributed computing cluster queue, to smooth demand and maximize utilization.

stavros · 2026-03-14T21:44:36 1773524676

Everything bursty will use economic incentives to smooth the load. I'm not sure how they'd do that with workload scheduling orchestration when you have latency-sensitive loads and there are e.g. twice as many requests at midday as at midnight.

toomuchtodo · 2026-03-14T22:10:39 1773526239

You decouple the workloads from human interaction (ie when you submit the job to the queue vs when it is scheduled to execute) so when they run is not a consideration, if possible. The economic incentives encourage solving this, and if it can’t be solved, it buckets customer cohort by willingness (or unwillingness) to pay for access during peak times.

stavros · 2026-03-14T22:13:30 1773526410

Sure, but if I ask the LLM a question, I'd like it to respond now, instead of tonight.

toomuchtodo · 2026-03-14T22:20:56 1773526856

Certainly, interactive workloads aren’t realistic for time shifting, but agentic coding likely is. Package everything up and ship it as a job, getting a bundle back asynchronously.

stavros · 2026-03-14T22:23:58 1773527038

I don't know, my agentic coding is pretty interactive. Maybe once the plan is done, sure. That would be interesting, though OpenAI already does this with batch workloads.