It isn't going to replace cloud LLMs since cloud LLMs will always be faster in throughput and smarter. Cloud and local LLMs will grow together, not replace each other.
I'm not convinced that local LLMs use less electricity either. Per token, at the same level of intelligence, cloud LLMs should run circles around local LLMs in efficiency. If they don't, what are we paying hundreds of billions of dollars for?
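The per-token efficiency argument can be made concrete with a back-of-envelope calculation: energy per token is just device power divided by decode throughput, and batched cloud serving amortizes power across many users. The numbers below are illustrative assumptions, not measurements.

```python
# Back-of-envelope: energy per generated token = device power / throughput.
# All figures are illustrative assumptions, not benchmarks of real hardware.

def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy cost of one generated token, in joules."""
    return power_watts / tokens_per_second

# Hypothetical local setup: one consumer GPU drawing 350 W at 30 tok/s
# serving a single user.
local = joules_per_token(350, 30)

# Hypothetical cloud setup: an 8-GPU server drawing 6 kW, batch-serving
# many users at an aggregate 2,000 tok/s.
cloud = joules_per_token(6000, 2000)

print(f"local: {local:.1f} J/token, cloud: {cloud:.1f} J/token")
# local: 11.7 J/token, cloud: 3.0 J/token
```

Under these assumed numbers the cloud comes out a few times more efficient per token, mostly because batching keeps the hardware saturated; the real gap depends heavily on batch size, model, and quantization.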
I think local LLMs will continue to grow, and there will be a "ChatGPT moment" for them when good-enough models meet good-enough hardware. We're not there yet, though.
Note, this is why I'm big on investing in chip manufacturing companies. Not only are they completely maxed out due to cloud LLMs, but soon they will be doubly maxed out replacing the chips in local computers with ones suited for AI inference. This is a massive transition and will fuel another chip manufacturing boom.
It works really well for "You're a helpful assistant / Hi / Hello there, how may I help you today?" Anything else (especially in a non-English language) and you will see the limitations yourself. Just try it.
It's a $4,000 GPU with 32GB of VRAM and needs a 1,000 watt PSU. It's not realistic for the masses.
If it has something like 80GB of VRAM, it'll cost $10k.
The actual local LLM chip is Apple Silicon starting at the M5 generation with matmul acceleration in the GPU. You can run a good model using an M5 Max 128GB system. Good prompt processing and token generation speeds. Good enough for many things. Apple accidentally stumbled upon a huge advantage in local LLMs through unified memory architecture.
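The unified-memory advantage can be sketched numerically: single-stream decode is bandwidth-bound, since every generated token has to read all the model weights once, so memory bandwidth divided by model size gives a rough speed ceiling. The bandwidth and model-size figures below are illustrative assumptions, not Apple specs.

```python
# Rough upper bound on single-stream decode speed for a dense model:
# each token reads every weight once, so speed <= bandwidth / model size.
# Figures are illustrative assumptions, not measured Apple Silicon specs.

def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Memory-bandwidth ceiling on decode speed, in tokens/second."""
    return bandwidth_gb_s / model_size_gb

# Hypothetical unified-memory machine: ~500 GB/s of bandwidth running a
# 70B-parameter model quantized to 4 bits (~40 GB of weights).
ceiling = max_tokens_per_second(500, 40)
print(f"~{ceiling:.1f} tok/s ceiling")
# ~12.5 tok/s ceiling
```

This is why a large unified memory pool matters more than raw compute for local token generation: it lets big models fit entirely in fast memory, and the bandwidth to that pool sets the speed limit.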
Still not for the masses, not cheap, and not great, though. It's going to take years to slowly bring local LLMs to ordinary consumer computers.
Crazy thing to say without other contextual information. It obviously depends on a number of factors. Do you have an apples-to-apples comparison at hand?
Looking at the downvotes, I feel good about the SDE future in 3-5 years. We will have a swamp of "vibe-experts" who won't be able to pay 100K a month to CC. Meanwhile, people who still remember how to code in Vim will (slowly) get back to pre-COVID TC levels.
What is CC and TC? I have not heard these abbreviations (except for CC to mean credit card or carbon copy, neither of which is what I think you mean here).
Yea I get that there will always be demand for local waifus. I never said local LLMs won't be a thing. I even said it will be a huge thing. Just won't replace cloud.