I would recommend getting an API account on Fireworks; it's ZDR and typically the fastest provider.
Otherwise, check the list of providers on OpenRouter to see the pricing and quantisation, then sign up directly rather than going through a router. Make sure to compare caching prices, not just the input/output API prices.
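To make the caching point concrete, here's a rough sketch of how much cached input reads change the effective cost of an agentic session. All prices below are made-up placeholders, not any provider's actual rates; look the real numbers up before signing up anywhere.

```python
def cost_per_request(input_tokens, cached_fraction, output_tokens,
                     price_in, price_cached_in, price_out):
    """Cost in dollars for one request; prices are $ per million tokens."""
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    return (uncached * price_in
            + cached * price_cached_in
            + output_tokens * price_out) / 1_000_000

# Agentic sessions resend a long prefix every turn, so most input tokens can
# be cache hits. With a 10x discount on cached reads (hypothetical numbers),
# the effective cost drops a lot:
full = cost_per_request(100_000, 0.0, 2_000, 0.60, 0.06, 2.50)
cached = cost_per_request(100_000, 0.9, 2_000, 0.60, 0.06, 2.50)
print(f"no cache: ${full:.4f}, 90% cache hits: ${cached:.4f}")
```

The point is that for coding agents the caching rate dominates the bill, which is why the headline input/output prices alone are misleading.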
GLM 5 is a frontier model; Kimi 2.5 is similar, with vision support; Minimax M2.7 is a very capable model focused on tool calling.
If you need server-side web search, you could use the Z AI API directly (again ZDR), or Friendli AI, or just install a search MCP.
For the harness, opencode is the usual choice; it has subagents and parallel tool calling. Or just use Claude Code by pointing it at the Anthropic-compatible APIs of various providers like Fireworks.
If you still want to use APIs, I like OpenRouter because I can use the same credits across various models, so I'm not stuck with a single family of models. (You can even use the proprietary models on OpenRouter, but they're eye-wateringly expensive.)
Otherwise you should look into running e.g. Qwen3.5-35B-A3B or Qwen3.5-27B on your own computer. They're not Opus-level, but from what I've heard they're capable for smaller tasks. llama.cpp works well for inference; it runs on both CPU and GPU, and can even split a model across both if you want.
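As a sketch of what local inference looks like in practice: llama.cpp ships a `llama-server` binary that exposes an OpenAI-compatible endpoint, which you can hit with nothing but the standard library. The model filename and the default port 8080 below are assumptions you'd adapt.

```python
import json
from urllib import request

# Sketch: query a local llama.cpp server through its OpenAI-compatible
# chat endpoint. Start the server first with something like
#   llama-server -m Qwen3.5-27B.gguf -ngl 99
# (-ngl controls how many layers are offloaded to the GPU; lower it to
# split the work with the CPU). Filename and port are assumptions.

def build_chat_request(prompt, url="http://localhost:8080/v1/chat/completions"):
    """Build a chat-completion HTTP request for a local llama-server."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# With the server running, you'd send it like this:
# with request.urlopen(build_chat_request("Summarize this diff")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the endpoint speaks the OpenAI wire format, most harnesses can be pointed at it by just overriding the base URL.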
Not counting training models as part of your gross margin is just creative accounting. It's an inherent part of being able to provide the service for OpenAI, Anthropic etc.
Even so, their subscriptions are significantly cheaper than the equivalent token pricing via the API. So at some point they will need to get rid of subscriptions or increase subscription prices dramatically... And that's assuming their current token pricing is actually profitable, which it probably isn't.
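A back-of-the-envelope sketch of that gap. Every number here is a hypothetical placeholder, not any vendor's real price or usage data; the point is the shape of the math, not the figures.

```python
def monthly_api_cost(sessions, input_tok, output_tok, price_in, price_out):
    """API cost in dollars per month; prices are $ per million tokens."""
    return sessions * (input_tok * price_in + output_tok * price_out) / 1_000_000

subscription = 100.0   # flat monthly fee (hypothetical)
api = monthly_api_cost(sessions=300, input_tok=150_000, output_tok=5_000,
                       price_in=3.0, price_out=15.0)
# A heavy user burns far more in tokens than the flat fee brings in, so
# either the token prices carry a huge margin or the subscription is
# being sold below cost.
print(f"subscription: ${subscription:.0f}, same usage via API: ${api:.0f}")
```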
Lastly, I would not trust one word that comes out of an executive of an AI company (or any other large company, for that matter).
I tried figuring out the reference with Gemini, and it said this:
The immediate reply to that comment is: "On the internet, no one knows you're an editor." This is a direct play on the famous 1993 New Yorker cartoon: "On the Internet, nobody knows you're a dog." By setting the anecdote in 1987 (a few years before the World Wide Web was publicly available), the commenter is implying that back in the analog days, if a dog wanted to be a writer or an editor, they couldn't hide behind a screen—they had to sit in a smoky London pub and do business face-to-face.
Which makes a lot of sense, actually. I would imagine that's how the person replying to you interpreted your comment.
No I don't own them.
Some people here: https://www.mydealz.de/share-deal-from-app/2753917 mention as a downside that you have to connect to the manufacturer's cloud to get the live production data from the inverter.
Otherwise it is truly "plug everything together and it works".
Whatever I used Sonnet 4.6 for, including Claude Code and Claude Chat, it made so many mistakes and such awkward assumptions that I can't fathom what it's supposed to be good at. The mistakes were blatant. Plan mode, several passes, a couple grand in API costs... just disappointing at every task in every session over the past few weeks. Opus 4.6 has been better: still quite a few unexpected, silly mistakes and a few subtle but critical ones, but it produced workable increments and code reviews. Still vastly subpar to GPT-5.x in chat mode (with and without identical customization).
They don't change the prices; they just modify the amount of compute allocated: slower speeds, fewer tokens. They can tune everything in the background to optimize costs and returns, and the user never realizes anything has changed.
Sometimes they'll announce the changes, and they'll even try to spin it as improving services or increasing value.
Local AI capabilities are improving at a rapid pace. At some point soon we'll have an RWKV or a 4B LLM that performs at GPT-5 level, with reasoning and all the bells and whistles, and hopefully that'll shake out most of the deceptive and shady tactics the big platforms are using.
> They don't change the prices; they just modify the amount of compute allocated: slower speeds, fewer tokens. They can tune everything in the background to optimize costs and returns, and the user never realizes anything has changed.
I can't imagine that this is the way it will go... Tokens haven't been getting cheaper for flagship models, have they? You already see something closer to their real cost if you compare e.g. the Claude subscriptions to their actual token pricing.
> Local AI capabilities are improving at a rapid pace. At some point soon we'll have an RWKV or a 4B LLM that performs at GPT-5 level, with reasoning and all the bells and whistles, and hopefully that'll shake out most of the deceptive and shady tactics the big platforms are using.
Maybe, but LLMs are a scale game, and data centers will always be more capable than your local device. So you will always be getting a worse version locally. Or do you think LLMs in data centers will stop getting better and local LLMs will somehow catch up?
Well, I think in the context of the parent comment, separating out housing would risk overstating its effect on purchasing power, because increases in housing costs are already captured by inflation (since we're talking about real median income, which is already inflation-adjusted).
I agree that housing affordability is a major problem, and that looking at it independently could help you quantify whether housing specifically has become more unaffordable. But that's a different question than whether the median person's overall purchasing power has declined (considering housing, healthcare, food, etc.).
Yes, housing prices are captured by inflation, but since they are so different from region to region and city to city, just looking at a country-wide median income does not say much.
Housing prices and healthcare are captured by the "real" part of "real median income". Inequality is not, but as long as everybody is getting richer, I'm less concerned about inequality.
Inequality is still important even if everybody is getting richer (I disagree that everybody is getting richer, but that's a different discussion), because wealth and power concentrated in the hands of a few means that a few have outsized influence on politics and society, which very much erodes the foundations of how a democracy is supposed to work.
Furthermore, there are many studies showing that more wealth inequality results in more consumption in a society, which is not good for many reasons.