More

rudedogg · 2026-03-27T05:09:10 1774588150

Their rule of only releasing major software updates once a year in June is holding them back IMO. Their local LLM apis were dated before macOS/iOS 26 was even released. Just because something worked 20 years ago doesn’t mean it works today, but I’m sure it’s hard to argue against a historically successful strategy internally.

aurareturn · 2026-03-27T07:31:32 1774596692

Huh? What local LLM apis? It uses Metal.

rudedogg · 2026-03-27T19:23:28 1774639408

The application development APIs, ie: https://developer.apple.com/documentation/technologyoverview...

aurareturn · 2026-03-29T09:11:17 1774775477

Any serious LLM work isn't going to use that. They'll use Metal GPU. No one is going to inference using the NPU on a Mac.

OP said "work stations" which is implying Macbook Pros and Studios.

rudedogg · 2026-04-02T22:24:16 1775168656

> Any serious LLM work isn't going to use that.

That’s my point.

One would expect the platform owner (especially one where they own both the hardware AND software) to provide a reasonable / easy path to using LLMs if they are going to provide a framework for doing so. But Apple can’t because of how slow they ship updates

rudedogg · 2026-03-27T02:00:08 1774576808

> You don't have to use all of them!

You sure pay for the language complexity in high compile times though. Swift is slow, like really slow. I’ve been with it since like v1.2, and its been getting progressively worse for a while IMO. Complex language features (Lets do a borrow checker! Lets do embedded!) and half of the shit isn’t being used internally as far as I can tell

rudedogg · 2026-03-19T21:46:43 1773956803

This is more like if Google took action against Thunderbird and open-source email clients

wavemode · 2026-03-19T22:00:14 1773957614

No, because in those cases you're still a user of gmail. When you tell people your email address, or send people email, and it contains "@gmail.com", you're still implicitly advertising for Google. From Google's perspective that's still worth the few KB per day of bandwidth and 1GB storage (which the vast majority of people never use the entirety of, anyway) they're giving away.

But when you use gmail accounts as file storage, you're both a higher-cost user and also doing nothing to further Google's ecosystem (since the email address itself is probably not being used for genuine messaging at all).

esperent · 2026-03-19T23:33:41 1773963221

And here, you're still using Claude Opus, and when people ask you what you used, you'd say OpenCode client with Claude (Thunderbird client with Gmail).

As analogies go it's pretty close.

kaycey2022 · 2026-03-20T04:12:23 1773979943

there is nothing about claude code that prevents you from using it for non coding use cases. nothing that happens in open code or any harness for that matter is hidden from anthropic. neither does open code allow access in some nefarious use case that claude code does not.

the difference is not like the difference between gmail and gmailfs like you seem to be misunderstanding. a more accurate comparison would be the difference between curl, or httpie vs postman.

Aurornis · 2026-03-19T22:09:49 1773958189

It's not analogous at all because Google intentionally provided interfaces for those clients and even instructions for using them.

An analogous situation would be if someone reverse engineered the Google Maps API and provided their own app that showed maps using the Google Maps data.

rudedogg · 2026-03-19T22:20:03 1773958803

And if Google Maps charged per tile viewed, so the user pays the same amount regardless of which maps client they used, would your opinion hold?

I get that it’s a ToS violation, but I’m saying it shouldn’t be. They’re trying to make the harness the moat because they all have no moat.

Aurornis · 2026-03-19T22:26:05 1773959165

> And if Google Maps charged per tile viewed, so the user pays the same amount regardless of which maps client they used, would your opinion hold?

Yes. Why wouldn't it hold?

Anthropic has a pay-per-token API. You can use OpenCode with it.

Maybe my consistency comes from having worked with contracts and agreements in the real world, where the end user doesn't get to pick and choose which terms they want to abide by.

When you sign up to use a service, you're not signing up to use it however you would like, on your own terms. You're paying for a service that they offer. They are not obligated to continue offering it to you if you try to use it a different way.

Maxatar · 2026-03-19T22:36:18 1773959778

Anthropic has no issue with the use of OpenCode using Anthropic's API which does charge per token.

dpe82 · 2026-03-19T22:05:14 1773957914

Google explicitly allows third party email clients to work with Gmail, so no that hypothetical does not apply to this situation at all.

rudedogg · 2026-03-19T22:15:41 1773958541

My point is that model providers are just a compute service, and should have no say in what sends the data, or displays the data. Especially when they only bill based on the quantity of data.

They are basically a utility.

Aurornis · 2026-03-19T22:27:28 1773959248

They have an API for exactly that. You can use it.

They offer a separate plan with discounts for use with their tools. You can also choose to take advantage of those discounts with the monthly fee, within the domain where that applies. You cannot, however, expect to demand that discount to apply to anything you want.

You can argue about what you want it to be all day long, but when you go to the subscription page and choose what to purchase it's very clear what you're getting.

> They are basically a utility

Utilities like my electric company also have different plans for different uses. I cannot, for example, sign up for a residential plan and then try to connect it to my commercial business, even though I'm consuming power from them either way.

Utilities do not work like that. They do have contractual agreements about how you can use the resources provided.

rudedogg · 2026-03-13T00:09:23 1773360563

> And on top of it, if you develop for native macOS, There’s no official tooling for visual verification. It’s like 95% of development is web and LLM providers care only about that.

I think this is built in to the latest Xcode IIRC

rudedogg · 2026-03-12T23:22:57 1773357777

I think the real divide is over quality and standards.

We all have different thresholds for what is acceptable, and our roles as engineers typically reflect that preference. I can grind on a single piece of code for hours, iterating over and over until I like the way it works, the parameter names, etc.

Other people do not see the value in that whatsoever, and something that works is good enough. We both are valuable in different ways.

Also, theres the pace of advancement of the models. Many people formed their opinions last year, and the landscape has changed a lot. There’s also some effort requires in honing your skill using them. The “default” output is average quality, but with some coaxing higher quality output is easily attained.

I’m happy people are skeptical though, there are a lot of things that do require deep thought, connecting ideas in new ways, etc., and LLMs aren’t good at that in my experience.

allenu · 2026-03-13T00:07:30 1773360450

> I think the real divide is over quality and standards.

I think there are multiple dimensions that people fall on regarding the issue and it's leading to a divide based on where everyone falls on those dimensions.

Quality and standards are probably in there but I think risk-tolerance/aversion could be behind some how you look at quality and standards. If you're high on risk-taking, you might be more likely to forego verifying all LLM-generated code, whereas if you're very risk-averse, you're going to want to go over every line of code to make sure it works just right for fear of anything blowing up.

Desire for control is probably related, too. If you desire more control in how something is achieved, you probably aren't going to like a machine doing a lot of the thinking for you.

bandrami · 2026-03-13T07:15:01 1773386101

This. My aversion to LLMs is much more that I have low risk tolerance and the tails of the distribution are not well-known at this point. I'm more than happy to let others step on the land mines for me and see if there's better understanding in a year or two.

XenophileJKO · 2026-03-13T08:19:53 1773389993

I think there is more to it than that.

I am a high quality/craftsmanship person. I like coding and puzzling. I am highly skilled in functional leaning object oriented deconstruction and systems design. I'm also pretty risk averse.

I also have always believed that you should always be "sharpening your axe". For things like Java delelopment or things where I couldn't use a concise syntax would make extensive use of dynamic templating in my IDE. Want a builder pattern, bam, auto-generated.

Now when LLMs came out they really took this to another level. I'm still working on the problems.. even when I'm not writing the lines of code. I'm decomposing the problems.. I'm looking at (or now debating with the AI) what is the best algorithm for something.

It is incredibly powerful.. and I still care about the structure.. I still care about the "flow" of the code.. how the seams line up. I still care about how extensible and flexible it is for extension (based on where I think the business or problem is going).

At the same time.. I definately can tell you, I don't like migrating projects from Tensorflow v.X to Tenserflow v.Y.

skydhash · 2026-03-13T12:01:01 1773403261

> I'm looking at (or now debating with the AI) what is the best algorithm for something.

That line always makes me laugh. There’s only 2 points of an algorithm, domain correctness and technical performance. For the first, you need to step out of the code. And for the second you need proofs. Not sure what is there to debate about.

signatoremo · 2026-03-13T14:50:04 1773413404

Not true. There is also cost, money or opportunity. Correctness or performance isn't binary -- 4 or 5 nines, 6 or 7 decimal precision, just to name a few. That drives a lot discussion.

There may be other considerations as well -- licensing terms, resources, etc.

aleph_minus_one · 2026-03-13T13:10:04 1773407404

I think it's a little bit more complicated.

I, for example, would claim to be rather risk-tolerant, but I (typically) don't like AI-generated code.

The solution to the paradox this creates if one considers the model of your post is simple:

- I deeply love highly elegant code, which the AI models do not generate.

- I cannot stand people (and AIs) bullshitting me; this makes me furious. I thus have an insanely low tolerance for conmen (and conwomen and conAIs).

enraged_camel · 2026-03-13T00:07:25 1773360445

I think this is a false dichotomy because which approach is acceptable depends heavily on context, and good engineers recognize this and are capable of adapting.

Sometimes you need something to be extremely robust and fool-proof, and iterating for hours/days/weeks and even months might make sense. Things that are related to security or money are good examples.

Other times, it's much more preferable to put something in front of users that works so that they start getting value from it quickly and provide feedback that can inform the iterative improvements.

And sometimes you don't need to iterate at all. Good enough is good enough. Ship it and forget about it.

I don't buy that AI users favor any particular approach. You can use AI to ship fast, or you can use it to test, critique, refactor and optimize your code to hell and back until it meets the required quality and standards.

kaffekaka · 2026-03-13T06:37:58 1773383878

Yes, it is a false dichotomy but describes a useful spectrum. People fall on different parts of the spectrum and it varies between situations and over time as well. It can remind one that it is normal to feel different from other people and different from what one felt yesterday.

bigstrat2003 · 2026-03-13T07:02:37 1773385357

> Also, theres the pace of advancement of the models. Many people formed their opinions last year, and the landscape has changed a lot.

People have been saying this every year for the last 3 years. It hasn't been true before, and it isn't true now. The models haven't actually gotten smarter, they still don't actually understand a thing, and they still routinely make basic syntax and logic errors. Yes, even (insert your model of choice here).

The truth is that there just isn't any juice to squeeze in this tech. There are a lot of people eagerly trying to get on board the hype train, but the tech doesn't work and there's no sign in sight that it ever will.

domlebo70 · 2026-03-13T07:27:32 1773386852

Maybe I'm solving different problems to you, but I don't think I've seen a single "idiot moment" from Claude Code this entire week. I've had to massage things to get them more aligned with how I want things, but I don't recall any basic syntax or logic errors.

coffeebeqn · 2026-03-13T12:55:54 1773406554

With the better harness in Claude code and the >4.5 model and a somewhat thought out workflow we’ve definitely arrived at a point where I find it very helpful. The less you can rely on one-shot and more give meaningful context and a well defined testable goal the better it is. It honestly does make me worry how much better can it get and will some percentage of devs become obsolete. It requires less hand holding than many people I’ve worked with and the results come out 100x faster

smackeyacky · 2026-03-13T09:20:27 1773393627

I saw a few (Claude Sonnet 4.6), easily fixed. The biggest difference I’ve noticed is that when you say it has screwed up it much less likely to go down a hallucination path and can be dragged back.

Having said that, I’ve changed the way I work too: more focused chunks of work with tight descriptions and sample data and it’s like having a 2nd brain.

domlebo70 · 2026-03-13T12:17:21 1773404241

Very good way to describe it. I am enjoying Opus a lot.

gilbetron · 2026-03-13T16:19:29 1773418769

I swear some people are using some other tech than I'm using the past few months. Where I work, Claude Code is developing major changes to our very large code base (many repos, millions upon millions of lines of really important code) and pushing to prod regularly. Even the most bearish of engineers are now using it to ship important code daily. It still has issues and you have to know how to use it, but it is a shocking productivity increase (although Amdahl's Law applies for software engineering, too. Coding is only a relatively small percentage of what is done)

Timwi · 2026-03-14T00:05:46 1773446746

> I swear some people are using some other tech than I'm using the past few months.

I'm curious about this discrepancy too. I assume that you're being facetious and the discrepancy is with people's perceptions of AI’s capabilities or usefulness or whatever subjective metric. Some, myself included, seem to perceive it as basically useless, while others, yourself included, seem to imply that it's at a level where it genuinely replaces competent coders.

If the discrepancy were small, it could just be chalked up to the metric being subjective. But it seems to be like night and day. A difference of orders of magnitude. I wanna know what's going on there.

gzread · 2026-03-14T13:11:46 1773493906

Did you lay off some engineers or keep them but make the software better?

cableshaft · 2026-03-13T07:32:01 1773387121

All I know is it feels very different using it now then it did a year ago. I was struggling to get it to do anything too useful a year ago, just asking it to do a small function here or there, often not being totally satisfied with the results.

Now I can ask an agent to code a full feature and it has been handling it more often than not, often getting almost all of the way there with just a few paragraphs of description.

swader999 · 2026-03-13T10:04:08 1773396248

And yet I just eliminated three months (easily) of tech debt on our billing system in the past two weeks.

rudedogg · 2026-03-08T16:36:58 1772987818

Thanks for mentioning, but I did update it to High Effort when I got the notification

rudedogg · 2026-03-01T16:36:34 1772382994

I've been thinking this too. I frequently do deep research on some systems programming technique, ask it to generate a .md for it, and then I use that in later sessions with Claude Code "look at the research I collected in {*-research}.md and help me explore ways to apply it to {thing}".

At the research step it frequently (always?) uses memory to direct/scope the research to what I typically work on, but I think that kind of pigeon holes the model and what it explores. And the memory doesn't quite capture all the areas I'm interested in, or want to directly apply the research to.

And regarding the crap in memories, I found the same. Mine at work mentioned I'm an expert at a business domain I have almost zero experience with.

I feel like the companies building this stuff accept a lot of "slop" in their approach, and just can't see past building things by slopping stuff into prompts. I wish they'd explore more rigid approaches. Yes, I understand "the bitter lesson" but it seems obvious to me some traditional approaches would yield better results for the foreseeable future. Less magic (which is just running things through the cheapest model they have and dumping it in every chat). It seems like poison.

Related: https://vercel.com/blog/agents-md-outperforms-skills-in-our-...

Also, agent skills are usually pure slop. If you look through https://skills.sh on a framework/topic you're knowledgeable in you'll be a bit disheartened. This stuff was pioneered by people who move fast, but I think it's now time to try and push for quality and care in the approach since these have gotten good enough to contribute to more than prototype work.

rudedogg · 2026-02-28T22:12:24 1772316744

It's not much but I was planning to cancel my Anthropic subscription to try Codex over the weekend, but I'll skip that. I don't want to support a company with someone like this at the top. Massive donations to the administration, sneaky backdoor deals. No thanks, fuck you.

rudedogg · 2026-02-27T06:16:21 1772172981

> And if China gets AI, they're more than likely to use it to further raise people out of poverty and automate away more menial jobs without making those displaced workers homeless.

Your comment is very optimistic. But the quoted part reminded me of something I heard (again) about China using slave labor in their lithium mines:

https://www.state.gov/forced-labor-in-chinas-xinjiang-region...

rudedogg · 2026-02-03T02:27:13 1770085633

I made a Zig agent skill yesterday if interested: https://github.com/rudedogg/zig-skills/

Claude getting the ArrayList API wrong every time was a major reason why

It’s AI generated but should help. I need to test and review it more (noticed it mentions async which isn’t in 0.15.x :| )

simonmic · 2026-02-03T19:51:23 1770148283

The linked blog post about making this is an excellent read.

rudedogg · 2026-02-03T21:55:58 1770155758

Thanks! I think I spent as much time writing the post as I did making the skill, so I’m happy someone got some value out of it.

ale · 2026-02-03T14:14:25 1770128065

Fighting fire with fire

rudedogg · 2026-02-03T19:48:25 1770148105

A little bit! I wrote a long blog post about how I made it, I think the strategy of having an LLM look at individual std modules one by one make it actually pretty accurate. Not perfect, but better than I expected