I mean, even if the technology stopped improving right now and forever (which is unlikely), LLMs are already better than most humans at most tasks.
Including code quality. Not because they are exceptionally good (you are right that they aren’t superhuman like AlphaGo) but because most humans aren’t that good at it anyway, and they also « hallucinate » in their own way out of tiredness.
Even today’s models are far from being exploited to their full potential, because we have built pretty much no tooling around them beyond code generation.
I’m also a long-time « doubter », but as a curious person I used the tools anyway, with all their flaws, over the last 3 years. And I’m forced to admit that hallucinations are pretty rare nowadays. Errors still happen, but they are very rare and it’s easier than ever to get things back on track.
I think I’m also a « believer » now and, believe me, I really don’t want to be, because as much as I’m excited by this, I’m also pretty frightened of all the bad things that this tech could do to the world in the wrong hands, and I don’t feel like it’s particularly in the right hands.
Well, I think it's nothing more than a social norm, and an easy one to avoid at that. People mostly ask what your job is because that's a standard icebreaker.
Since I (mostly) recovered from burnout and learnt that I'm actually not my job, I've made a habit of never automatically asking people what their job is, at least not as an icebreaker.
You can talk about their hobbies, their kids, their tastes... because those are the real topics that will determine whether you bond or not anyway. And yes, some people do have an interesting job that is worth talking about, but when that's the case, you will inevitably end up talking about it anyway.
I have the same issue with AI-generated music: it can be quite good, to say the least.
But I deeply feel that art only matters if there is an artist. The artist wants to convey something.
What makes you uneasy (if you are like me) is that a machine deliberately created emotions in your brain. And positive emotions, at that. It’s really something I can’t stand.
A different way of framing this point is to look at some of the modern art that's highly celebrated: without the human component of what it represents, the art itself isn't that good.
So, the guy who suspends buckets of paint with a hole in the bottom to make patterns has an idea of what he's creating. The guy who just put a few strips of electrical tape in different colours had an idea of what he was trying to convey. The guy who flings paint against a wall also has an idea of what he's creating. Same for the guy who made all the white paintings. All that art is trivial to copy in the same style, maybe even as an exact copy for the electrical tape, but it's the artist's intention that makes it worth more than a toddler's painting.
Personally, I think most of that abstract art is pointless, because I don't really see how the artist's vision is represented by whatever mess they've created, but I definitely understand that at least they had an idea that they wanted to convey. A machine creating the same thing has no meaning behind it; it's just a waste of paint and canvas.
OpenAI has some magic they do on their standalone endpoint (/responses/compact) just for compaction, where they keep all the user messages and replace the agent messages or reasoning with embeddings.
> This list includes a special type=compaction item with an opaque encrypted_content item that preserves the model’s latent understanding of the original conversation.
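Going only by that quoted description, here's a rough sketch of what such an item might look like in the returned item list (apart from the two fields the quote names, the shape below is my guess, not documented API):

    # Hypothetical shape of a compaction item, inferred from the quote above.
    # Only "type" and "encrypted_content" come from the quoted description;
    # the value shown is a placeholder, not a real payload.
    compaction_item = {
        "type": "compaction",
        "encrypted_content": "gAAAAB...",  # opaque blob standing in for the model's latent summary
    }
    # User messages stay in the item list verbatim; it's the prior assistant and
    # reasoning items that get collapsed into this single entry.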
Not sure if it's common knowledge, but I learned not that long ago that you can do "/compact your instructions here"; if you explicitly say what you are working on or what to keep, it's much less painful.
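For example, something like this (the wording is just illustrative):

    /compact keep the notes about the failing auth tests and the refactor plan; drop everything about the earlier CSS cleanup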
In general, LLMs are for some reason really bad at designing prompts for themselves. I tested this heavily on data with a clear optimization function and a way to evaluate the results, and my chaotic, typo-ridden prompts easily beat Opus every time versus the methodical ones it writes for itself or for other LLMs.
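If you want to run that kind of head-to-head yourself, the harness doesn't need to be fancy. A minimal sketch, where run_model and score are stand-ins for whatever model call and metric you actually have, not any particular API:

    # Minimal sketch for comparing two candidate prompts against a fixed metric.
    from statistics import mean

    def run_model(prompt: str, example: str) -> str:
        # Placeholder: swap in the real model/tool call you are testing.
        return f"dummy output for {example!r}"

    def score(example: str, output: str) -> float:
        # Placeholder metric: swap in your actual "clear optimization function".
        return float(len(output) > 0)

    def evaluate(prompt: str, examples: list[str]) -> float:
        return mean(score(ex, run_model(prompt, ex)) for ex in examples)

    examples = ["case 1", "case 2", "case 3"]  # your eval set
    hand_written = "my chaotic, typo-ridden prompt"
    model_written = "the methodical prompt the LLM wrote for itself"

    # Same data, same metric; whichever prompt scores higher wins.
    print(evaluate(hand_written, examples), evaluate(model_written, examples))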
You can also put guidance into CLAUDE.md for when to compact and with what instructions. The model itself can run /compact, and while I try to remember to use it manually, I find it useful to have something like: “If I ask for a totally different task and the current context won’t be useful, run /compact with a short summary of the new focus.”
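Concretely, that guidance can live in CLAUDE.md as something like this (the wording is only an example):

    ## Context management
    - If I ask for a totally different task and the current context won't be
      useful, run /compact with a short summary of the new focus.
    - When compacting, keep the current task list and any unresolved errors.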
so you have to garbage collect manually for the AI?
also, i don't want to make a full parent post
1M tokens sounds real expensive if you're constantly at that threshold. There are codebases larger than that in LOC; I read somewhere that Carmack has "given to humanity" over 1 million lines of his code. Perhaps something to dwell on.
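Back-of-the-envelope, assuming something like $3 per million input tokens (a ballpark for current frontier models, not a quoted price) and ignoring prompt caching, which changes the math a lot:

    # Hypothetical cost estimate for always carrying a ~1M-token context.
    price_per_m_input = 3.00      # assumed USD per million input tokens
    context_tokens = 1_000_000    # always near the 1M threshold
    turns = 50                    # a day of back-and-forth, also an assumption
    cost = context_tokens / 1_000_000 * price_per_m_input * turns
    print(f"~${cost:.0f} in input tokens alone")  # ~$150, before output tokens or caching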
The same Asahi developers also wrote about how Apple didn’t document anything and, especially, never talked about this in public. Apple being Apple, if they had cared for a single second about this, they would have called it Bootcamp 2.
Honestly I’m pretty convinced that this « open » bootloader was just there to avoid criticism and bad press from specialized outlets when they presented the M1 because, for once, they needed specialized outlets to benchmark the M1’s performance and not have anything bad to say about anything else.
They constantly break everything year after year without documenting any change, which effectively makes Asahi unusable on anything recent.
I’m betting that they are just patiently waiting for Asahi to die by falling several years behind (which is already the case), so they can announce « The most secure Mac ever », quietly shipping a closed bootloader once nobody, and especially not the press, cares anymore.
Don’t get me wrong, I love Asahi and I even have it installed on my M2 Air; the project is doing incredibly high-quality work. But I don’t believe it will last long. Hope I’m wrong, though.
For them to call it Bootcamp 2 (a "product" per se), they'd have had to have another OS they could actually demo installing. Otherwise "Bootcamp 2" is just a mysterious empty chooser window.
But at the time there was nothing, because Apple Silicon wasn't a platform anyone but them was targeting, because they had just created it.
So they built the infrastructure, and then waited for someone to actually start taking advantage of it, before bothering to acknowledge it.
And because that "someone" isn't a bigcorp (i.e. Microsoft) wanting to do a co-marketing push, but just FOSS people gradually building something but never quite "launching" a 1.0 of it — Apple just "acknowledged" it quietly, at developer conferences, exposing it only via developer-centric CLI tooling, rather than with the sort of polished UI experience they would need if Microsoft was trying to convince Joe Excel User to dual-boot Windows on their Apple Silicon MBP.
> announce « The most secure Mac ever » silently releasing with closed bootloader
That's extremely unlikely to happen, as Apple's hardware and OS developers build Macs and macOS (and all the other hardware + OSes) using Macs and macOS. And those engineers (and engineers working at Apple's hardware and accessory manufacturing partners) will always need to be able to diddle around with the kernel and extensions "in anger" without needing to go through a three-day-turnaround code-signing process.
There's a whole proprietary, distributed kernel development and QC flow for macOS, that looks a lot like the Linux one (i.e. with all the same bigcorps involved making sure their stuff works), but all happening behind closed doors. But all the same stuff still needs to happen regardless, to ensure that buggy drivers don't ship. Thus macOS kernel development mode being just one reboot-and-toggle away.
> And because that "someone" isn't a bigcorp (i.e. Microsoft) wanting to do a co-marketing push, but just FOSS people gradually building something but never quite "launching" a 1.0 of it — Apple just "acknowledged" it quietly, at developer conferences, exposing it only via developer-centric CLI tooling, rather than with the sort of polished UI experience they would need if Microsoft was trying to convince Joe Excel User to dual-boot Windows on their Apple Silicon MBP.
It's also important to remember that Microsoft was in the middle of their Qualcomm exclusivity deal at the time of the M1's release, and thus Windows for ARM wasn't available on anything other than a few select devices or unofficial use of Insider builds.
That deal didn't actually expire until 2024[1], at which point Windows for ARM finally started to be sold in an official capacity with stable builds widely available.
It's entirely possible, though unconfirmed, that Apple was intentionally leaving the door open for "Boot Camp 2", and Microsoft simply never took them up on the offer, either because they were stuck in a deal made prior to the M1's release that prevented it, or because they no longer saw a financial benefit to being able to sell Windows to Mac users (possibly since Windows license sales are effectively a rounding error to Microsoft at this point; they make way more off of subscription services and/or Office, all of which are already available on macOS without having to dual-boot Windows).
> possibly since Windows license sales are effectively a rounding error to Microsoft at this point; they make way more off of subscription services and/or Office, all of which are already available on macOS without having to dual-boot Windows
AFAICT, the way Microsoft wants things to work is that "Windows" is the native fat-client platform / SDK that ISVs are supposed to use/target when building fat-client apps that interact with (i.e. generate spend on) Azure-based backend systems. The #1 way Microsoft makes money at this point isn't from direct consumer or even volume-licensed subscriptions; it's from providing paid backend infra to dev shops who had long since locked themselves into the Microsoft/Windows development ecosystem, and who therefore saw Azure as the only valid cloud backend to integrate with when "cloud-enabling" their software (and/or for whom the compliance story of integrating their previously native-and-local-syncing software with Azure was 100x simpler than integrating with any other cloud, due to Azure+Windows being able to act as a trusted principal-agent pair that can enforce policy-based security via a shared "cloud domain" identity [Entra ID] baked right into the OS ACL layer).
Until recently, though, Microsoft thought of the Windows "platform" the same way Apple do of the Mac "platform": that "Windows"-the-platform-SDK was the same thing as Windows-the-OS. Which necessarily meant that consumers must be pushed with all conceivable effort toward using Windows-the-OS on their machines, so that these dev shops who had targeted Windows-the-SDK could reach them with their software (so that those dev shops would in turn spend more on Azure.)
But I think this equivalence is going away!
From what I've seen of discussions in various Microsoft-aligned sources recently, it feels to me like some part of what Windows 12 may mean by calling itself a "modular OS", is that Microsoft may be establishing some kind of very clean boundary layer between Windows-the-OS and Windows-the-platform/SDK.
---
What would that look like? I don't know for sure, but here's some spitballing:
Picture Mono, but as a complete UWP projection, shipping with all the native libraries that are built into Windows.
Or, if you'd prefer, picture Wine/Proton; but rather than black-box-reverse-engineered equivalents to Windows DLLs, it is all the DLLs that come with Windows. Except, now rebuilt from the ground up so that they compile against NTOS or Mach or Linux syscalls.
Basically, "the complete Windows platform" as a JVM-like runtime you "get for free" when installing Windows-the-OS, but can install on top of macOS and Linux. (Probably in various runtime profiles, as with Embedded vs Server vs Desktop JREs. You don't need D3D on your server.)
This would be likely to take 100% of the wind out of the sails of the Wine/Proton projects overnight. And maybe kill Mono itself, too. After all, why bother with half-assed third-party implementations of the Windows Platform, when you can just install the "real" Windows Platform, and get guaranteed bug-for-bug compatibility with existing Windows software (relying on the same databases of app shims and fixes Windows-the-OS has shipped with for ~forever)?
SteamOS would be reduced to "a Linux distro that preinstalls the Windows Platform." ReactOS might or might not (depends on stubbornness) be reduced to "a clean-room-implemented NTOSKRNL-compatible base OS, that preinstalls the Windows target of the Windows Platform."
Wine/Proton themselves would, if they even bothered to keep going, end up rebranded as "an alternative ground-up Windows Platform runtime." (If the official "Windows Platform runtime" was then open-sourced, then likely Wine/Proton would fully fade into obscurity, as anyone who wanted to maintain their own libre Windows Platform runtime would just start by forking Microsoft's. Very similar to the situation with OpenJDK.)
---
In any case, regardless of how they do it, any move in this direction would make it blindingly obvious why Microsoft wouldn't care about enabling something like a "Boot Camp 2" feature on Macs any more: they no longer care if you install Windows-the-OS on a Mac; they rather want you to install the Windows Platform runtime under macOS. And then they'll have you as a consumer of Windows Platform products all the same.
(Actually, even better, as they'll have you far more of the time. In the Boot Camp business strategy, any time you spend booted into macOS puts you out of Microsoft's reach, save for the few first-party apps Microsoft has ported to macOS + sells on the App Store. In the Windows Platform business strategy, meanwhile, you can be running arbitrary Windows Platform apps on your Mac [and so generating Azure spend for some ISV somewhere] 100% of the time you're using it!)
Crystal ball, maybe, but 3 years ago AI generated classes with empty methods containing "// implement logic here", and now AI is generating full-stack applications that run on the first try.
Past performance does not guarantee future results, of course. But acting like AI is now magically going to stagnate is also a really bold bet.
> now AI is generating full-stack applications that run on the first try
I sincerely doubt that, because it still can't even generate a few-hundred-line script that runs on the first try. I would know; I just tried yesterday. The first attempt used hallucinated APIs, and while I did get it to work eventually, I don't think it can one-shot a complex application if it can't one-shot a simple script.
IMO, AI has already stagnated and isn't significantly better than it was 3 years ago. I don't see how it's supposed to get better still when the improvement has already stopped.
I routinely generate applications for my personal use using OpenCode + Claude Sonnet/Opus.
Yesterday I generated an app for my son to learn multiplication tables, using a spaced repetition algorithm and score keeping. It took me like 5 minutes.
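Not the generated app itself, obviously, but to give a sense of the kind of scheduling logic such an app needs, here's a minimal Leitner-style sketch (everything in it is illustrative):

    # Minimal Leitner-box scheduler for multiplication facts (illustrative only).
    import random
    from collections import defaultdict

    boxes = defaultdict(lambda: 1)  # box 1 = review often, box 5 = rarely
    facts = [(a, b) for a in range(2, 10) for b in range(2, 10)]

    def next_fact():
        # Facts sitting in lower boxes are more likely to come up.
        weights = [1 / boxes[f] for f in facts]
        return random.choices(facts, weights=weights, k=1)[0]

    def record_answer(fact, correct: bool):
        # A correct answer promotes the fact; a mistake sends it back to box 1.
        boxes[fact] = min(boxes[fact] + 1, 5) if correct else 1

    a, b = next_fact()
    record_answer((a, b), correct=True)
    print(f"{a} x {b} is now in box {boxes[(a, b)]}")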
Of course if you use ChatGPT it will not work, but there is no way Claude Code/OpenCode with any modern model isn't able to generate a one-hundred-line script on the first try.
I was asking which tool, not which model.
For the same model, you can just have it generate dad jokes or use it in a tool like OpenCode or Cursor or Zed or Cline or … and make it program complex things.
If I use Claude Sonnet on duck.ai I will have a hard time generating something interesting. The same model in OpenCode does all my programming work.
I mean, the person above complaining about it not being able to create a simple thing is absolutely holding them wrong! They aren’t feeding the right context, aren’t using the correct tools or harnesses, who knows. But the problem exists between keyboard and chair, so to speak.
I’m constantly amazed at the amount of scope I can now one-shot with Claude Code. It can crank out multi-command CLI apps with almost zero hand-holding beyond telling it what to generate… you know, the hard part. And then we go back and forth to refine the working thing it built.
> isn't significantly better than it was 3 years ago.
Eh?
Ever hear the saying that the first 90% of a program is 90% of the work, and the last 10% is also 90% of the work?
AI/LLMs have improved massively in that regard. And that's not even counting the other model types, such as visual/motion-visual/audio, which are at the point where telling their output from reality is a chore.
And one-shotting a simple script simply doesn't mean much without context. I have it produce relatively complex PowerShell scripts often enough, and it's helped me a lot with explaining scripting actions to other humans, where before I'd make unwarranted assumptions about the other user's knowledge.
The biggest grift is invested tech bros trying to sell you on the idea that AI growth is linear or even exponential.
In reality it's logarithmic, maybe with the occasional jolt. You'd think that with Moore's "law" we'd know better by now that explosive growth isn't forever, or at least that physics eventually caps it.