I'm surprised by the almost universally bearish opinion here so far. AI is going to change everything about how we interact with computers. An M4 with a terabyte of RAM will run GPT-4 level models locally. Forget Siri. You will be able to converse directly with your computer as if it were a person. Not only in text, but by voice and even video, and it will be incredibly responsive. The computer will see using the webcam and have conversations about objects you show it. It will read your screen and give you advice about what you're working on, and even perform tasks for you. It will browse the web for you, summarize content, and take actions on your behalf. It will create images of anything you describe, and compose personalized music in any style. It will render an avatar for itself and show emotions, and it will see and react to your facial expressions and gestures.
This is not speculative. Almost all of this has already been shown to work, and it is all improving incredibly quickly as we speak! It just needs to be assembled into an actual product, which is exactly what Apple does best. With their silicon advantage they are best positioned to ship these kinds of features, and the best part is it can all run offline, locally. No data sent to servers. No privacy concerns. Zero network latency.
> M4 with a terabyte of RAM will run GPT-4 level models locally.
No it won't. Apart from the fact that (if the article is correct) the M4 will support a maximum of 512 GB (and I don't want to know how many thousands of dollars that will cost, given how much Apple charges for the difference between 8 GB and 16 GB MacBooks...), GPT-4-level models that you could run locally don't exist, and creating them would be a challenge in itself. The closest, Claude 3 Opus, as far as I can tell can't be run locally, and it is in nobody's interest to change this status quo (less capable models = run locally, powerful ones = pay us).
I do hope this changes one day, but it is not directly related to M4 (or M5/M6).
Apple can create them. They have the resources and can hire or acquire the expertise. And 512 GB should be more than enough for GPT-4 performance. We have invented a whole lot of techniques to reduce RAM requirements since GPT-4 was trained. Of course, GPT-5 and GPT-6 will come out and be better, but even just GPT-4 level performance will be enough for useful applications. And we will undoubtedly continue to discover ways to get better performance with less RAM. We've barely started optimizing these things.
I'm personally looking forward to it. One of the biggest advantages of running locally is cost. Yes, the upfront cost of getting your own hardware is much higher, but I'd much rather pay for something once than worry about paying a monthly bill for using someone else's infra that they may at any time decide to pull out of service or hike the price for.
The "no data sent to servers" point I agree is super important for work things. I pay for and use Github Copilot for personal things but I'm not allowed to use it at work for obvious reasons. If we could remove the privacy concerns and run these models on local hardware (that people actually have) then it'd be game changing.
Reading some of the other anti-AI comments is a bit odd. I guess it's going to depend on what you do for work, but as someone who writes software, the difference between using Copilot and not is like the difference between being a 2x developer and a 0.25x developer. I mean that when I say it, though I get that you do need to be able to identify wrong code faster than you can run it.
I don't really want any of those things. Predictability and control are what I want from a computer, not an ersatz companion that back-seat-drives everything I do.
Like many, I am not overly bearish "on ai" - I am bearish on an "ai-focused laptop in late 2024." I think what you say is roughly true and maybe in six months I'll be wishing my portable computer ran AI faster, but I don't feel that way now. As you say, the full client-side experiences are still in development and, I would wager, more than six months away. Other attempts at "local AI" from less skilled hardware designers have not impressed me and, though I'm eager to see what Apple comes up with, I see no reason to get excited yet.
I agree that the most compelling applications are more than six months away. But I hope that a computer I buy now will last a lot longer than that. I don't upgrade my computer every two years anymore like in the 90s, and I don't want to go back to that. Three or four years from now I think you'll be feeling pretty left out if your computer can't run much AI.
It's relative though. I mean, how excited are you for an M4 without LLM capabilities? Or for an iPhone 16 that's basically the same as the 15 or the 14 with better specs?
I agree with what you are saying, but I think we will have to wait two or three years for the very nice scenario you laid out for us.
Earlier this year I popped for an M2 Pro with 32 GB. I am amazed at what even that can run, but I have to use Ollama to run individual models: one for general text processing and NLP, one for medical advice, one vision model, and one large uncensored model to get a sense of the general utility of censored vs. uncensored models (I wish all the models I used were uncensored and trained just on English; I have only used non-English language support for a handful of interactions).
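In case it's useful, here's roughly what juggling one-model-per-task looks like against Ollama's local REST API; a minimal sketch assuming `ollama serve` is running on the default port and the models are already pulled (the model names are just illustrative examples, not the exact ones I use).

```python
# Minimal sketch, assuming an Ollama server is running locally on the
# default port (11434) and the named models have already been pulled.
# Model names below are illustrative examples only.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def ask(model: str, prompt: str) -> str:
    """Send one prompt to one locally hosted model and return its reply."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Different models for different jobs, loaded one at a time on a 32 GB machine.
print(ask("mistral", "Rewrite this paragraph in plain English: ..."))
print(ask("llava", "What objects would you expect in a photo of a kitchen counter?"))
```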
I also want Apple to develop very strong support for cloud models for my Apple Watch. With the excitement and disappointment for the new AI alternative pin computers, etc., I say we already have something useful in the Apple Watch so just keep improving that. I like to go about my day phone-free and the Apple Watch makes that work out OK for me.
We have not reached the "Peak of Inflated Expectations" yet in AI. Contrast that to Blockchain and Cryptocurrency, where we are definitely in the "Trough of Disillusionment".
> Contrast that to Blockchain and Cryptocurrency, where we are definitely in the "Trough of Disillusionment".
Bitcoin has been hovering around its all-time high of around $74,000 and is still expected to continue to increase, especially now that several ETFs have been approved.
I really, really hope that the future you describe doesn’t come to pass. It sounds like hell.
There is an aspect of personal bias for sure: I don’t want my profession or the countless related fields to be devalued, I dislike the imprecision of LLMs, and they spark none of the joy in me that programming does.
But there’s something beyond that I find difficult to put into words. There is a tremendous inelegance to it all, in the same way a brute force solution feels fundamentally unsatisfying.
Most software today only uses a tiny fraction of the capabilities of the hardware, because we optimise for how fast we can implement marketable (and often hostile) features. LLMs and co strike me as dialling that up to 11; why improve anything when we can just throw more hardware at the stuff that sucks?
I like browsing the web god damn it, and I don’t need some corporate facsimile of a human to ask me why I don’t look happy.
> we optimise for how fast we can implement marketable (and often hostile) features.
I actually think local LMs are the antidote for this. Every UI I use these days is optimized by designers for attractiveness and hordes of PMs for business KPIs, not engineers for useful functionality. With local LMs I can take back control over the UI I actually use. The LM can deal with whatever trash UI redesign is being forced on me, and present a much simpler UI that matches my specific needs.
For example, it can summarize the news and I never have to visit a news site and be assaulted by dark patterns. It can collect family photos from Facebook and show me a few without random TikTok videos inserted between them. It can trawl Amazon and Target and Walmart to find a shortlist of products for me to pick from, filtering the trash. It can watch that 10 minute long YouTube video past the sponsored segment to find the nugget of information I need to fix my dishwasher or whatever.
We're a long way from this now. But I see a path there, and a world where users have control over the UIs they use is a better world.
We very much differ in optimism I think. I expect model-driven UIs to end up the same as software and services today: you can use open, non-hostile options if you go out of your way, but most people use the offerings from the big corporations and if you want a friction-free experience you have to as well.
I would love to use a third-party Instagram app that only shows posts from accounts I follow, but I can’t because Facebook has locked that down. I can just as easily see a future where I can summarise my feed, but only if I use the Metabot or whatever which is just implementing some different dark pattern.
Although that does bring me back somewhat to my initial point in that it all feels like brute forcing our way out of problems instead of solving them. We could have a Facebook that shows a nice summary of your family photos, we could have a shopping search engine that doesn’t include the trash, we could have short and to the point videos that aren’t optimising for the engagement flavour of the month.
We don’t because there are too many financial incentives pushing back and too many established players that can burn piles of money for much longer than you, and I just do not see how those same facets won’t just be translated for models.
> I would love to use a third-party Instagram app that only shows posts from accounts I follow, but I can’t because Facebook has locked that down
The point is that this is a local model. You don't have to use Meta's AI because Meta can't stop you from running a model on your own computer to browse Instagram for you and filter it however you like.
Personally I think that, if "model-driven browsing" (for lack of a better term) does take off, we'll see mitigations against it. Not unlike how we currently see mitigations (of mixed effectiveness) against ad-blockers and using things like Puppeteer as scraping tools.
I'm sorry to be so pessimistic about it, and I genuinely do hope we see some positives out of this stuff. But I'm becoming increasingly disillusioned with what so much of the software industry has become, and seeing how much of the AI space is led or backed by the tech giants, I can't help but assume the worst.
I would like something like Perplexity Pro that I could run locally. I have little pieces in the LangChain/LlamaIndex book I wrote and more little pieces in my new open source project [1]. I have become addicted to Perplexity Pro, it makes the web fun again for me, but I would like to have open source code I could run locally with similar functionality (that would be difficult because Perplexity Pro is a brilliant product).
> You will be able to converse directly with your computer as if it was a person.
No thanks, I like that my computer only speaks an incredibly specific language which I can use to tell it exactly what to do, and know it’s not going to do anything else. If I wanted a PA, I would hire one.
I agree. I just still don’t believe in Apple’s marketing in regards to AI. Past M chips were way slower than consumer Nvidia RTX chips. Their only advantage was the available memory. I’m speaking as a member of the Apple cult.
Honestly I agree that the current user interface of Udio and Suno is pretty lame. But it shouldn't take much imagination to see that it won't always be this way. Now that the core capability of generating music is there, a better UI will come. It will be interactive, it will be fun, it will be powerful and sophisticated. It's only a matter of time.
I recently needed a computer to run Fusion 360. Options were Mac vs Windows.
I could do an SFF PC or a Mac mini. After looking at the paltry memory/storage Apple chose to give the mini, I ended up going with an AMD system with 96 GB of RAM and 4 TB of SSD storage for less than a 16 GB Mac mini.
For context, if you get the beefiest MacBook Pro right now you'd pay an extra $1,000 to upgrade from 48 GB to 128 GB of RAM. For lower-end machines you pay even more per gigabyte: $200 for each 8 GB you add on top of the 8 GB baseline.
But that's if you want Apple to do it for you. What if you buy the memory separately and put it in yourself? I'm considering doing that for my own iMac, and I saw that a 32 GB memory module is about $65, so four of those would be $260. For laptops it's probably different, but surely it's going to be less than $1,000.
Can't add RAM to any Mac (except the Mac Pro, I think). You have to buy it from Apple with all the RAM it'll ever have, which is why they overprice the upgrades so much.
You can't anymore, none of the Apple Silicon Macs have upgradable memory. If you have an older Intel Mac then it might be possible depending on the exact model.
I am guessing they will be working to bring down those prices given how much of an advantage large RAM is for AI. Maybe they could even bring back RAM slots. Although I guess that wouldn't make sense given that bandwidth is also important and HBM doesn't go in slots.
At Mac Pro pricing (128 GB = $1,600), a terabyte would be $12,800.
Of course, I imagine there would be a significant discount in practice, even just considering the price of high-end branded memory outside Apple.
Still, Apple has shown no sign of abandoning 8 GB for regular people. Even if they switch by the end of the year on their high-end models, they have a serious handicap in their own installed base.
Waiting for the EU to force PCs to ship with at least 16 GB of RAM, and then for Apple to immediately announce how they were planning this whole time to double the starting RAM on their base machines, and then turn it into a 2-hour presentation about innovation, sustainability, and aesthetics.
> Of course, I can imagine that there will be a significant discount even just considering the price of high end branded memory outside Apple.
Apple wanted $3,000 for 160GB of memory for my Mac Pro (i.e. going from a base 32 to 192GB).
OWC sells the same memory (same spec, same manufacturer) now for $620, though I believe I paid about $1,000.
They also wanted another $3,000 for an 8 TB SSD. Similarly, I was able to buy a 4x M.2 PCIe card and populate it with 4 x 2 TB SSDs for under $1,500. Furthermore, my PCIe setup is FASTER than the Apple drive (7 GB/s versus 5.5).
Was looking at purchasing a PC laptop for a staff member today and was surprised just how much better Apple products are in the current M generation of devices. Better screens, no fans, better battery, better performance … it's surprising there is not much competition. If you take a Dell XPS and add an OLED screen and higher resolution, you quickly find yourself near $2,000 with a device that still has less battery life, runs hotter, and although it may match or exceed raw numeric performance, is still arguably not as versatile or as good a machine.
The battery life alone is worth kitting out your team with them, to be honest. My Windows work laptop is dead in under 2 hours. My M1 MBP will last all day, 8-9 hours+ under normal loads. The only app that eats the battery is Football Manager, and even then it lasts 3-4 hours.
Modern Zen 4-based laptops can get 15 hours. My 2019 MBP was quoted as 10 hours but gets under 2... Why? Because it's more software than hardware that is the issue. My MBP runs out of battery quickly because Outlook (or Chrome) uses too much CPU, so it drains quickly and runs hot. You get the 10 hours only if the SW lightly loads the CPU. This is the same for Zen and Apple Mx CPUs: if you run exactly the same apps you will get similar battery life... The modern Zen parts are very efficient CPUs.
Usually for me the culprit for poor battery life is MS Office apps, but under macOS I've seen the window manager consume 90% CPU regularly.
Long story short, buggy SW dictates battery life on modern laptops, with very little between Apple and Zen (can't speak to modern Intel CPUs, I don't have a laptop with one).
Your 2019 MBP is running Intel chips, not at all comparable to the 2020-onwards M processors. I also had a 2019 MBP kitted out as a work computer before and it barely lasted 3h on a normal workday (meetings, compiling code, running Docker, many, many tabs open on browsers, etc.); my work M1 Max lasts anywhere between 8-12h on a heavier workload than what I worked with in 2019 including all I mentioned in the parenthesis.
So yes, software is the culprit but the M-series are much better at managing the same software (and even heavier ones) that I run, for the same job.
I'm well aware it is Intel, but it was still sold at the time as having 10-hour battery life, and I'm sure it can achieve that when nothing is running but one tab of Safari.
So my point, in agreement with you, is that when using similar SW (buggy or not) a modern Zen 4 will have similar battery life to an M1/M2.
Also, I can run Mixtral 8x22B right now on my MacBook. Highly capable, local, (preferably uncensored) models are phenomenal for my productivity. Nothing else comes close to matching Apple silicon's up to 192GB of VRAM for its price.
I agree the lowest specced fanless MacBook Air is very appealing at its price point. But for laptops with fans, if you want to do gaming or AI the performance of a laptop with Nvidia will blow away anything from Apple, and the 2024 version of the ROG Zephyrus G14 looks pretty good.
And catastrophically slower. I think a lot of people who are downloading llama.cpp to run on their M-series MacBook Pro or spending $$$ on high-RAM Mac Studios for AI are not actually aware that Nvidia GPUs are often literally ten times faster for the same applications. You'll almost always be better off either running on a local gaming GPU for things that fit or renting GPUs in the cloud for things that don't, just because the performance gap is so massive.
Excuse me but what does this have to do with the next gen AI chips topic?
Are there any threads left on HN that can stay on topic and not be invaded by the same ol' beaten-to-death "muh Apple M1 so good, PCs so bad" trope? It's been 4 years since the M series launched, so unless people have been living under a rock, everyone already knows what these chips can do. What exactly does your pointless shopping anecdote bring to a thread about next-gen AI chips?
And FYI, not everyone is a web dev. Many many jobs will require you to use tools exclusive to Windows or Linux hence why the market for expensive PC laptops is still very large despite being worse on paper than MacBooks.
macOS doesn't solve everyone's needs no matter how good their chips are; if the SW you need doesn't run on it then it's useless, so Apple is no threat to that market.
How is his comment helping the Intel engineers build better CPUs? Does he think that Intel and AMD have already solved the immense challenges of designing and building a better CPU but were just waiting for roody15's low-effort comment to finally give the green light to ship them?
Apple seemed to manage just fine. You're saying there's nobody else out there that can do the same? And what's your solution? Suffer in silence? Or do you think yelling at other people on an online forum is somehow more productive?
You must live in some cuckoo land of privileged entitlement if you think that not using Apple M laptops somehow equates to "suffering". Go out and get some perspective on what suffering actually feels like. I'm ending my talk with you here since your comments are of even lower quality.
Yeah... hopefully the upcoming Qualcomm Snapdragon X chips get integrated properly by someone so there's actual competition. Mobile data in my laptop also sounds like it would be pretty nice. Not holding my breath though, Microsoft may still mess it up on the Windows side :(
Lenovo and/or Qualcomm managed to mess things up enough on the Thinkpad x13s that the cellular connectivity was provided by a M.2 card rather than the Snapdragon SoC.
Macs are better but absurdly expensive and I've only ever had an awful customer experience with them. For a daily development laptop, I don't think it's worth the extra money over Lenovo and I don't want to support Apple.
I do think it would be cool to have an M_x chip to try and run ML stuff on.
> Apple also plans to add a much improved Neural Engine that has an increased number of cores for AI tasks.
This makes total sense but does it? Does Apple really have an application roadmap to ultimately utilize these tensor cores?
For all the M1, M2, M3s out there, Siri still sucks. We keep seeing latest arXiv papers [1] coming out of their research efforts hinting at potential improvements but there is not much in those papers that needs to wait for an M4 or M5...
I think their main AI case will be photo and video editing or maybe some LLM bots to help your organize and search through files, but Siri feels really abandoned.
IIRC their Pixel 8 chips are so underpowered that they can't even run Google's smallest LLM models locally on device, so despite their whole AI hype sales pitch, everything on the Pixel 8 is still run in the cloud.
How is Google so slow at this? At least the likes of Intel and Microsoft have an excuse, they're old and crusty corporations that move slow.
It's instructive to know where corporations make their money. Because that's often the only place they innovate. For Google it's web/ads. For Apple it's hardware.
> I think their main AI case will be photo and video editing
Sure, AI features will come to all of their content creation apps at some point, most of it being processed on-device.
But I suspect the most impact will be in other apps and areas of the operating systems that typically haven't gotten a lot of attention traditionally.
I'd much rather have generative AI or an intelligent agent to deal with email than filling in backgrounds in images. Because the pertinent information is in my calendar, address book, etc., I should be able to say "email Tom I can't make it to the party" and that email is created for me.
Apple has a fairly severe case of Apple Maps-related PTSD.
They aren't likely to introduce a half-baked product in a rush to market like everyone else is doing.
Apple is unlikely to release a fundamental modern-ML update to Siri without it both breaking new ground in features and being considerably more "safe" to use than other products ("safe" in the sense of keeping Apple out of the headlines).
I think that Apple's biggest sell with this stuff is going to be utilizing all the SQLite databases on macOS, iOS, and iPadOS, along with all the on-device analytics that are constantly running. I have a feeling that we're going to see Journal become more of a central focus over the coming years.
Question: Is it illegal for apple to do something on-device that runs queries against all the databases for the user's applications?
I have a hunch that Apple hasn't released a folding iPhone simply because they don't want to sell a phone with a crease in the screen. Similarly, they might be sticking with the current Siri design until some LLM flaws can be ironed out.
I am not an Apple user though, so I have very little experience with Siri.
I mean, the rumours at this point are deafening that Siri is getting a complete rehaul (or an outright replacement, which probably amounts to the same thing) at WWDC in just two months, so I'd reserve judgment on Apple's AI deployment capabilities until then.
> Surely running local LLMs isn’t that big a market.
Agreed. That said, if you do AI/LLMs, few, if any, portable non-Mac systems can pull this off. The Macs just crush it when it comes to GPU and memory bandwidth performance, and they do it while sipping battery.
An LLM able to e.g. summarize the gist of a long text conversation in a group chat that I missed, able to understand queries like "what time did X say we'd meet up, and how long would it take me to get there" etc. would be incredibly useful to me.
Sure, most of that is possible server-side – but the appeal would be that local processing could preserve a lot more privacy.
Right now there's not, but the market for smartphones was pretty small until the iPhone came out. If anybody could create a market for something, I'd believe Apple could do it.
Is the lack of an LLM Siri because the on-device compute would totally kill battery life? And is battery life a sacred criterion that is unacceptable to regress on for mobile?
> This makes total sense but does it? Does Apple really have an application roadmap to ultimately utilize these tensor cores?
Apple always does things in a holistic way. They wouldn't put these AI cores in their devices if they didn't also have a software strategy to go along with it.
Unlike with most companies, AI won't be bolted on; it'll be well integrated into the operating system.
For example, they've been working on LLMs[1]:
> We demonstrate large improvements over an existing system with similar functionality across different types of references, with our smallest model obtaining absolute gains of over 5% for on-screen references. We also benchmark against GPT-3.5 and GPT-4, with our smallest model achieving performance comparable to that of GPT-4, and our larger models substantially outperforming it.
And it's not going to require a terabyte of RAM either.
You can read the details: "ReALM: Reference Resolution As Language Modeling" [2]
There's also evidence of an AI browsing assistant for Safari. [3]
The M-series is just so phenomenal for AI. Mixtral is my MacBook's daily driver, and I'll try the 8x22B variant soon. Being capable of that on a $3k-$4k laptop still blows my mind. I'm glad that Apple's leaning into AI.
(Aside: Apple's MLX framework is showing nice performance gains over PyTorch on their silicon. I'm excited to see that project evolve.)
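(For anyone who hasn't tried it, the mlx-lm helpers make this pretty painless. Here's a minimal sketch, assuming the mlx-lm package is installed on an Apple silicon Mac; the model repo name is just an illustrative 4-bit community conversion, swap in whichever MLX model you like.)

```python
# Minimal sketch, assuming `pip install mlx-lm` on an Apple silicon Mac.
# The model name is an illustrative 4-bit conversion from the
# mlx-community Hugging Face org, not a specific recommendation.
from mlx_lm import load, generate

# Downloads (on first run) and loads the quantized weights into unified memory.
model, tokenizer = load("mlx-community/Mixtral-8x7B-Instruct-v0.1-4bit")

# Generate a short completion entirely on-device.
reply = generate(
    model,
    tokenizer,
    prompt="Explain why unified memory helps local LLM inference.",
    max_tokens=200,
)
print(reply)
```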
Hope they can deliver. Right now Apple hardware is silly compared to PC+Nvidia if you wanna play around with GenAI. Both in price and performance.
Worst case, Macs end up as thin clients, with all AI running on Nvidia in the cloud. That would eat into their competitive advantage a lot, I think.
> Right now Apple hardware is silly compared to PC+Nvidia if you wanna play around with GenAI.
You mean it's silly how far ahead Apple is since they offer 192 GB of VRAM while Nvidia only allows 24 GB for reasonable prices? Or do you mean it's silly to compare <$10K Macs with >$30K Nvidia setups in the first place?
What would be silly is running a 192 GB ML model on a chip that slow. In practically every case you would be better off with a multi-GPU PC or a cloud GPU instance, simply because the performance gap is so enormously massive. You can buy a whole lot of cloud GPU hours for the price of 192 GB in a Mac, especially when you consider that you don't have to pay extra for the electricity and you don't need the very latest chips to far outperform Apple's best.
If I wanted to build a PC today that could run the big models that were released recently (For example, Mixtral 8x22B and Command-R with as little quantization as possible) what would I buy?
I predict Apple's launch events in the fall are going to be off the charts in terms of productizing AI.
I'm not an Apple fanboy - I just think Apple as a company has been prepared and thinking about this stuff for literally decades. They set the bar for user friendly products. It won't be the first time that Apple is late to the market, but they've always redefined it when they arrive, becoming the new standard. It's what they do.
I'm a firm believer in AI at the edge: Low latency, privacy, personalization, device integration and no need to invest in massive AI server farms. It's in Apple's best interest to bring AI computation to the masses.
It is possible this is the next self-driving car, or nuclear fusion.
A sudden exponential burst of improvements let everyone dream that self-driving anything was around the corner, with genuinely promising real-world results suggesting we weren't that far off.
But the last bridge turns out to be extremely slow to cross, and what was around the corner becomes a decade away.
That said, it's not fruitless. AI is useful as it is today, and even if it doesn't go pick up your kids at school in the next 10 years, it will still be useful.