This is almost exactly how Windows voice control works. It keeps dividing the screen into smaller and smaller boxes labeled with numbers that you can speak to focus into.
How is OpenAI Voice mode any different from a WhatsApp call? Ignoring the part that there is a GPU on the other side instead of a human. But what is the technical challenge in the voice call portion? It seems like that has been a solved problem for a long time now.
These guys are at the absolute frontier, why can't they rigorously find the exact weights that are causing this problem? That's how software "engineering" should work. Not trying combinations of English words and hoping something works. This is like a brain surgeon talking to his patient hoping he can shock his brain in the right way that fries the tumor inside. Get in there and surgically remove the unwanted matter!
LLMs aren’t software (except in an uninteresting obvious sense); they are “grown, not made” as the saying is. And sure, they can find which weights activate when goblins come up (that’s basic mechanistic interpretability stuff), but it’s not as simple as just going in and deleting parts of the network. This thing is irreducibly complex in an organic delocalized way and information is highly compressed within it; the same part of the network serves many different purposes at once. If you go in and delete it, you will probably end up with other weird behaviors.
Imagine someone deleting goblin neurons. In your brain.
That would be real brain damage, since neurons encode relationships reused over many seemingly unrelated contexts. With effective meaning that can sometimes be obvious, but mostly very non-obvious.
In matrix-based AI, the result is the same. There are no "just goblin" weights.
This is so extremely annoying when paired with the forced auto restarts. Here is how it works:
1. I walk away from the computer with a bunch of my tabs and programs running. I also have a couple of servers running (docker compose).
2. Microsoft decides my work is worthless and restarts the PC to install updates to fix their own shoddy programming.
3. After 3-4 restarts, it finally drops back into the login screen. So my open apps, tabs, servers are all gone, and will not be running. Basically means I cannot rely on the PC being online if I am outside.
4. And on top of that, even when I enter the password, it will pretend to log in, but stops on this spam screen with the anti-pattern "remind me later" button. Every single time. I've told them no at least 50 times. Oh, and this screen blocks every startup program from loading, even though I have signed in. So I have to clear it before docker will load.
Docker runs better on Linux. And Linux doesn't reboot unless told to do so. And you can SSH into it rather than run some sort of long-winded GUI. Oh, and it doesn't phone home all the time with 'telemetry'.
I do my work on this PC. And even though almost all of it is on WSL, the rest of the OS just works better than any Linux I've tried. And I'm no stranger to Linux. My first was the Ubuntu Dapper Drake CD they used to mail out for free. So I put up with all the abuse :(
I have a fully functional Arch Linux on a secondary SSD, but it's just a pain to deal with all the Bluetooth audio quirks, the uncanny valley GUI, incomplete app support, etc. I'll muster enough willpower one day to make Arch boot by default.
I switched to Linux in 2018, and I hear you on the quirks. But now, an LLM can fix basically any quirk you hit. I’ve been surprised multiple times. I’m also on Pop OS, which feels more batteries-included than Ubuntu.
If you disable it as outlined in the article it will not come back IME. It is ridiculous and frustrating that you have to do it, and IMO it's extremely poorly named and placed but it does work at least.
I'd say that coverage is very, very substantial, but incomplete because some games use anti-cheat that is either extremely invasive and heavily relies on Windows internals, or is anti-cheat that the devs have configured to reject running in Proton.
> basically-every-current-multiplayer-shooter is a big missing category.
Weird. I've been playing many multiplayer shooters from Proton with my Windows-using friends. I suppose this is one of those "am I friends with people who pretty much only play CoD or Fortnite?" things.
Not that I disagree with you, but some of us still use desktops because laptops don't always get the job done. That being said, how hard is it to hit ctrl-s
Have they gotten the memo reminding them “schedule and execute your updates to avoid waking up to a login screen”? I feel sorry for you having to work with clients who run production workloads on Windows 11.
I think I read somewhere that calculating and limiting cloud usage costs is a really hard problem. But I feel that if Google were motivated to do it, they can do it. It's hard, not impossible. They just don't care to solve this particular problem.
If they can COUNT it and charge based on that, that means they can count it and react.
If I, not having their budget or engineers, can have pretty much instant Prometheus event reacting to metrics, surely it wouldn't be too hard for them to have triggers like this -- somehow their AI can automatically ban people based on something, can't they do something for the customers?
It's the same fundamental problem as view counters, something Google is famously good at solving. Eventually consistent solutions are well-understood, and wouldn't have these kinds of massive cost-overruns.
It's more a problem they are incentivized to have. Open Router allows fixed wallets and doesn't run into the same problem, since it would be their money on the line if they let a user overspend their limits.
It seems hard to believe that a one-hour delay on such a counter is impossible to achieve, and one hour would reduce the risk from "catastrophic" to "serious problem" in most cases.
Also, if implementing a cap is a desired feature that justifies trade-offs to be made, then it is possible to translate the budget cap (in terms of money) back into service-specific caps that are easier to keep consistent. For example, "autoscale this set of VMs" and "my budget cap is $1000/hour", with the VM type being priced at $10/hour, translates to "autoscale to at most 100 instances". That would need dev work (i.e. this feature being considered important) and would not respect the budget cap in a cross-service way automatically, but still it is another piece in the puzzle.
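To make the translation concrete, here is a minimal sketch of turning an hourly budget cap into an instance-count cap. The function name and parameters are made up for illustration and do not correspond to any cloud provider's API; it only covers a single service with a flat per-instance hourly price.

```python
from decimal import Decimal


def max_instances_for_budget(budget_per_hour, price_per_instance_hour):
    """Translate an hourly money cap into a cap on instance count.

    Hypothetical helper: assumes one VM type with a flat hourly price.
    Decimal is used so that prices like 0.1 do not hit float rounding.
    """
    budget = Decimal(str(budget_per_hour))
    price = Decimal(str(price_per_instance_hour))
    if price <= 0:
        raise ValueError("price per instance-hour must be positive")
    if budget < 0:
        raise ValueError("budget must not be negative")
    # Floor division: a partially affordable instance does not count.
    return int(budget // price)


# The numbers from the comment above: $1000/hour cap, $10/hour per VM.
autoscale_limit = max_instances_for_budget(1000, 10)
```

The autoscaler would then enforce `autoscale_limit` as a plain integer ceiling, which is much easier to keep consistent than a running money total.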
Eh, suddenly turning off all services in your account because you hit your cap is just as much a DoS type event - just of your services, not your wallet.
So? Many would prefer a DoS-type event over spending $WHATEVER_THEIR_HARD_CAP_IS. This is kinda the definition of a hard cap, so you would place it sufficiently high that DoSing your system is indeed preferable.
Also, doing this on a per-service basis doesn't seem that far-fetched to me, so you'd only kill that service and get at least some chance that the rest of your system remains usable.
If you have an actual enforced cap, those services will be disabled until you resolve the cap - which depending on the latency for usage updates, may be hours after you pass the cap, and hours after you resolve the issue.
Or you have ‘warnings’, and your services keep working, but you spend more $$.
Previously, people seemed to be more worried about service outages than raw $$. Now it’s the other way around.
It’s a common issue with disk quotas in on-prem systems too, and they tend to cause a lot of similar types of problems in both directions.
Yeah, there's an implicit assumption of reasonability.
But a big part of the value in large clouds like GCP is the network's interconnectedness. Plus even if there was some global event that made communications impossible only for the billing service, I'd still expect charges to top out roughly proportional to the number of partitions as they each independently exceed the threshold. GCP only has 120ish zones.
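A toy simulation of that worst case, assuming every partition can only see its own local spend and stops charging once that local spend reaches the cap. The function, the step model, and the dollar amounts are illustrative assumptions, not a description of how GCP billing works.

```python
def simulate_partitioned_cap(cap, partitions, charge_per_step, steps):
    """Total spend when each partition enforces the cap on its own.

    Hypothetical model: partitions cannot talk to each other, so each
    one keeps charging until its *local* spend reaches the cap.
    """
    local_spend = [0] * partitions
    for _ in range(steps):
        for i in range(partitions):
            if local_spend[i] < cap:
                local_spend[i] += charge_per_step
    return sum(local_spend)


# 120 isolated zones, each enforcing a $1000 cap, charging $10 per step.
total = simulate_partitioned_cap(cap=1000, partitions=120,
                                 charge_per_step=10, steps=1000)
```

In this model the total is bounded by `partitions * (cap + charge_per_step)`, so the overshoot grows with the number of partitions rather than without limit.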
They charge for a lot of things "by the hour". Things like S3, load balancers, storage.
Deleting those when a customer hits a limit will lose customer data or remove things that might be hard to add back. The "I hit my AWS limit and they deleted all my data" headlines will result.
And excluding those things makes the limit soft again.
I mean yes, look at Corey Quinn [1] for example. He has built an entire career out of the fact that cloud billing trips people up.
(Generally, tech seems to skate by on creating insanely complicated things, knowing that given enough pain, people will start blogging about their solutions, i.e. effectively outsourcing the cost and effort of doing something about it.)
Tech skates by on monopoly/oligopoly power. This arises because big players are allowed to buy competitors whenever they like. And since they are already monopolies/duopolies, they have unlimited money for such purposes. Killing off WhatsApp was chump change for Facebook.
We essentially don’t have monopoly enforcement in the US anymore.
Keyboard shortcuts are truly a mess on macOS. Windows does it much better and with more consistency. That results in third-party apps also having sensible shortcuts. Example: Ctrl+G is widely used in code editors for "Goto line". On Windows it makes perfect sense to use because Ctrl+ shortcuts are used for text editing everywhere. But on macOS it is out of place, because there Cmd+ is the standard for text editing. But Cmd+G is used for some obscure find feature. So editors fall back to Ctrl+G, which is out of place.
The "goto line" feature on most Mac text editors is Cmd+L. And it's consistent.
On the Mac the Control shortcuts are used for text manipulation everywhere and they come from Emacs: C-a, C-e, C-f, C-b, C-k, etc. The Cmd key is not the standard for text editing; it is the standard for all app-specific commands. For example Cmd+I usually makes text italic in a word processor, but in a non-word processor app italic makes no sense, so for example in Finder it means bring up the inspector.
I don’t know why this comment is downvoted, but I don’t agree with this either because the OS (historical) conventions are different, and there may be unintuitive shortcuts on all OSes. What matters is consistency across applications on the same OS.
One point on macOS is that it’s very weak on keyboard-based navigation and shortcuts for apps by default (compared to Windows). Even Apple doesn’t bother with keyboard-based navigation in its own apps. One look at any app “ported” from iOS is enough. Apple hasn’t even spent time to check what the Tab key does in these apps. It’s a shame.
The "trust project" feature has been designed to be so extremely intrusive and annoying that the first thing I do is to completely disable it whenever I install VS Code on a new computer. This "solution" was just done to tick some box and put the blame on the user when a security incident happens. It's pretty similar to Windows Vista where it annoyed you with a disruptive popup so many times during the normal course of actions that most people ended up disabling the whole UAC system. Overall security goes down, and Microsoft has a nice excuse.
> It's pretty similar to Windows Vista where it annoyed you with a disruptive popup so many times during the normal course of actions that most people ended up disabling the whole UAC system.
Nothing changed post-Vista. It's exactly the same system in Windows 11 doing exactly the same thing. It did, however, get developers to change how they do things.
To be honest, the solution here is probably more dialogs like this, not less. Having one single "Trust everything here but if you don't then nothing will work" box is hardly a good way to go.
Vista's annoyance had a purpose, to get program developers to change things to run without escalation. They didn't want you disabling UAC, and these days it breaks things to disable UAC.
By only having an upfront project-wide toggle, VS Code is much worse.
Yeah imagine if at boot Windows Vista gives you the UAC "Do you TRUST all the software you are going to run today?" and if you say yes then it just allows any random code to do whatever it wants.
Yes, that puzzles me too. Not only do I not know what the author means, I'm not sure what it could mean: teaching material for wasm is generated by many independent people, each for their own tools and purposes. There is no organization behind all that, much less a philosophy.