anematode · 2026-04-06T08:02:13 1775462533

This was already posted: https://news.ycombinator.com/item?id=47047936

It contains many factual errors.

bgs_ · 2026-04-06T08:14:27 1775463267

Hey, author here. I have actually rewritten the parts you might call AI slop. I have tried to correct the text to the best of my abilities.

https://bgslabs.org/blog/evolution-of-x86-simd/#acknowledgem...

If you look there, I tried to be as transparent about this process as possible. I simply didn't know any better than to use AI to fact-check my data when I first started - which was a really bad idea and led to the horrendous outcome you've seen there. I am not trying to hide anything; I made a mistake. If you could give the article a re-read and tell me where I might have gone wrong, I would be really happy. I actually want this to be a good and useful educational resource, not AI slop.

Thank you for your time regardless.

anematode · 2026-04-06T07:10:36 1775459436

Quoting from the README:

> The entire VSCode workbench - editor, terminal, extensions, themes, keybindings — ported to run on a native shell.

but also

> Many workbench features are stubbed or partially implemented

So which is it? The README needs to be clearer about what is aspirational, what is done, and what is out of scope. Right now it just looks like an LLM soup.

chii · 2026-04-06T07:12:20 1775459540

The first sentence is aspirational, while the second is describing current state.

anematode · 2026-04-06T07:31:16 1775460676

See I thought that, but then in the putatively aspirational section it says

> 5,600+ TypeScript files from VSCode's source, ported and adapted

which doesn't really make sense as a goal?

KendallCBooker · 2026-04-06T10:45:06 1775472306

Hey, creator of Sidex here. I just updated the README to be a lot clearer. I would love your feedback on it to help clear the air.

anematode · 2026-04-06T04:35:13 1775450113

Nice post :)

Last year I was working on a tail-call interpreter (https://github.com/anematode/b-jvm/blob/main/vm/interpreter2...) and found a similar regression on WASM when transforming it from a switch-dispatch loop to tail calls. SpiderMonkey did the best with almost no regression, while V8 and JSC totally crapped out – same finding as the blog post. Because I was targeting both native and WASM I wrote a convoluted macro system that would do a switch-dispatch on WASM and tail calls on native.

Ultimately, because V8's register allocation couldn't handle the switch-loop and was spilling everything, I basically manually outlined all the bytecodes whose implementations were too bloated. But V8 would still inline those implementations and shoot itself in the foot, so I wrote a wasm-opt pass to indirect them through a __funcref table, which prevented inlining.

One trick, to get a little more perf out of the WASM tail-call version, is to use a typed __funcref table. This was really horrible to set up and I actually had to write a wasm-opt pass for this, but basically, if you just naively do a tail call of a "function pointer" (which in WASM is usually an index into some global table), the VM has to check for the validity of the pointer as well as a matching signature. With a __funcref table you can guarantee that the function is valid, avoiding all these annoying checks.

functional_dev · 2026-04-06T07:14:27 1775459667

The article shows WASM being 1.2-3.7x slower, and your experience confirms it.

Do you have any idea which operations regress the most?

anematode · 2026-04-06T07:34:11 1775460851

Based on looking at V8's JITed code, there seemed to be a lot of overhead with stack overflow checking, actually. The function prologues and epilogues were just as bloated in the tail-call case. I'll upload some screenshots if I can find them.

anematode · 2026-04-03T03:54:43 1775188483

Looks like a very sophisticated operation, and I feel for the maintainer who had his machine compromised.

The next incarnation of this, I worry, is that the malware hibernates somehow (e.g., if (Date.now() < 1776188434046) { exit(); }) to maximize the damage.

ffsm8 · 2026-04-03T10:33:07 1775212387

Isn't that already how it is?

I mean the compromised machine registers itself on the command server and occasionally checks for workloads.

The hacker then decides his next actions - depending on the machine they compromised, they'll either try to spread (like this time) and make a broad attack, or they may go more in-depth and try to exfiltrate data/spread internally if, e.g., a build node has been compromised.

anematode · 2026-04-02T03:42:49 1775101369

> But then the clean room implementations started showing up. People had taken Anthropic’s source code and rewritten Claude Code from scratch in other languages like Python and Rust.

Seems like the phrase "clean room" is the new "nonplussed"... how does this make any sense?

mergesort · 2026-04-02T04:21:05 1775103665

Heya, post author here. I think I was just wrong about this assertion. I got into a discussion with a copyright lawyer over on Bluesky[^1] after I wrote this and came away reasonably convinced that this wouldn’t be a valid example of a clean room implementation.

[^1]: https://bsky.app/profile/mergesort.me/post/3mihhaliils2y

aeternum · 2026-04-02T05:06:15 1775106375

The most fitting method would be to train an LLM on the Claude Code source code (among other data).

Then use Anthropic's own argument that LLM output is original work and thus not subject to copyright.

recursive · 2026-04-02T03:47:44 1775101664

I think it means you write a spec from the implementation. Then you write a new implementation from the spec. You might go so far as to do the second part in a "clean" room.

m132 · 2026-04-02T04:46:49 1775105209

Heh, the original being entirely vibed had me thinking of an interesting problem: if you used the same model to generate a specification, then reset the state and passed that specification back to it for implementation, the resulting code would by design be very close to the original. With enough luck (or engineering), you could even get the same exact files in some cases.

Does this still count as clean-room? Or what if the model wasn't the same exact one, but one trained the same way on the same input material, which Anthropic never owned?

This is going to be a decade of very interesting, and probably often hypocritical lawsuits.

roywiggins · 2026-04-02T03:51:29 1775101889

right. that's not what people are doing here though, at all

john_strinlai · 2026-04-02T03:52:34 1775101954

in a typical clean-room design, the person writing the new implementation is not supposed to have any knowledge of the original, they should only have knowledge of the specification.

if one person writes the spec from the implementation, and then also writes the new implementation, it is not clean-room design.

post_below · 2026-04-02T03:58:35 1775102315

I believe the argument is that LLMs are stateless. So if the session writing the code isn't the same session that wrote the spec, it's effectively a clean room implementation.

There are other details of course (is the old code in the training data?) but I'm not trying to weigh in on the argument one way or the other.

anematode · 2026-04-02T00:43:51 1775090631

Arguably, an even worse day to release it ;)

DANmode · 2026-04-02T03:08:08 1775099288

anematode · 2026-03-30T21:47:23 1774907243

Ya, I tend to believe that (most) human VR will be obsoleted well before human software engineering. Software engineering is a lot more squishy and has many more opportunities to go off the rails. Once a goal is established, the output of VR agents is verifiable.

anematode · 2026-03-30T15:35:07 1774884907

Definitely. As an extreme but fun example... in one project I had a massive hash map (~700 GB or so) that was concurrently read from/written to by 256 threads. The entries themselves were only 16 bytes and so I could use atomic cmpxchg, but the problem I hit was that even with 1GB huge pages, I was running out of dTLB entries. So I assigned each thread to a subregion of the hash table, then used channels between each pair of threads to handle the reads and writes (and restructured the program a bit to allow this). Since the dTLB budget is per core, this allowed me to get essentially 0 dTLB misses, and ultimately sped up the program by ~2x.
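For anyone who wants the shape of that restructuring, here is a toy sketch in Python. It is not the original code, and it only shows the ownership structure, not the dTLB effect, which needs native code pinned to cores. Assumptions: the names `Owner` and `DelegatedMap` are made up for illustration, each owner has a single inbox queue rather than one channel per pair of threads, and 4 owners stand in for the 256 threads.

```python
import queue
import threading


class Owner(threading.Thread):
    """Owns one shard of the table; only this thread ever touches its dict."""

    def __init__(self):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()  # requests from other threads arrive here
        self.shard = {}

    def run(self):
        while True:
            msg = self.inbox.get()
            if msg is None:  # shutdown sentinel
                break
            op, key, value, reply = msg
            if op == "put":
                self.shard[key] = value
                reply.put(None)  # acknowledge the write
            else:  # "get"
                reply.put(self.shard.get(key))


class DelegatedMap:
    """Routes every operation to the thread that owns the key's shard."""

    def __init__(self, num_owners=4):  # hypothetical count for the sketch
        self.owners = [Owner() for _ in range(num_owners)]
        for owner in self.owners:
            owner.start()

    def _owner(self, key):
        # The key's hash picks the subregion, and therefore the owning thread.
        return self.owners[hash(key) % len(self.owners)]

    def put(self, key, value):
        reply = queue.Queue(maxsize=1)
        self._owner(key).inbox.put(("put", key, value, reply))
        reply.get()  # wait for the acknowledgement

    def get(self, key):
        reply = queue.Queue(maxsize=1)
        self._owner(key).inbox.put(("get", key, None, reply))
        return reply.get()

    def close(self):
        for owner in self.owners:
            owner.inbox.put(None)
        for owner in self.owners:
            owner.join()
```

Because each shard is only ever touched by its owning thread, no locking or atomics are needed on the table itself; the cost moves into the message passing.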

senderista · 2026-03-30T16:57:09 1774889829

The "delegation pattern" for datastructures:

https://timharris.uk/papers/2013-opodis.pdf

anematode · 2026-03-30T16:58:48 1774889928

ah! I thought I was being original :)

anematode · 2026-03-30T10:27:17 1774866437

> Strawberry uses separate renderer processes for settings pages, modals, dropdowns, and other UI components.

Erm. Why? Svelte or not, is this program expected to run on anything but the latest and greatest hardware?

anematode · 2026-03-29T00:20:07 1774743607

Very nice :)

For a while I've been annoyed that esbuild, which is written in Go, eschews these APIs to detect changes in watch mode and instead continually polls the filesystem: https://github.com/evanw/esbuild/issues/1527#issuecomment-90.... It actually consumes quite a bit of battery, so I might fork it and apply this post's implementation!
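To make the cost concrete, here is a minimal sketch of polling-based change detection in Python. This is not esbuild's actual code (esbuild is written in Go); the function names `snapshot` and `changed_paths` are made up for illustration. The point is that a poller has to `stat` every watched path on every tick, even when nothing changed, whereas an OS notification API lets the process sleep until there is an event.

```python
import os


def snapshot(paths):
    """Map each path to (mtime_ns, size), or None if it does not exist."""
    state = {}
    for path in paths:
        try:
            st = os.stat(path)
            state[path] = (st.st_mtime_ns, st.st_size)
        except FileNotFoundError:
            state[path] = None
    return state


def changed_paths(old, new):
    """Return the sorted paths whose recorded state differs between snapshots."""
    return sorted(path for path in new if old.get(path) != new[path])
```

A watch loop would call `snapshot` on a timer and compare it with the previous result, so the work done per tick grows with the number of watched files regardless of how many actually changed.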