
> reduces the amount of dead boilerplate code other languages struggle with.

given that most of the things added seem more inspired by other languages than "moved over" from F#, the "other languages struggle with" part doesn't make that much sense

like some languages which had been ahead of C# and made union types an "expected general purpose" feature of "some kind":

- Java: sealed interfaces (on a high level the same as this C# feature, details differ)

- Rust: its enum type (but better at reducing boilerplate due to not needing to define a separate type per variant, while still being able to do so if you need to)

- TypeScript: untagged sum types + literal types => tagged sum types

- C++: std::variant (let's ignore raw union usage, that is more a landmine than a feature)

either way, great to have it, it's really convenient to represent a `TYPE is either of TYPES` relationship. Which is conceptually very common, and working around it without proper type system support is annoying (but very viable).
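For illustration, a minimal Rust sketch of such an "either of" relationship (the `Shape` type and its variants are made up for the example):

```rust
// A `Shape` is either a circle or a rectangle: a classic sum type.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

fn area(s: &Shape) -> f64 {
    // The compiler checks that every variant is handled.
    match s {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}

fn main() {
    println!("{}", area(&Shape::Rect { width: 2.0, height: 3.0 })); // prints 6
}
```

The point is that "a `Shape` is either a `Circle` or a `Rect`" is stated once, in the type, and the compiler enforces it everywhere.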

I also would say that while it is often associated with functional programming, it has become generally expected even if your language isn't functional. Comparable to e.g. having some limited closure support.


> big bag of stuff, with no direction.

also called a general purpose, general style language

> that still can't really be done in C#

I would think about it more as them including features other general purpose languages with a "general" style have adopted than "migrating F# features into C#"; as you have mentioned, there are major differences between how C# and F# do discriminated sum types.

I.e. it looks more like it got inspired by its competition, e.g. Java (via sealed interfaces), Rust (via enums), TypeScript (via structural typing & literal types) etc.

> Was a functional approach really so 'difficult'?

it was never difficult to use

but it was very different in most aspects

which makes it difficult to push, sell, adopt etc.

that the maybe most widely used functional language (Haskell) has a very bad reputation for being unnecessarily complicated and obscure to use, with a lot of CS-terminology/pseudo-elitism gatekeeping, doesn't exactly help. (Also, to be clear, I'm not saying it has these properties, but it has the reputation, or at least had that reputation for a long time)


"reputation for being unnecessarily complicated and obscure to use, with a lot of CS-terminology/pseudo-elitism gatekeeping, doesn't exactly help"

Probably more this than any technical reason. More about culture and entrenched viewpoints.

I don't want to get into the objects/functions wars, but I do think pretty much every technical problem can be solved better with functions. BUT it would take an entire industry to re-tool. So I think it was more about inertia.

Inertia won.


it's basically `union <name>([<type>],*)`, i.e.

=> a named sum type implicitly tagged by its variant types

but not "sealed", as in there are no artificial constraints like the variant types needing to be defined in the "same place" or "as a variant type"; they can be arbitrary nameable types


they put "minimal dumb DRM" into their client (supposedly, according to the people who leaked the linked source code, not me)

easy to circumvent

but would fall under "circumventing security protections"/"hacking their API"/etc. And due to the sometimes very unreasonable laws the US has in that area they can use that to go after anyone providing a workaround.

Though that maybe won't work well in the EU; I'm not sure how much the laws have been undermined in recent years, but we had laws which made it explicitly legal to circumvent DRM iff it's for the sake of producing compatibility (with some caveats).


I think the law just says that it's legal to circumvent DRM for compatibility - they don't define DRM or compatibility. It's one of those vague laws that you only know if it matters when it gets tested in court.

yes, but ;)

tbh. it sounds a bit like he himself was somewhat of an internet troll during that time

and it might not be quite the same definition of "internet troll" I tend to use

like it sounds a lot like a definition of "troll" like

- "very vocal, convinced of their opinion, non-stop discussing(/neutral), potentially for the sake of discussing(/neutral)"

while I tend to associate more something like

- "intentionally annoying, non-stop discussing for the sake of annoying people, often using dishonest discussion techniques, potentially outright harassment"


> resulting VM outperforms both my previous Rust implementation and my hand-coded ARM64 assembly

it's always surprising to me how absurdly efficient "highly specialized VM/instruction interpreters" are

like e.g. two independent research projects into how to have better (faster, more compact) serialization in Rust both ended up with something like a VM/interpreter for serialization instructions, leading to both higher performance and more compact code size while still being capable of supporting similar feature sets as serde (1)

(in general monomorphisation and double dispatch (e.g. serde) can bring you very far, but the best approach is, as always, not the extreme. Neither always monomorphisation nor always dynamic dispatch, but a balance taking advantage of the strengths of both. And specialized mini VMs are in a certain way an extra flexible form of dynamic dispatch.)
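To make the idea concrete — this is not the actual design of either research project, just a toy sketch with made-up names (`Instr`, `serialize`) — serialization can be driven by a small instruction list that one shared interpreter loop executes, instead of monomorphized code per type:

```rust
// Toy serialization "VM": a type is described by a list of instructions,
// and one shared interpreter loop serializes any type from that description.
// (Field storage is simplified to a u64 slice; real designs use field offsets.)
enum Instr {
    U8(usize),  // emit field `i` as a single byte
    U32(usize), // emit field `i` as a little-endian u32
}

fn serialize(program: &[Instr], fields: &[u64], out: &mut Vec<u8>) {
    for instr in program {
        match instr {
            Instr::U8(i) => out.push(fields[*i] as u8),
            Instr::U32(i) => out.extend_from_slice(&(fields[*i] as u32).to_le_bytes()),
        }
    }
}

fn main() {
    // "Program" describing a hypothetical struct { tag: u8, id: u32 }.
    let program = [Instr::U8(0), Instr::U32(1)];
    let mut out = Vec::new();
    serialize(&program, &[7, 0x0102_0304], &mut out);
    assert_eq!(out, [7, 0x04, 0x03, 0x02, 0x01]);
}
```

Only the short instruction lists are per-type; the interpreter loop is shared, which is where the code-size win comes from.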

---

(1): More compact code size on normal to large projects, not necessarily on micro projects, as the "fixed overhead" is often slightly larger while the per serialization type/protocol overhead can be smaller.

(1b): They were experimental research projects; I'm not sure if any of them got published to GitHub, and none are suited for usage in production or similar.


A new Go protobuf parser [1] made the rounds here eight months ago [2] with a specialized VM that outperforms the default generated protobuf code by 3x.

[1]: https://mcyoung.xyz/2025/07/16/hyperpb/

[2]: https://news.ycombinator.com/item?id=44591605


I missed that, that could come in handy. Thanks!

It doesn't make sense to me that an embedded VM/interpreter could ever outperform direct code

You're adding a layer of abstraction and indirection, so how is it possible that a more indirect solution can have better performance?

This seems counterintuitive, so I googled it. Apparently, it boils down to instruction cache efficiency and branch prediction, largely. The best content I could find was this post, as well as some scattered comments from Mike Pall of LuaJIT fame:

https://sillycross.github.io/2022/11/22/2022-11-22/

Interestingly, this is also discussed on a similar blogpost about using Clang's recent-ish [[musttail]] tailcall attribute to improve C++ JSON parsing performance:

https://blog.reverberate.org/2021/04/21/musttail-efficient-i...


Yeah, Clang's musttail and preserve_none make interpreter writing much simpler: just make yourself a guaranteed tail call opcode dispatch method (continuation passing style works a treat here), stitch those together using Copy-and-Patch, and you have yourself a down and dirty JIT compiler.

> It doesn't make sense to me that an embedded VM/interpreter could ever outperform direct code. You're adding a layer of abstraction and indirection, so how is it possible that a more indirect solution can have better performance?

It is funny, but (like I’ve already mentioned[1] a few months ago) for serialization(-adjacent) formats in particular the preferential position of bytecode interpreters has been rediscovered again and again.

The earliest example I know about is Microsoft’s MIDL, which started off generating C code for NDR un/marshalling but very soon (ca. 1995) switched to bytecode programs (which Microsoft for some reason called “format strings”; these days there’s also typelib marshalling and WinRT metadata-driven marshalling, the latter completely undocumented, but both data-driven). Bellard’s nonfree ffasn1 also (seemingly) uses bytecode, unlike the main FOSS implementations of ASN.1. Protocol Buffers started off with codegen (burying Google users in de/serialization code) but UPB uses “table-driven”, i.e. bytecode, parsing[2].

The most interesting chapter in this long history is in my opinion Swift’s bytecode-based value witnesses[3,4]. Swift (uniquely) has support for ABI compatibility with polymorphic value types, so e.g. you can have a field in the middle of your struct whose size and alignment only become known at dynamic linking time. It does this in pretty much the way you expect[5] (and the same way IBM’s SOM did inheritance across ABI boundaries decades ago): each type has a vtable (“value witness”) full of compiler-generated methods like size, alignment, copy, move, etc., which for polymorphic type instances will call the type arguments’ witness methods and compute on the results. Anyways, here too the story is that they started with native codegen, got buried under the generated code, and switched to bytecode instead. (I wonder—are they going to PGO and JIT next, like hyperpb[6] for Protobuf? Also, bytecode-based serde when?)

[1] https://news.ycombinator.com/item?id=44665671, I’m too lazy to copy over the links so refer there for the missing references.

[2] https://news.ycombinator.com/item?id=44664592 and parent’s second link.

[3] https://forums.swift.org/t/sr-14273-byte-code-based-value-wi...

[4] Rexin, “Compact value witnesses in Swift”, 2023 LLVM Dev. Mtg., https://www.youtube.com/watch?v=hjgDwdGJIhI

[5] Pestov, McCall, “Implementing Swift generics”, 2017 LLVM Dev. Mtg., https://www.youtube.com/watch?v=ctS8FzqcRug

[6] https://mcyoung.xyz/2025/07/16/hyperpb/


You could probably rephrase almost any enum dispatch whatsoever as a "bytecode interpreter" of a sort, especially if run recursively to parse over some kind of sequence. If bytecode helps you achieve a more succinct representation of some program code than the native binary representation for your architecture, it makes sense that this could be faster in some cases.
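As a toy illustration of that framing (all names here are made up): an ordinary `match` over an op enum, run in a loop over a sequence, already is a small bytecode interpreter:

```rust
// The "bytecode" is just a sequence of enum values; the dispatch loop
// matching on them is the interpreter.
enum Op {
    Push(i64),
    Add,
    Mul,
}

fn eval(ops: &[Op]) -> i64 {
    let mut stack = Vec::new();
    for op in ops {
        match op {
            Op::Push(v) => stack.push(*v),
            Op::Add => {
                let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
                stack.push(a + b);
            }
            Op::Mul => {
                let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
                stack.push(a * b);
            }
        }
    }
    stack.pop().unwrap()
}

fn main() {
    // (2 + 3) * 4
    assert_eq!(eval(&[Op::Push(2), Op::Push(3), Op::Add, Op::Push(4), Op::Mul]), 20);
}
```

A few enum values per operation can be far denser than the equivalent native code, which is exactly the icache argument from upthread.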

> Also, bytecode-based serde when?

Not in serde itself, but people have been experimenting with serde alternatives that are bytecode based. Nothing stable as far as I know, just experiments.

One early experiment is described in https://sdr-podcast.com/episodes/a-different-serde/ , which used a FORTH inspired stack based VM generated by derive macros at compile time (iirc).

Another experiment is https://lib.rs/crates/facet which takes a more general approach: derives generate compile-time introspection metadata tables.


sure, you could probably even get by with a bit less,

but I still would recommend 6 GiB.

no matter the OS

the problem here is more the programs you run on top of the OS (browser, electron apps, etc.)

realistically speaking you should budget at least 1 GiB for your OS even if it's minimalist, and to avoid issues make it 2 GiB of OS + some emergency buffer, caches, load spikes etc.

and 2GiB for your browser :(

and 500MiB for misc apps (mail, music, etc.)

wait, we are already at 4.5 GiB and I still need OpenOffice ....

even if Xfce would save 500 MiB it IMHO wouldn't matter (for the recommendation)

and sure you can make it work: only have one tab open at a time, close the browser every time you don't need it, don't use Spotify or YT etc.

but that isn't what people expect, so give them a recommendation which will work with what they expect; if someone tries to run it with less RAM it may work, but if it doesn't, at least it isn't your fault


neither, they didn't measure anything

they compared the Ubuntu minimal recommended RAM to Windows absolute minimal RAM requirements.

but Windows has monetary incentives (related to vendors) to say it supports 4 GiB of RAM even if Windows runs very poorly on it; on the other hand, Ubuntu is incentivized to provide a more realistic minimum for convenient usage

I mean, taking a step back, all common modern browsers under common usage can easily use multiple GiB of memory, and that is outside of the control of the OS vendor. (1)

As a consequence, IMHO recommending anything below 6 GiB is just irresponsible (iff a modern browser is used) _no matter what OS you use_.

---

(1): If there is no memory pressure (i.e. caches don't get evicted that fast, larger video buffers are used, no fast tab archiving etc.) then having YT playing will likely consume around ~600-800 MiB. (Be aware that this is not just JS memory usage but the whole usage across JS, images, video, HTML+CSS engine etc. For comparison, web mail like Proton or Gmail is often roughly around 300 MiB, Spotify interestingly "just" around 200 MiB, and HN around 55 MiB.)


putting aside that MS is too huge to even know the names of your senior engineers across the globe, and that the mail might have gone directly to spam

there is still the issue that this might have been classified as "a crazy letter"

a lot of the article reminds me of people who might (or might not) have competency but insist they know better, are very stubborn, and are very bad at compromising on solutions. The subtext of the article is not that far from "everyone does everything wrong, I know better, but no one listens to me". If you frame it like that, it very much sounds like a "crazy" letter.

Strictly speaking it reminds me a lot of how Pirate Software spoke about various EA related topics. (Context: Pirate Software was a streamer and confidence man who got propped up due to family connections and "confidently knew" everything better while having little skill or few contributions, and didn't know when to stop having a "confidently bad" opinion. Kind of a sad ending, given that he did motivate people to pursue their dream in game design and to engage themselves in animal protection.)

Or how I did in the past. Appearing very confident in your know-how ironically isn't always good.

And in case it's not clear: the writing reminding me of it, and having patterns of someone trying to create maximally believable writing to make MS look bad, doesn't mean that he behaves like that or that the writing is intended to be seen that way.

It's more about how we have a lot of "information" which all looks very believable, but in the end we lack the means to both verify many of the named "facts" and, more importantly, judge the sentiment/implicitly conveyed information.

Especially if we just take the mentioned "facts" without the implicit messages and ignore the him<->management communication issues, I would guess a lot of it is true.


Which shifts things from "before high school" to "in primary school, and then gradually introduce aspects of it".
