It's less that the OS is written in C and more that literally every interface to anything is either written in C or isn't meant to be used from multiple languages. C is the de facto standard for interacting with anything not written in your language. Window management, I/O, graphics: everything has a C interface. There aren't even any alternatives (besides just sending a stream of bytes through a pipe or something, which no one wants to do, for obvious reasons).
> C is the de facto standard for interacting with anything not written in your language.
Not necessarily. C is how your language interacts with the OS if you use the OS's C library. But not all languages do. Go doesn't.
The only thing you have to do to interact with the OS is make system calls. You can do that without going through the C library. It might be a PITA if you're not working on Go for Google, but if that's the real problem, that's what the author of this article should have complained about. (It's actually less of a PITA on Linux because the kernel exposes its syscall interface directly and keeps it stable; historically it was invoked via software interrupts, now via the syscall instruction. Windows is more of a PITA because the OS goes out of its way to make its syscall interface look like a C interface.)
> The only thing you have to do to interact with the OS is make system calls. You can do that without going through the C library.
If by "the OS" you mean "Linux". On Windows, calling syscalls is explicitly unsupported and Microsoft will break you mercilessly by changing syscall IDs between versions. On macOS and iOS, libSystem.dylib is the only supported way to call syscalls; syscalls are SPI and Apple will break you if you try to call them directly. BSDs likewise discourage you from calling syscalls.
The "syscalls are stable API" principle that Linux is famous for is a weird Linux-specific thing; other OS's don't do it.
To enable the same kind of container image portability between kernel versions that Linux has, Microsoft has decided to stabilise the kernel syscall ABI.
> C is how your language interacts with the OS if you use the OS's C library. But not all languages do. Go doesn't.
...on Linux. Pretty much everywhere else, libc is the way to interact with the kernel; even Go goes through libc on e.g. Darwin and OpenBSD, because Linux presenting a stable kernel ABI is actually pretty unusual.
I'm less familiar with the Microsoft ecosystem, but I'm pretty sure the same holds there; Microsoft named their libc MSVCRT.DLL, but it's still the canonical system ABI, and NT, AIUI, actively shuffles system call numbers between builds to prevent people from even trying to use them directly.
The rest, and your general point, are fair, though; there's nothing preventing the creation of a system that doesn't work like this.
No, msvcrt.dll isn't the system ABI. It's officially not even part of the OS and you aren't supposed to rely on it existing. Kernel32.dll is the primary entry point for OS things, and it's distinctly not a libc.
> Linux presenting a stable kernel ABI is actually pretty unusual.
Did you mean "portable"? Because the ABI of the Linux kernel system calls is very stable. The 32-bit syscall interface literally has not changed (except by addition) in 30 years.
The Darwin kernel is also pretty stable, except that Apple moves so fast from one architecture to the next that they drop support for old ISAs pretty quickly. I predict they'll drop support for x86 altogether in ~5 years.
> Did you mean "portable"? Because the ABI to the Linux kernel system calls is very stable.
I think the GP meant that the Linux syscall ABI is indeed very stable and always has been, as you say, but that the syscall ABI of other OSs is not. So Linux is "unusual" in that sense, not in the sense that its syscall ABI is stable now but wasn't in the past.
AFAICT, SunOS (Solaris), BSD, and Darwin have stable kernel ABIs for the core (UNIX) system calls. I've never used the Mach syscalls on Darwin, but I can imagine those changing more often, because Apple does that. I would imagine that Windows supports 32-bit syscalls still pretty well...but I stopped using Windows in 2002, so tbh, not sure.
> The official API for Illumos and Solaris system calls requires you to use their C library, and OpenBSD wants you to do this as well for security reasons (for OpenBSD system call origin verification). Go has used the C library on Solaris and Illumos for a long time, but through Go 1.15 it made direct system calls on OpenBSD and so current released versions of OpenBSD had a special exemption from their system call origin verification because of it.
read(2) is unlikely to change anytime soon on macOS, but Apple can and does change the numbering of lesser-used but still very important system calls all the time.
They also change the syscall ABI without touching the numbering; gettimeofday(2) is a pretty famous one there.
Then there’s also the Linux-specific issue of the vDSO, which is not kernel code and to which the article applies in full (see: “Debugging an evil Go runtime bug”).
Even Go gave up and switched to using libc for OpenBSD[1]. You can use syscalls on Linux because Linux, lacking a blessed libc implementation, guarantees stability of syscalls as an interface. The BSDs, afaik, do not, so circumventing libc is liable to render your binary forwards-incompatible with future kernel versions.
That isn't just about the lack of guaranteed stability; as far as I understand, it's considered a security feature. System calls not made through libc may be actively blocked, so an attacker can't just use an exploit to inject a system call into a process: the call has to pass through the libc wrapper, and in combination with ASLR that won't be trivial.
A better way of accomplishing this would have been to randomize the syscall numbers on a per-process basis, and map a "syscall number translation table" into the process when loading the executable.
Again, there's more than the OS kernel that a programming language will need to interact with. You don't go through syscalls to make an X11 window or read keyboard/mouse inputs or draw graphics. There are thousands of utilities that are meant to be usable cross-language, but in actuality use the C ABI (well, at least one of the 176 C ABIs there are).
>> C is the de facto standard for interacting with anything not written in your language.
> Not necessarily. C is how your language interacts with the OS if you use the OS's C library. But not all languages do. Go doesn't.
Go doesn't interact with anything written in a different language? I find this really hard to believe - it would result in Go not having a GUI library, not being able to use PostgreSQL, not being able to start external processes (like shells), etc.
I'm pretty sure that Go can do all those things, hence Go does interact with libraries written in a different language.
> but more that literally every single interface to literally anything is either written in C, or isn't meant to be used by multiple languages. C is the de facto standard for interacting with anything not written in your language.
Even that is not the real complaint, since the alternative to this (N different FFI-style APIs) is madness.
The heart of the complaint is "great, we have effectively a single interface to almost everything, which is probably a good idea, except that the interface sucks".
I don't disagree. I see an expectation on the part of driver/API providers that you can write (probably in C) an adapter layer that talks "glorious language" on the input side, then does the naughty bits and talks to the C API on the output side.
My position is that this is less of a burden than writing something that goes from the "glorious language" into machine code.
You gain an understanding of why it's less of a burden by writing a compiler that goes right from "expressing what you want" to "machine code that does that".
> My position is that this is less of a burden than writing something that goes from the "glorious language" into machine code.
It's only less of a burden if you're guaranteed to have a C compiler hanging around any time the C ABI breaks. If you aren't guaranteed to have a C compiler hanging around any time the C ABI breaks, then you are probably just as well off generating C-ABI compatible machine code from an ABI description written in "glorious language" and updating it when the C ABI breaks.
Non self-hosting languages distributed as source code can just lean on the C compiler for everything.
It's also quite feasible to target Linux without worrying about the C ABI, as the Linux system call interface is famously stable, but that is not true of any BSD.
> My position is that this is less of a burden than writing something that goes from the "glorious language" into machine code.
I disagree. Machine code has a small stable surface area. Compiling to a bootable image for a single-tasking machine is easier than compiling to a well-behaved userspace unix program that operates the way users expect.
I mean, a C interface is better than a machine code interface. The problem with that is that, unless you include a C compiler (or at least a thing that understands the 176 C ABIs) with your language, you can't use it. (not that machine code would be better at that, but I'm sure that if people actually intentionally attempted to make a global cross-platform ABI, they'd easily make something much better than C)
You'd obviously prefer calling into things made in your own language than some other. But, from any other language you'd much prefer a C interface over having to explicitly describe which registers & stack slots to put things in and read from.
I did this once trying to mitigate Go's FFI performance cost. It wasn't great but it wasn't awful either. If I had to do it again I'd generate the assembly trampolines.
There are a couple solid summaries in that thread based on my experience going through that exercise - thanks for the link
In case anybody wants to go down the rabbit hole of shit you should definitely not use in production (but we did anyway), this was my starting point: https://words.filippo.io/rustgo/
I did not, however, use `no_std` or avoid heap allocation on the Rust side. Everything worked great, including running a multithreaded tokio runtime.
I think that's beside his point, which I take to be: replace C with an alternative and people will complain of similar problems, since all abstractions are leaky to some extent. In other words, since you can never be immune to this "leaky" problem, you will inevitably have similar kinds of problems with any alternative. Which is why every programming language has critics.
Also, keep in mind that having alternatives isn't necessarily better, as it introduces a problem of its own: it complicates things (the article even mentions a problem of this sort). Is a world with 100 programming languages and dozens of OSs, each doing mostly the same sort of things in different ways, simple? Might the programming world be simpler if there were one language/OS/way of doing something, rather than hundreds of alternatives? (This is just food for thought to demonstrate the point, not something I'm particularly advocating.)
> Window management, I/O, graphics, everything has a C interface.
You say that like it's a bad thing. The situation has gotten much worse for third-party languages these days, with frameworks being written in JavaScript, Swift, etc., which are hard to bind to without fundamental impacts on your language design.
Yes, it is possible to make worse ABIs than C's. That's not a surprise. Doesn't mean C's is automagically the best possible. Something that, at the very least, doesn't have varying sizes for the same type between architectures/OSes, and precisely defines structure layout/padding, would be vastly better. (not saying that's a realistic thing to get at this point, but doesn't change the fact that it'd be much nicer.)
I’m not saying it’s the best thing, but I’m not sure you can get much better. A lot of things defined by the C ABI are things you’d have to do anyway if you’re compiling to native code (structure alignment, where’s the stack, what’s called-save, callee-save, where do you find the return value, etc.) Targeting the C ABI’s set of choices isn’t that different from targeting a different set of choices.
And there are gratuitous differences between C ABIs, for sure. But often there are good reasons: x86 had no register arguments because it had few registers (the same reason it had no thread pointer), RISC architectures had weird limitations on unaligned access affecting struct layout, etc.
Maybe I’m being dense. What does a good alternative look like? CORBA?
Between architectures you'd indeed need quite a few differences, but within the same architecture it'd be nice if implementations more or less agreed.
Specifically, I think it'd be very reasonable and possible to have an ABI with various floats and integers (of specified width, of course), plus pointers and structures, with consistent conservative padding on all architectures and precisely one way any given structure is laid out in memory (OK, maybe two, for little- and big-endian integers; but I think that's literally all the variation there is, and big-endian is nearly dead too).
Not the most efficient thing on all architectures, but for language inter-communication it shouldn't be too bad to lose a couple nanoseconds here and there, and I'd guess it'd outweigh the alternative of needing more generic (and thus, less optimized) code handling each individually.
The various C ABIs definitely do vary, though. And you generally know which C ABI you need to interact with (the __int128_t example from the article just seems like a bug; luckily, use of that type is rare).
You should think of C as a front-end for the computing platform you are targeting. The platform has an ABI, which you need to use to interact with the system components. So because every third-party has to speak that ABI to interface with the platform, they might as well use that ABI to interface with each other too. You can of course build your own little world within a platform, with all your binaries speaking a different ABI to each other. And that might have some advantages, but most people find those advantages outweighed by just using the same ABI as the platform.
A lisp machine, by definition, runs lisp. The ABI problem only exists in environments where multiple programming languages exist.
I doubt much non-.NET software runs on that C# OS either.
If you want multiple programming languages to be able to sanely run within a single OS (and do more than just pure computation), though, you have the ABI problem, and this is what the whole discussion is about.