Okay, this one has me laughing out loud. Of COURSE Microsoft doesn't like fork()... Windows pretty much can't do it. I'll admit, there have been a lot of times I wish there was a more streamlined way to spawn processes on Linux (particularly daemons) but when I don't have fork() I always end up missing it. I'd take this paper a lot more seriously if it came from someone with a less obvious bias.
As a Linux developer and Windows hater, I agree with Microsoft. fork() is a hack.
Of course, all Windows APIs are terrible, but that doesn't make complaints about fork() any less legitimate. The concept of establishing empty processes, instead of cloning yourself, is much more sane.
After all, the use of fork() is 99% of the time just to call execve(), and anything done in between is just to clean up the mess from fork(). Having a dedicated way to just create processes in a controlled fashion would have been better there. And, the other 1% is usually cases where pthread should have been used instead.
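For what it's worth, POSIX does have a call that goes in this direction: posix_spawn(). A minimal sketch of the "just start a process" case, assuming a POSIX system; the helper name run_and_wait is made up for illustration:

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: start argv[0] (looked up in PATH), wait for it,
 * and return its exit status, or -1 on any failure.
 * The caller never calls fork() itself. */
int run_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp returns 0 on success and an error number otherwise. */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* Treat death by signal as a failure. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The two NULL arguments are where file actions and spawn attributes would go, which is the "controlled fashion" part: you list what the child should get instead of cleaning up what it inherited.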
Cleaning up your own process between fork and exec is hard. Several programs resort to terrible hacks like force-closing everything except file IDs 0,1,2 in a loop. Or they look into their /proc directory to discover which file IDs exist, which is only marginally better. But when your process is a house of cards built on third party libraries with their own minds, there are not a lot of other options.
Use O_CLOEXEC everywhere (even third party libs). It's really annoying, but necessary. Means you need to use accept4(), dup3(), popen with an additional "e" (of course all of that needs to be feature tested, during compilation/runtime).
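A rough sketch of what that looks like in practice, assuming a system with POSIX.1-2008 support. The helper names (open_cloexec, dup_cloexec, is_cloexec) are made up for illustration, and F_DUPFD_CLOEXEC is used here as the portable cousin of dup3():

```c
/* Ask for POSIX.1-2008 so O_CLOEXEC and F_DUPFD_CLOEXEC are declared. */
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open read-only with close-on-exec set atomically,
 * so no other thread can fork+exec between open() and a later fcntl(). */
int open_cloexec(const char *path)
{
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Hypothetical helper: duplicate fd with close-on-exec set on the copy. */
int dup_cloexec(int fd)
{
    return fcntl(fd, F_DUPFD_CLOEXEC, 0);
}

/* Hypothetical helper: 1 if fd is close-on-exec, 0 if not, -1 on error. */
int is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

Something like is_cloexec() is mostly useful for auditing: call it on descriptors handed back by a library to see whether that library plays along.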
The catch is that you may not be able to control 3rd party libraries enough to be able to do all that. Thus all these annoying hacks. To me, the complexity of using fork() and the race conditions around pid reuse are the worst design problems of POSIX systems.
Win32 has the opposite semantics: the equivalent of O_CLOEXEC is the default, and the app has to request inheritance if it wants it, and this causes problems too. There should have been two flags and the application should have to specify one on every handle-/fd-creating system call. Hindsight is 20/20.
> Of course, all Windows APIs are terrible, but that doesn't make complaints about fork() any less legitimate. The concept of establishing empty processes, instead of cloning yourself, is much more sane.
I like the ease with which you can pass resources and data to the forked child from the parent, though. Otherwise I'd have to do a lot of serialization and deserialization, or use shared memory, or unix sockets to pass fds, all of which also have their gotchas and are way more complicated and error prone.
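To make that concrete, here is a small sketch, assuming a POSIX system. The helper sum_in_child is a made-up name. The child works directly on the parent's array through its copy of the address space, and only the result travels back over a pipe:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sum `values` in a forked child. The input is never
 * serialized; the child simply reads its inherited copy of the memory.
 * Returns 0 on success and stores the sum in *out, or -1 on failure. */
int sum_in_child(const int *values, size_t n, long *out)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: compute on the inherited data, send back only the result. */
        close(fds[0]);
        long sum = 0;
        for (size_t i = 0; i < n; i++)
            sum += values[i];
        ssize_t written = write(fds[1], &sum, sizeof sum);
        _exit(written == (ssize_t)sizeof sum ? 0 : 1);
    }

    /* Parent: read the result, then reap the child. */
    close(fds[1]);
    ssize_t got = read(fds[0], out, sizeof *out);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (got != (ssize_t)sizeof *out)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

With a spawn-style API the array itself would have to go through a pipe, a file, or shared memory before the child could touch it.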
> And, the other 1% is usually cases where pthread should have been used instead.
Ummmm. No. Threads are a much harder API to get right. They can work in this area, but that's not the same as saying they're right for all/most cases in this area.
I think a sizable part of that remaining 1% (if it is that low) are programs that leverage fork as the very powerful right tool for the job. Many of those also happen to be widely-used programs crucial for the operation of web services and large-data-set processing.
> Ummmm. No. Threads are a much harder API to get right.
Ummmm. No. Threading is not a hard API to get right. It's very simple: You get a new executing thread in the same memory space. You can create them whenever you like without any side-effects. Now, don't trample on your memory. Read all you want from anywhere. If you want to write to shared memory, ensure both reads and writes are behind a mutex, or learn about atomics.
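A bare-bones sketch of that rule with pthreads, assuming a POSIX system (older toolchains need -pthread when linking). The names worker and run_counter_threads and the thread and iteration counts are made up for illustration:

```c
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS 100000

static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker bumps the shared counter, always while holding the mutex. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Hypothetical helper: run the workers and return the final count,
 * or -1 if a thread could not be created. */
long run_counter_threads(void)
{
    pthread_t threads[NUM_THREADS];
    int started = 0;

    counter = 0;
    for (; started < NUM_THREADS; started++) {
        if (pthread_create(&threads[started], NULL, worker, NULL) != 0)
            break;
    }
    /* Join whatever was started; after the joins no worker touches counter. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return started == NUM_THREADS ? counter : -1;
}
```

Drop the lock/unlock pair and the increments race with each other, so the final count is no longer guaranteed to be NUM_THREADS * INCREMENTS.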
Fork(), on the other hand, is much trickier. Sure, you get a cloned memory space so you can trample all you want, but now you have to establish some form of IPC (which might itself end up requiring threading), and if you didn't fork() as the first thing in your process, you end up inheriting all sorts of state that you do not want. Threads and locks, for example, are now in limbo (depending on your unix flavor of choice), and you likely have a bunch of fd's that you did not want.
I cannot really think of any legitimate use-cases for fork() without exec(). There are legitimate use-cases for multi-process designs, but such designs are severely inconvenienced by fork(), as all they wanted to do was to start processes without inheriting state.
I also certainly cannot see any sensible argument for threading being harder than fork(), especially if you're just using it as a drop-in replacement where there will be no shared state after invocation outside of explicitly created communication channels.
Concurrency bugs can be eliminated with state of the art static analysis (see Rust, Pony) - with the exception of deadlocks, which you can easily introduce with multiple processes as well.
> I also certainly cannot see any sensible argument for threading being harder than fork(), especially if you're just using it as a drop-in replacement where there will be no shared state after invocation outside of explicitly created communication channels.
Very well said. It gets no simpler than this. I think all too often, people try to complicate things where they don't need to be. Always do the SIMPLEST thing that works well.
Well, couldn't. Whatever they're doing with LXSS and picoprocesses seems to be good enough.
I don't run Windows so I'm far from the most biased person but frankly, on the surface the fork/exec thing really does seem unnecessary and weird in the modern world, where we've come up with better ways to do concurrency than just raw threads and processes anyways.
> This syscall underlies all process and thread creation on Linux. Like Plan 9’s rfork() which preceded it, it takes separate flags controlling the child’s kernel state: address space, file descriptor table, namespaces, etc. This avoids one problem of fork: that its behaviour is implicit or undefined for many abstractions. However, for each resource there are two options: either share the resource between parent and child, or else copy it. As a result, clone suffers most of the same problems as fork (§4–5).