Hacker News

Breaking: Area man discovers that Python threaded performance sucks. Professionals urge him to use multiprocessing instead.

Snarky, yes, but nothing that wasn't known before this article.



Haha, yeah, ok, you're right, but I like how he wrote it up in a nice and methodical manner, and the post sort of gives a sense that he enjoyed doing it and writing it up, which transfers. It's a peculiar thing about benchmarking, I think: it becomes enjoyable in itself.


... for some definition of methodical that ignores multiprocessing.


Meh, mp is kind of a disgusting hack. Maybe it would have worked here, but it's so easy to run into bullshit. Off the top of my head: if you want to share a TLS connection you're obviously fucked, if you use gRPC you need to specify an env flag to be fork-safe, there's memory overhead that interacts poorly with reference-counted GC, etc.

Again, in this example I suspect things would have worked, because it's trivial code, but I wouldn't actually bet on it because I've seen MP fuck up so many times historically. Like, maybe one of those modules actually creates a static connection pool that gets messed up on mp? Who knows?

So yeah, maybe the obscene hack that mp is would have worked, or you can just use Tcl, which is sane and worked for them. A note about mp in the post would have made sense; I just won't blame someone for not wanting to deal with it.
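(For what it's worth, the gRPC fork-safety flag alluded to above is, as far as I know, an environment variable. A hedged sketch, assuming a recent grpcio, set before forking workers:)

```shell
# Assumed setup for using grpcio together with os.fork / multiprocessing;
# flag names come from gRPC Python's fork-support docs.
export GRPC_ENABLE_FORK_SUPPORT=1
export GRPC_POLL_STRATEGY=poll
```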


> if you want to share a TLS connection you're obviously fucked...

Some would argue that if you're using threads, well then you're fucked from the start, very possibly including me.

Go gets around this using goroutines, but Go isn't necessarily thread-safe per se. Further, Go also has its own bag of bugs you have to be aware of when using it.

I think maybe the true answer here is rust.


Particularly since Tcl threads don't share anything. It's one thread per interpreter.


I wonder why these are called "threads", then.

The very point of threads (as opposed to parallel processes) is the easy access to common memory and other resources.


Erlang is fairly famous for working in a similar way.


In fairness, that is in the pipeline for Python in an upcoming release. One interpreter per thread.


Not quite: subinterpreters can contain multiple threads each. But you're right that if you limit yourself to one thread per subinterpreter, you should be able to get good multicore scaling that way once more fully fleshed-out subinterpreters arrive in 3.12. The trick then becomes how to communicate between them efficiently. Interesting times ahead.


You mean multiprocessing, where communication goes either through serialization or through shared memory, where you have to use C-style programming and you lose all the advantages of e.g. a garbage collector and Pythonic data types?
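To illustrate the shared-memory path being criticized, here's a minimal sketch using the stdlib's multiprocessing.shared_memory, where you really do manage raw bytes and offsets by hand (illustrative only, not from the article):

```python
# Sketch of the "C-style" shared-memory path: a raw byte buffer with
# manual struct packing and offsets, no Pythonic objects involved.
import struct
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=8)
try:
    # Writer side: pack a signed 64-bit counter at offset 0.
    struct.pack_into("<q", shm.buf, 0, 12345)
    # Reader side (another process could attach via shm.name):
    (value,) = struct.unpack_from("<q", shm.buf, 0)
    print(value)  # 12345
finally:
    shm.close()
    shm.unlink()
```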


Uh... did you, like, actually read his code? He has no shared state across threads. It's a perfect example for multiprocessing.
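For the share-nothing case, a multiprocessing version is a near drop-in for a threaded one. A minimal sketch (hypothetical names, not the author's actual benchmark code):

```python
# Minimal sketch: independent CPU-bound workers with no shared state,
# sidestepping the GIL by running in separate processes.
from multiprocessing import Pool

def worker(n: int) -> int:
    # Stand-in for the per-thread benchmark loop; pure CPU work.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        results = pool.map(worker, [100_000] * 4)
    print(sum(results))
```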


Yeah, the article felt intellectually honest and I enjoyed it, but I was hoping for at least a footnote that multiprocessing is one solution. Async/await is another possible solution. There appears to be no shared state. But in fairness, Python's state of multiprocessing, multithreading, and async/await just leaves too many options for a general language programmer to keep track of, and perhaps it's just a lack of knowing these things exist.


The article does indeed mention:

> (And although the Python test script could run in multiple processes HammerDB is an application simulating multiple virtual users in a GUI and CLI environment requiring the close interaction and control of using multiple threads).


Eh, yes, I overlooked that, but it doesn't touch on async methods. I think there is an async Postgres connector out there.
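For the record, the async route looks roughly like this. asyncio.sleep stands in for an async driver (e.g. asyncpg) awaiting the server; the point is that many waits overlap on one thread (illustrative only):

```python
# Sketch of the async/await alternative: many concurrent "queries"
# multiplexed on a single thread while each one awaits I/O.
import asyncio
import time

async def fake_query(i: int) -> int:
    await asyncio.sleep(0.05)  # stand-in for awaiting Postgres
    return i

async def main():
    # All 20 sleeps overlap, so total wall time is ~0.05s, not ~1s.
    return await asyncio.gather(*(fake_query(i) for i in range(20)))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(len(results), round(elapsed, 2))
```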


This is the critical point that most commenters are missing or ignoring!



