Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Sorry, but that's not actually true. There are no data races, the atomics prevent that (note that there are only one consumer and one producer)

Regarding the style, it follows the "almost always auto" idea from Herb Sutter



If you enforce that the buffer size is a power of 2 you just use a mask to do the

    if (next_head == buffer.size())
        next_head = 0;
part


If it's a power of two, you don't need the branch at all. Let the unsigned index wrap.


You ultimately need a mask to access the correct slot in the ring. But it's true that you can leave unmasked values in your reader/writer indices.


Interesting, I've never heard about anybody using this. Maybe a bit unreadable? But yeah, should work :)



Nice one!


I believe ConcurrencyKit's impl does this.

https://github.com/concurrencykit/ck/blob/master/include/ck_...


Indeed that's true. That extra constraint enables further optimization

It's mentioned in the post, but worth reiterating!


Nice!

Should be able to push it more if

* we limit data shared to an atomic-writable size and have a sentinel - less mucking around with cached indexes - just spinning on (buffer_[rpos_]!=sentinel) (atomic style with proper sematics, etc..).

* buffer size is compile-time - then mod becomes compile-time (and if a power of 2 - just a bitmask) - and so we can just use a 64-bit uint to just count increments, not position. No branch to wrap the index to 0.

Also, I think there's a chunk of false sharing if the reader is 2 or 3 ahead of the writer - so performance will be best if reader and writer are cachline apart - but will slow down if they are sharing the same cacheline (and buffer_[12] and buffer_[13] very well may if the payload is small). Several solutions to this - disruptor patter or use a cycle from group theory - i.e. buffer[_wpos%9] for example (9 needs to be computed based on cache line size and size of payload).

I've seen these be able to pushed to about clockspeed/3 for uint64 payload writes on modern AMD chips on same CCD.


This was, in fact, mentioned in the article.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: