
"By default, there is no exponential and mind blowing executable size increase."

I'm sorry but that's not really true. If you compile with debugging information included (with or without optimization), GCC will give you tons and tons of template name stuff in your compiled binary. I have one program that is over 100 MB and well over half of it seems to be template-related debugging stuff. It's unlikely that you could generate so many and such long symbols by hand if you weren't using templates. If you build without debugging symbols, or strip them, or separate them out, you may have different results, but I don't think that's the most common scenario for C++ users.

You can also wind up with actual instruction bloat too: a seemingly "high performance" function written with recursive templates may wind up with dozens of kilobytes of machine code thrashing your i-cache while a perfectly good "old style" function would do the same job with a hundred bytes. Disk may be cheap, but instruction cache is not.
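A minimal sketch of that failure mode (names and sizes are mine, purely illustrative): a recursive template emits a distinct instantiation per depth, so a fully unrolled loop can swell into kilobytes of near-identical code, while the plain loop stays a few cache lines wide.

```cpp
#include <cassert>
#include <cstddef>

// Recursive template: each N is a separate instantiation. If the
// compiler doesn't collapse them, dot<1024> drags in ~1024 function
// bodies' worth of code.
template <std::size_t N>
double dot(const double* a, const double* b) {
    return a[0] * b[0] + dot<N - 1>(a + 1, b + 1);
}

template <>
double dot<0>(const double*, const double*) { return 0.0; }

// "Old style": one compact loop, a handful of cache lines of code,
// regardless of n.
double dot_loop(const double* a, const double* b, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}
```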



I thought everyone stripped out debug info and shoved it into a separate lzma-compressed file these days...

If you use C++11/C++14 constexpr (and don't screw it up ;-), you really can be assured that the computations will complete at compile time, at which point there shouldn't be very much left for the compiler to deal with at run time. I guess in theory you could have i-cache issues and such, but in practice I've yet to see that happen.
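For reference, a tiny example of what "completes at compile time" means here: when the call appears in a constant expression, the whole recursion folds to a literal, so nothing of it survives into the binary.

```cpp
// C++11 constexpr: a single-return recursive function. Used in a
// constant expression, the compiler evaluates it entirely at
// compile time; only the resulting constant reaches the object file.
constexpr unsigned long long factorial(unsigned n) {
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}

// The static_assert proves the evaluation happened at compile time.
static_assert(factorial(10) == 3628800ULL, "folded at compile time");
```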


Even a not-very-complex program written in modern C++ is going to use a lot of map, set, vector, unordered_map, and iterator instantiations for different types. Even if some of them can be deduplicated by a smart-enough linker, that's still a significant amount of code, added for little or no gain (most code is not on the critical path).


Does the program need to do the map, set, vector, iterator, etc. operations or does it not? If the work needs to be done then it needs to be instantiated in the code somehow. What is your proposed alternative?


To me, the parent seemed to be complaining about multiple copies of effectively the same map (or set, or vector, ...) code for different types, not the presence of the map (or ...) code in the first place. The latter is obviously unavoidable. The former is an efficiency trade-off between the optimization opportunities afforded by specialization and inlining, and the smaller instruction footprint of keeping the code generic.


So then, is the proposed alternative a library of containers and algorithms that are genericized via void pointers, memcpy sizes and function pointer predicates?
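That C-style approach, for reference, is exactly what the classic qsort interface looks like (this example is mine, just to make the trade-off concrete): one shared sort body, genericized through a void pointer, an element size, and a comparison callback, with no per-type code generated and no type safety.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// One comparison callback per element type; the sort body itself is
// shared across all types. The cost: the compiler can't check that
// cmp_int is only ever used on arrays of int.
static int cmp_int(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

void sort_ints(int* data, std::size_t n) {
    // std::qsort: void pointer + element size + function pointer.
    std::qsort(data, n, sizeof(int), cmp_int);
}
```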


I'd assume so. In principle you can get there without sacrificing type safety (http://en.wikipedia.org/wiki/Type_erasure). In practice, I don't know whether or not you can write type safe C++ that compiles that way.
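For what it's worth, the standard library already ships one well-known type-erased wrapper: std::function stores any callable with a matching signature behind a single concrete type, which is the basic shape of the technique (example mine).

```cpp
#include <functional>
#include <string>

// A plain function...
std::string greet(const std::string& name) { return "hello, " + name; }

// ...stored behind a type-erased wrapper. Lambdas, functors, and
// function pointers with this signature all fit in the same f,
// so code using f is compiled once, not once per callable type.
std::function<std::string(const std::string&)> f = greet;
```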


In truth, if you really care about it, you can take a policy-based design approach to this and have explicit rules for when you may or may not do type erasure. It's definitely a bit of work, but you can actually get full compile-time type safety and still avoid duplicative code with various tricks like the following (this skips over a lot of the pain and isn't really how you'd do it... just illustrative):

    template <std::size_t N>
    struct raw {
        template <typename U>
        constexpr raw(U x, typename std::enable_if<.....>::type* = nullptr);

        template <typename U>
        constexpr operator U() const;
    };

    template <template <typename> class ContainerType, typename T,
              typename = typename std::enable_if<std::is_trivially_copyable<T>::value>::type>
    struct somewhat_erased_container : private ContainerType<raw<sizeof(T)>> {
    };
It's particularly easy to do with containers of pointers (of course).
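To show why the pointer case is the easy one (sketch is mine, names hypothetical): every T* has the same representation as void*, so a thin typed facade can forward to a single std::vector<void*> instantiation, and the per-type code reduces to casts that inline away.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Every ptr_vector<T> shares the one std::vector<void*> instantiation;
// the only per-T code is trivial casts, which compile to nothing.
// Type safety is preserved at the interface.
template <typename T>
class ptr_vector {
    std::vector<void*> impl_;  // shared across all T
public:
    void push_back(T* p) { impl_.push_back(p); }
    T* operator[](std::size_t i) const {
        return static_cast<T*>(impl_[i]);
    }
    std::size_t size() const { return impl_.size(); }
};
```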


Very nice.


Herb Sutter has promoted a variant of the Pimpl idiom that goes in this direction, as have others. Linus was proudly showing how C was better than C++ by virtue of being able to do something much like this, but of course you can accomplish it in C++ too, though there are some downsides that accompany it. Most importantly, with C++ you can much more easily provide robust compile-time type safety with little or no runtime overhead.


This seems like a nice article series on the subject http://akrzemi1.wordpress.com/2013/11/18/type-erasure-part-i...

Just getting started on it, it seems that the theme is that you can't have the same binary code work on different types without having a run-time indirection in there somewhere. But, you can use templates to keep that indirection convenient and type safe. And, you can structure your templates to only specialize a minimal amount of code necessary to interface your type into a non-specialized algorithm.
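A compressed sketch of that last structure (my own example, not from the article): one non-template core that walks opaque elements through a run-time indirection, plus a thin template trampoline that is the only code specialized per type.

```cpp
#include <cassert>
#include <cstddef>

// Non-specialized core: compiled exactly once, works on any element
// type via an opaque pointer, an element size, and a callback.
std::size_t count_if_erased(const void* data, std::size_t n,
                            std::size_t elem_size,
                            bool (*pred)(const void*)) {
    std::size_t count = 0;
    const char* p = static_cast<const char*>(data);
    for (std::size_t i = 0; i < n; ++i, p += elem_size)
        if (pred(p)) ++count;
    return count;
}

// Thin, type-safe shim: the only per-type code is this trampoline,
// a capture-less lambda that decays to a plain function pointer.
template <typename T, bool (*Pred)(const T&)>
std::size_t count_if(const T* data, std::size_t n) {
    return count_if_erased(
        data, n, sizeof(T),
        [](const void* p) { return Pred(*static_cast<const T*>(p)); });
}

bool is_even(const int& x) { return x % 2 == 0; }
```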


Right, that looks like an exploration of what I was talking about (and at a skim, looks solid).

"you can't have the same binary code work on different types without having a run-time indirection in there somewhere"

That's certainly the case, but if you're passing by reference you already have an indirection, so that may or may not mean additional run-time overhead.


If the code isn't critical path, it really doesn't matter too much whether it was for much gain or not, no? ;-)


Yeah, but that code still sits in memory, occupies space and takes non-zero time to load. It also takes free memory from the fragments of code that are on the critical path. Sometimes it doesn't matter (most desktop apps), sometimes it does (embedded software or system-level programming). Also, RAM usage of quite a lot of kinds of software is determined mostly by code size, not data size (e.g. word processors, spreadsheets, CAD software, IDEs, compilers) and startup-time is also determined typically by loading and dynamic linking time.


For most of the common template instantiations, the standard runtime's shared library provides you with deduped versions that are likely already in memory, and most linkers are smart enough to make that dynamic linking very efficient.
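The user-visible knob for this, for reference, is C++11's extern template (a sketch with a hypothetical `box` type; libstdc++ uses the same mechanism for common std::basic_string instantiations): a declaration suppresses implicit instantiation in the current translation unit, deferring to the one explicit instantiation the library provides.

```cpp
#include <cassert>

template <typename T>
struct box {
    T value;
    T get() const { return value; }
};

// Normally this would live in a header: "don't instantiate box<int>
// here; one definition exists in some other TU / shared library".
extern template struct box<int>;

// And this would live in exactly one .cpp file (or the runtime),
// providing the single shared instantiation.
template struct box<int>;
```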

True, for embedded systems this can still be a painful waste... But I'd argue if that is the case, you probably should be very explicit about using/not using STL libraries.


Don't most operating systems memory-map executables anyway? As long as the code is never paged in, it shouldn't make any difference.


Agreed, but if the code is used even once, it gets loaded into memory. So if you've got a complex_generic_container<T1, T2>, its code gets loaded for every combination of types used.


It gets loaded and then most likely paged out, particularly if your linker is profile guided (but even if not... rarely used code tends to get linked with other rarely used code).

It's fair that if you want to optimize startup times, that first page-in hurts, although generally with C++ it's more about the symbol-linking overhead than anything else (and there are a lot of strategies for minimizing that, but it sure isn't what happens by default).


There is still link time if the types are shared across shared libraries (which happens a lot).


There is also cache to consider.


There is only cache to consider! ;-)

Other than cache, there are plenty of tricks that can cure a lot of other ills, but cache is precious.


If you are paging, fixing that is a win on the order of fitting things in cache.


Sure, but that's really just another form of "cache". Before you hit that, you generally get TLB stress on top of the cache-line stress, so you're already hosed.

I have to say though, I've not seen too many cases where paging was caused by bloated object code size (lots of cases where it was caused by brutal bloat due to just outright bad code).

There was a time a couple of decades ago when it was a common issue with C++, but since then RAM has gotten a lot cheaper and C++ compilers/linkers have become much smarter (not to mention all compilers/linkers getting much smarter about code size). Sure, you can find pain points, but with a modicum of effort it's really hard to imagine a project thrashing outside its working set purely because the compiler/linker refuses to drink the water it has been led to...


Well, the original context seemed to more or less be someone saying "You're not really going to be paging anymore..." I suppose a more precise response on my part would have been "there are also other caches to consider", but meaning "processor caches" when saying "cache" with no other modifiers is fairly typical in my experience.

Upvoted anyway for precision.


I will concede that my "there is _only_ cache to consider" statement was largely tongue in cheek.

In all seriousness though, swapping due to code bloat (as opposed to actual runtime bloat)... I haven't seen that in ages. Caches though... like a samurai with an absurdly sharp blade... they kill you almost every time, and usually you don't even know you're dead.





