Concepts vs SFINAE: Compilation time comparison

I first saw this as a question on Reddit and found it very interesting. It’s something I had never asked myself; I always assumed that Concepts are faster by default. After all, Concepts are a new addition to the language, while SFINAE implementations are more or less hacks.

By the way, if you don’t know what Concepts are, I’ve mentioned them here and I plan on writing a post about them in the future, so don’t worry, I got you.

A gentleman who goes by the name jwakely on Reddit provided a pretty nice answer in which he explains why Concepts should actually be faster than SFINAE when it comes to compile time.

However, besides the explanation, which I find simple to understand and easy to agree with, I think it’s nice to also have some data and see real numbers. That’s why I did a little research of my own and created some charts.

Let’s first see what the gentleman mentioned above had to say about this, and then we will look at some data.

Why concepts should be faster

Basically, the explanation focused on three points:

  • With SFINAE, if you use logical operators (e.g. &&, ||), all the operands have to be instantiated, while with Concepts that is not the case
enable_if_t<is_foo_v<T> && is_bar_v<T>>; // both is_foo_v<T> and is_bar_v<T> have to be instantiated even if the first is false

// Concepts
requires is_foo_v<T> && is_bar_v<T>; // the operation is short circuited - if the first is false, second is not instantiated
  • If you try to short-circuit the SFINAE version with std::conjunction (in this case), the compiler also has to instantiate conjunction itself, which means more work
enable_if_t<conjunction_v<is_foo<T>, is_bar<T>>>; // compiler has to instantiate a specialization of enable_if, conjunction, and is_foo
  • With SFINAE, if you put a constraint on one member of an overload set, you also need to put the negated constraint on the other members. With Concepts, that is not the case.
template<typename T>
enable_if_t<conjunction_v<is_foo<T>, is_bar<T>>>
foo() { }

template<typename T>
enable_if_t< ! conjunction_v<is_foo<T>, is_bar<T>>>
foo() { }

// Concepts
template<typename T>
requires is_foo_v<T> && is_bar_v<T>
void foo() { }

// less work for the compiler since there's no constraint
template<typename T>
void foo() { }

Pretty solid explanation and easy to understand.

But now, let’s see some numbers :).

How I got the data

To actually get some numbers for this, I used a library called metabench, which you can find here, together with my own example.

The configuration used is as follows.

# CMakeLists.txt
cmake_minimum_required(VERSION 3.22)



list(APPEND CMAKE_MODULE_PATH "/home/work/metabench")

# Add new datasets
# will generate n from [0] to (1000) with a step of 100 - we will have values 0,100,200,300...900 for each example(sfinae and concepts)
metabench_add_dataset(sfinae "sfinae.cpp.erb" "(0...1000).step(100)")
metabench_add_dataset(concepts "concepts.cpp.erb" "(0...1000).step(100)")

# Add a new chart
# This will generate the chart (with proper html file)
metabench_add_chart(chart DATASETS sfinae concepts)

Now for the implementation.

// SFINAE
template <class T, std::enable_if_t<std::is_arithmetic_v<T> && std::is_integral_v<T>, bool> = true>
void doSomething(T var) { }

template <class T, std::enable_if_t<!std::is_arithmetic_v<T> || !std::is_integral_v<T>, bool> = true>
void doSomething(T var) { }

// Concepts
template <class T>
    requires std::is_arithmetic_v<T> && std::is_integral_v<T>
void doSomething(T var) { }

template <class T>
void doSomething(T var) { }

For the actual “test cases” in main, I took note of one observation made on metabench’s page:

A good technique to make sure the results of a benchmark are not inside the noise is to reduce the relative uncertainty of the measurement. This can be done by increasing the total compilation time of the measured block, by repeating the same thing (or a similar one) multiple times.

So I used the following test case in main, for both the Concepts and the SFINAE version.

int main() {
    #if defined(METABENCH)
    <% (0..n).each do |i| %>
        // five distinct types per iteration
        struct A<%= i %> {};
        struct B<%= i %> {};
        struct C<%= i %> {};
        struct D<%= i %> {};
        struct E<%= i %> {};

        // one object of each type
        A<%= i %> a<%= i %>;
        B<%= i %> b<%= i %>;
        C<%= i %> c<%= i %>;
        D<%= i %> d<%= i %>;
        E<%= i %> e<%= i %>;

        // one call per object, each forcing a new template instantiation
        doSomething(a<%= i %>);
        doSomething(b<%= i %>);
        doSomething(c<%= i %>);
        doSomething(d<%= i %>);
        doSomething(e<%= i %>);
    <% end %>
    #endif
}

Basically, what the above does is: first, define data types (structs), then create objects from these data types, then call the template functions with these objects.

For example, for n=0, this snippet will generate:

int main() {
    // data types
    struct A0 {};
    struct B0 {};
    struct C0 {};
    struct D0 {};
    struct E0 {};

    // objects
    A0 a0;
    B0 b0;
    C0 c0;
    D0 d0;
    E0 e0;

    // calls in order to instantiate templates
    doSomething(a0);
    doSomething(b0);
    doSomething(c0);
    doSomething(d0);
    doSomething(e0);
}

So, in this case, we have 5 data types (A0…E0), 5 objects (a0…e0) and 5 calls in order to instantiate the templates.

However, for n=100, the loop runs 101 times, because the Ruby range (0..n) includes both ends. That gives 505 data types, 505 objects and 505 calls in order to instantiate the templates.

The actual results

So here are the results from the above “experiment”.

As you can see, with more types, the difference between Concepts and SFINAE compilation times gets bigger.

Final thoughts

I think the results make it pretty clear that, at least for this benchmark, Concepts are faster than SFINAE when it comes to compile time.

It was clear enough from the explanation but now you also have some data to back it up.

One thing to take from this is that if you have the possibility to use Concepts, just use them. Don’t get tangled up in SFINAE shenanigans.
