Run time improvements

What is the run time?

Well, that is pretty simple. The run time is the span of time from when an action starts until that action ends, while your application is running.
It is closely related to CPU time, which is how long the CPU takes to perform the action, although the elapsed time also includes any waiting, for example on disk or network I/O.
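
To make this concrete, here is a minimal sketch of how the run time of a single action could be measured with std::chrono. The names measureMilliseconds and slowAction are hypothetical helpers invented for this illustration, and the 20 ms sleep only stands in for a slow action such as writing a log line. Note that steady_clock measures elapsed (wall-clock) time, not pure CPU time.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: runs `action` once and returns the elapsed
// wall-clock time in milliseconds.
template <typename Action>
long long measureMilliseconds(Action action)
{
    const auto start = std::chrono::steady_clock::now();
    action();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}

// Hypothetical stand-in for a slow action, e.g. writing one log line.
void slowAction()
{
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
}
```

Calling measureMilliseconds(slowAction) should report roughly 20 ms or slightly more. Measuring before and after a change is the simplest way to confirm that an optimization actually helped.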

Let’s say that you have some logging to do in your application.
You write LOG_DEBUG << "Print this log text into a log file.";. This is an action that happens at run time but, since it is only a debug action, you do not want it to take much time.
What do you think will happen if printing the log takes, let’s say, 1 second (which is a huge amount of time for printing a log)? You lose 1 second for every log of this kind.
This slows down other actions that matter to the user. If the user logs in to your application and you print 60 such logs along the way, the user waits 1 minute longer than necessary, just because each of those log calls takes 1 second.

You need to optimize the run time; in this case, the run time of the logging process.

Here are some good practices to optimize the run time of your code.

Pass-by-reference

Consider passing objects by reference (ideally by const reference) whenever the function does not need its own copy. This reduces the number of temporary objects and therefore the CPU time needed.
NOTE: Do not pass primitive types by reference. A reference is typically implemented as a pointer, which is at least as large as most primitive types and adds an indirection, so you gain no run time in this case.
You can always see the difference by checking the assembly that the compiler generates.
Let’s take an example:

#include <iostream>
#include <string>

struct MyStruct
{
    std::string myString;
};

void getMyStruct(MyStruct str)
{
}

int main()
{
    MyStruct str;
    getMyStruct(str);
}

You can check the assembly generated from this snippet of code using Compiler Explorer (godbolt.org), a handy online tool that supports GCC among other compilers.
Compiled with GCC for x86-64 without optimizations, you will get output like this for the call getMyStruct(str); in main:

        lea     rdx, [rbp-80]
        lea     rax, [rbp-48]
        mov     rsi, rdx
        mov     rdi, rax
        call    MyStruct::MyStruct(MyStruct const&)
        lea     rax, [rbp-48]
        mov     rdi, rax
        call    getMyStruct(MyStruct)
        lea     rax, [rbp-48]
        mov     rdi, rax
        call    MyStruct::~MyStruct() [complete object destructor]

As you can see, the copy constructor is called before the “getMyStruct(MyStruct)” function call happens and the destructor of the copy is called right after the function returns.
Now, if we change the “getMyStruct(MyStruct)” signature to void getMyStruct(const MyStruct&), this output is shown by godbolt:

        lea     rax, [rbp-32]
        mov     rdi, rax
        call    getMyStruct(MyStruct const&)

No copy construction happens any more. That means that the CPU has less work to do and our program is faster.
This also lowers the memory usage of the application, since the copy is never created.
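
If you would rather not read assembly, the same effect can be observed from inside the program. The sketch below is an illustration only: CountedStruct, takeByValue, takeByConstRef and the two copiesMade... helpers are hypothetical names introduced here, with a copy constructor that increments a counter so the copies become visible.

```cpp
#include <string>

// Hypothetical variant of MyStruct that counts copy constructions.
struct CountedStruct
{
    static int copies;
    std::string myString;

    CountedStruct() = default;
    CountedStruct(const CountedStruct& other) : myString(other.myString) { ++copies; }
};

int CountedStruct::copies = 0;

// Takes its argument by value: the caller's object is copied.
void takeByValue(CountedStruct s) { (void)s; }

// Takes its argument by const reference: no copy is made.
void takeByConstRef(const CountedStruct& s) { (void)s; }

// Returns the number of copies made by one pass-by-value call.
int copiesMadeByValue()
{
    CountedStruct::copies = 0;
    CountedStruct str;
    takeByValue(str);
    return CountedStruct::copies;
}

// Returns the number of copies made by one pass-by-const-reference call.
int copiesMadeByConstRef()
{
    CountedStruct::copies = 0;
    CountedStruct str;
    takeByConstRef(str);
    return CountedStruct::copies;
}
```

Here copiesMadeByValue() reports one copy and copiesMadeByConstRef() reports none, which matches the copy constructor call that appears and disappears in the assembly above.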

Manage short circuit evaluations correctly

There are two short circuit operators in C++: && and ||, which can also be spelled and and or.
Short circuit evaluation means that the second operand is evaluated only if the first does not suffice to determine the value of the expression.
Take this for example:

bool isTrue()
{
	return true;
}

bool isFalse()
{
	return false;
}
int main()
{
    auto myBool = isFalse() and isTrue();
    return 0;
}

In this example, only the isFalse() function is called (or evaluated); isTrue() is never called. This is because of boolean logic in C++ and other languages: false and anything is still false, so there is no need to check anything after the false.

Now, to get to my point. If you have to evaluate, let’s say, two functions and you need both to be true, then you should put the one that does less processing first in the evaluation. Another approach is to put first the one that is less likely to be true.
For example, if you have a function “doProcessing()” that takes around 4s to evaluate and another function “doLessProcessing()” that takes 1s to evaluate, your condition should be doLessProcessing() and doProcessing(), not the other way around.
In this way, if the first one is false, the second, more expensive one won’t be evaluated.
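
The effect of the ordering can be sketched with two call counters. The counters cheapCalls and expensiveCalls and the wrappers cheapFirst and expensiveFirst are hypothetical names added for this illustration; the function bodies are placeholders, not real 1s and 4s workloads.

```cpp
// Hypothetical call counters, used only to observe which functions ran.
int cheapCalls = 0;
int expensiveCalls = 0;

// Stand-in for the cheap check; returns false so it can short-circuit &&.
bool doLessProcessing()
{
    ++cheapCalls;
    return false;
}

// Stand-in for the expensive check.
bool doProcessing()
{
    ++expensiveCalls;
    return true;
}

// Cheap check first: when it is false, doProcessing() is skipped entirely.
bool cheapFirst()
{
    return doLessProcessing() && doProcessing();
}

// Expensive check first: doProcessing() always runs, even though the
// overall result ends up false anyway.
bool expensiveFirst()
{
    return doProcessing() && doLessProcessing();
}
```

Both wrappers return false, but only expensiveFirst() pays for a call to doProcessing(). After one call to cheapFirst(), expensiveCalls is still 0.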

Compile time evaluations

Move whatever evaluation you can to compile time.

The constexpr keyword was introduced in C++11. Use constexpr whenever possible: on a variable, it requires the initializer to be evaluated at compile time and also implies the const qualifier; on a function, it allows calls with constant arguments to be evaluated at compile time.

Instead of declaring

const int MyGlobalInt = 5;

you should declare

constexpr int MyGlobalInt = 5;

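With constexpr, the compiler is required to evaluate the initializer at compile time and rejects the code if it cannot. For a literal such as 5, const gives the same result, but with a more complex initializer a const variable may silently be initialized at run time instead.

In general, you should look for ways to move run time evaluations to compile time.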
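
As an illustration of moving a real computation to compile time, here is a small sketch with a constexpr function. The names factorial and Factorial10 are hypothetical examples, not taken from any library; the function uses a single return statement so that it stays valid under the stricter C++11 rules.

```cpp
// Hypothetical example function; a single return statement keeps it
// valid as a constexpr function in C++11.
constexpr unsigned long long factorial(unsigned int n)
{
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}

// Checked by the compiler: the program fails to build if this does not hold.
static_assert(factorial(5) == 120, "factorial(5) must be 120");

// A constexpr variable requires its initializer to be evaluated at compile time.
constexpr unsigned long long Factorial10 = factorial(10);
```

Because Factorial10 is constexpr, no factorial computation is needed for it when the program runs. Calling factorial with a value known only at run time still works; it is then evaluated like a normal function.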

Inline functions

Consider using inline functions for small functions that have to be evaluated at run time. The inline keyword suggests to the compiler that, wherever the inline function is called, it should insert the code from the function body there instead of calling the actual function. That removes the function call overhead, which speeds up the evaluation.

For example, take a simple sum function.

int sum(int x, int y)
{
    return x + y;
}

Better to have:

inline int sum(int x, int y)
{
    return x + y;
}

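Please note that the inline keyword is just a suggestion to the compiler, so the function may not always actually be inlined. Compilers may also inline functions that are not marked inline when optimizations are enabled.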
