C++ so far (unfortunately) doesn't support a finally clause for a try statement. This leads to speculation about how to release resources. After studying the question on the internet, I found some solutions, but their performance wasn't clear to me (and I would use Java if performance didn't matter that much). So I had to benchmark.
The options are:
1. Functor-based finally class proposed at CodeProject. It's powerful, but slow. And the disassembly suggests that the outer function's local variables are captured very inefficiently: pushed to the stack one by one, rather than passing just the frame pointer to the inner (lambda) function.
2. RAII: a manual cleaner object on the stack. The disadvantage is the manual typing and tailoring it for each place it is used. Another disadvantage is the need to copy into it all the variables needed for resource release.
3. MSVC++-specific __try/__finally statement. The disadvantage is that it's obviously not portable.
I created this small benchmark to compare the runtime performance of these approaches:
#include <chrono>
#include <cstdint>    // int64_t
#include <cstdio>     // printf
#include <functional> // std::function
#include <utility>    // std::forward
class Finally1 {
std::function<void(void)> _functor;
public:
Finally1(const std::function<void(void)> &functor) : _functor(functor) {}
~Finally1() {
_functor();
}
};
void BenchmarkFunctor() {
volatile int64_t var = 0;
const int64_t nIterations = 234567890;
auto start = std::chrono::high_resolution_clock::now();
for (int64_t i = 0; i < nIterations; i++) {
Finally1 doFinally([&] {
var++;
});
}
auto elapsed = std::chrono::high_resolution_clock::now() - start;
double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
printf("Functor: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}
void BenchmarkObject() {
volatile int64_t var = 0;
const int64_t nIterations = 234567890;
auto start = std::chrono::high_resolution_clock::now();
for (int64_t i = 0; i < nIterations; i++) {
class Cleaner {
volatile int64_t* _pVar;
public:
Cleaner(volatile int64_t& var) : _pVar(&var) { }
~Cleaner() { (*_pVar)++; }
} c(var);
}
auto elapsed = std::chrono::high_resolution_clock::now() - start;
double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
printf("Object: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}
void BenchmarkMSVCpp() {
volatile int64_t var = 0;
const int64_t nIterations = 234567890;
auto start = std::chrono::high_resolution_clock::now();
for (int64_t i = 0; i < nIterations; i++) {
__try {
}
__finally {
var++;
}
}
auto elapsed = std::chrono::high_resolution_clock::now() - start;
double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
printf("__finally: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}
template <typename Func> class Finally4 {
Func f;
public:
Finally4(Func&& func) : f(std::forward<Func>(func)) {}
~Finally4() { f(); }
};
template <typename F> Finally4<F> MakeFinally4(F&& f) {
return Finally4<F>(std::forward<F>(f));
}
void BenchmarkTemplate() {
volatile int64_t var = 0;
const int64_t nIterations = 234567890;
auto start = std::chrono::high_resolution_clock::now();
for (int64_t i = 0; i < nIterations; i++) {
auto doFinally = MakeFinally4([&] { var++; });
//Finally4 doFinally{ [&] { var++; } };
}
auto elapsed = std::chrono::high_resolution_clock::now() - start;
double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
printf("Template: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}
void BenchmarkEmpty() {
volatile int64_t var = 0;
const int64_t nIterations = 234567890;
auto start = std::chrono::high_resolution_clock::now();
for (int64_t i = 0; i < nIterations; i++) {
var++;
}
auto elapsed = std::chrono::high_resolution_clock::now() - start;
double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
printf("Empty: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}
int __cdecl main() {
BenchmarkFunctor();
BenchmarkObject();
BenchmarkMSVCpp();
BenchmarkTemplate();
BenchmarkEmpty();
return 0;
}
The results on my Ryzen 1800X @ 3.9 GHz with DDR4 @ 2.6 GHz CL13 were:
Functor: 175148825.946 Ops/sec, var=234567890
Object: 553446751.181 Ops/sec, var=234567890
__finally: 553832236.221 Ops/sec, var=234567890
Template: 554964345.876 Ops/sec, var=234567890
Empty: 554468478.903 Ops/sec, var=234567890
Apparently, all the options except the functor-based one (#1) are as fast as an empty loop.
So is there a fast and powerful C++ alternative to finally, which is portable and requires minimum copying from the stack of the outer function?
UPDATE: I've benchmarked @Jarod42's solution, so the question now contains the updated code and output. Though, as mentioned by @Sopel, it may break if copy elision is not performed.
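To illustrate the copy elision concern: if the guard returned from a factory function is copied or moved instead of elided, the temporary's destructor would run the cleanup as well. One possible way around that, sketched below, is to make the guard move-only and disarm the moved-from object. The names ScopeGuard, MakeScopeGuard and CountCleanupsAfterMove are made up for this example and do not come from the answers; this variant was not part of the benchmark above, so no claim is made here about its speed.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical move-only guard (name invented for this sketch).
template <typename Func>
class ScopeGuard {
    Func f_;
    bool active_ = true;
public:
    explicit ScopeGuard(Func f) : f_(std::move(f)) {}

    // Moving transfers responsibility for the cleanup: the moved-from
    // guard is disarmed, so the cleanup runs exactly once even when the
    // move is really performed instead of being elided.
    ScopeGuard(ScopeGuard&& other)
        : f_(std::move(other.f_)), active_(other.active_) {
        other.active_ = false;
    }

    // Copying would duplicate the cleanup, so it is forbidden.
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
    ScopeGuard& operator=(ScopeGuard&&) = delete;

    ~ScopeGuard() { if (active_) f_(); }
};

// Factory that stores the callable by value (decayed type).
template <typename F>
ScopeGuard<typename std::decay<F>::type> MakeScopeGuard(F&& f) {
    return ScopeGuard<typename std::decay<F>::type>(std::forward<F>(f));
}

// Returns how many times the cleanup ran after an explicit move.
int CountCleanupsAfterMove() {
    int calls = 0;
    {
        auto first = MakeScopeGuard([&calls] { ++calls; });
        auto second = std::move(first);  // a real move construction
        (void)second;
    }
    return calls;
}
```

Calling CountCleanupsAfterMove() from main() should report a single cleanup, because only the guard that currently owns the action fires. The price is one extra bool per guard and a branch in the destructor.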
UPDATE2: To clarify, what I'm asking for is a convenient, fast way in C++ to execute a block of code even if an exception is thrown. For the reasons mentioned in the question, some ways are slow or inconvenient.
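As a minimal sketch of the behaviour being asked for, the following shows a destructor-based guard running its block both on normal exit and during stack unwinding. The class name Finally and the function RunWithCleanup are hypothetical names chosen for this illustration, and the code assumes the cleanup itself does not throw.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical guard: runs the stored callable when it goes out of scope.
template <typename Func>
class Finally {
    Func f_;
public:
    explicit Finally(Func f) : f_(std::move(f)) {}
    Finally(const Finally&) = delete;
    Finally& operator=(const Finally&) = delete;
    ~Finally() { f_(); }  // assumption: f_ does not throw
};

// Records the order of events, with or without an exception.
std::string RunWithCleanup(bool fail) {
    std::string log;
    try {
        auto cleanup = [&log] { log += "cleanup;"; };
        Finally<decltype(cleanup)> guard(cleanup);
        log += "work;";
        if (fail) throw std::runtime_error("boom");
        log += "done;";
    } catch (const std::runtime_error&) {
        // The guard has already been destroyed by stack unwinding here.
        log += "caught;";
    }
    return log;
}
```

RunWithCleanup(false) yields "work;done;cleanup;" and RunWithCleanup(true) yields "work;cleanup;caught;". In the failing case the cleanup runs before the handler, which differs from the usual order of catch and finally in Java.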
[...] BenchmarkObject(). I've listed its disadvantages: mainly that it takes substantial memory on the stack and requires copying from the stack of the outer function. – Witchery
[...] finally clause could be that exceptions in C++ are expensive when thrown, and therefore should only be used for truly exceptional cases. That of course leads to try-catch blocks being uncommon, and mostly used to do some error reporting and then rethrowing the exception so the application terminates. Which means there's really no use for a finally clause. This is unlike other languages where exceptions are the normal error-handling function. – Laforge
[...] finally". – Goldy
[...] finally RAII but make all the types you use RAII. Like, let's say you're using int* foo = new int[some_num]; int* bar = new int[some_num]; we can replace that with std::unique_ptr<int[]> and if an exception is raised they will be cleaned up automatically. You don't have to do anything. All it costs is a destructor, which oftentimes is minimal if anything. – Idalia
[...] finally, as in { SomeType a; try { ... } catch { ... } /* automatic "finally" of variable a */ } – Laforge
Cleaner does not need here a lot of variables from the outer function. But in practice it needs several: at least the array pointer and the number of objects in the array, so as to call their destructors explicitly before returning the memory as void* to a memory pool. And I need multiple cleaners for multiple resources allocated at different stages within a function. – Witchery
[...] finally handles really resource, some might restore state, and creating an RAII class for each such case would just repeat a pattern which can be factorized by finally. – Reduction
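The memory pool case raised in the comments (array pointer plus element count, destructors called explicitly, memory returned as void*) could perhaps be folded into a single reusable RAII type instead of one cleaner per call site. The sketch below is only an illustration under assumptions: PoolAllocate and PoolRelease are stand-ins that wrap malloc/free, since the real pool API is not shown in the question, and PoolArrayDeleter, PoolArray, MakePoolArray, Tracked and AliveAfterThrow are all invented names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Stand-ins for a real memory pool (assumption: they just wrap malloc/free).
inline void* PoolAllocate(std::size_t bytes) { return std::malloc(bytes); }
inline void PoolRelease(void* p) { std::free(p); }

// Test type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Deleter that carries the element count, destroys the elements in
// reverse order and hands the raw memory back to the pool.
template <typename T>
struct PoolArrayDeleter {
    std::size_t count;
    void operator()(T* p) const {
        for (std::size_t i = count; i > 0; --i) p[i - 1].~T();
        PoolRelease(p);
    }
};

template <typename T>
using PoolArray = std::unique_ptr<T, PoolArrayDeleter<T>>;

// Allocates from the pool and default-constructs `count` elements.
template <typename T>
PoolArray<T> MakePoolArray(std::size_t count) {
    void* raw = PoolAllocate(count * sizeof(T));
    if (!raw) throw std::bad_alloc();
    T* first = static_cast<T*>(raw);
    std::size_t built = 0;
    try {
        for (; built < count; ++built) new (first + built) T();
    } catch (...) {
        // Roll back the elements constructed so far, then rethrow.
        while (built > 0) first[--built].~T();
        PoolRelease(raw);
        throw;
    }
    return PoolArray<T>(first, PoolArrayDeleter<T>{count});
}

// Two resources acquired at different stages, then an exception.
int AliveAfterThrow() {
    try {
        auto a = MakePoolArray<Tracked>(3);
        auto b = MakePoolArray<Tracked>(2);
        throw 1;
    } catch (int) {
    }
    return Tracked::alive;  // every element has been destroyed by now
}
```

With this shape the pointer and the count travel together inside the unique_ptr, so nothing has to be copied into a hand-written cleaner at each stage. Whether the generated code matches the benchmarked options would still have to be measured.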