I am trying to implement two classes, A and B, which store their data in a std::unique_ptr<int[]>. An A can be transformed into a B with some calculation. Both classes are shown below. The constructor of class B takes an A object as its input parameter. To measure the run-time performance, I use the std::chrono library and added a function "print_construction_time" to class B to show the construction time.
CPU: Intel® Core™ i7-6700HQ 2.6GHz
RAM: 16GB
OS: Windows 10 1909
IDE: Microsoft Visual Studio Community 2019 Version 16.4.4
c/c++ optimization setting: Maximum Optimization (Favor Speed) (/O2)
class A
{
public:
    A(int input_size, int input_value) // constructor
    {
        this->data = std::make_unique<int[]>(input_size);
        this->size = input_size;
        for (int loop_number = 0; loop_number < size; loop_number++) {
            data[loop_number] = input_value;
        }
    }
    std::unique_ptr<int[]> get_data()
    {
        // deep copy
        auto return_data = std::make_unique<int[]>(size);
        for (int loop_number = 0; loop_number < size; loop_number++) {
            return_data[loop_number] = data[loop_number];
        }
        return return_data;
    }
    int get_size()
    {
        return this->size;
    }
private:
    int size;
    std::unique_ptr<int[]> data;
};
class B
{
public:
    B(A &input_object) // constructor
    {
        this->size = input_object.get_size();
        this->data = std::make_unique<int[]>(this->size);
        this->start = std::chrono::high_resolution_clock::now();
        // version 1
        for (int loop_number = 0; loop_number < input_object.get_size(); loop_number++) {
            this->data[loop_number] = transform_from_A(input_object.get_data()[loop_number]);
        }
        this->stop = std::chrono::high_resolution_clock::now(); // for execution time measurement
    }
    std::unique_ptr<int[]> get_data()
    {
        // deep copy
        auto return_data = std::make_unique<int[]>(size);
        for (int loop_number = 0; loop_number < size; loop_number++) {
            return_data[loop_number] = data[loop_number];
        }
        return return_data;
    }
    void print_construction_time()
    {
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
        std::cout << "Duration: " << duration.count() << "microseconds" << std::endl;
    }
private:
    int size;
    std::unique_ptr<int[]> data;
    std::chrono::time_point<std::chrono::steady_clock> start, stop;
    int transform_from_A(int input_value)
    {
        return input_value + 1; // For example
    }
};
The main function is here.
int main()
{
    A a_object(10000, 6);
    B b_object(a_object);
    b_object.print_construction_time();
    return 0;
}
For testing, I ran this code three times; the execution times were 123407us, 112033us and 107586us. Next, I changed the for loop in the constructor of class B to version 2, which caches the data returned by get_data() before the loop.
// version 2 <= tremendously faster than version 1
auto data_cached = input_object.get_data();
for (int loop_number = 0; loop_number < input_object.get_size(); loop_number++) {
    this->data[loop_number] = transform_from_A(data_cached[loop_number]);
}
The measured times for version 2 are 27us, 32us and 43us. I am curious why the compiler does not seem to apply this caching optimization automatically, given that both versions compute the same result, so that it has to be done by hand. I think this kind of optimization could be done either automatically by the compiler or by showing a suggested modification in the editor.
Note: I also changed the optimization setting for testing. The run-time performance is similar with Maximum Optimization (Favor Speed) (/O2) and with Optimizations (Favor Speed) (/Ox).
Feb. 27. 2020 Update
I also tried changing the return type of A::get_data() in class A by adding the const keyword to std::unique_ptr.
const std::unique_ptr<int[]> get_data() // <= Add const keyword of std::unique_ptr
{
    // deep copy
    auto return_data = std::make_unique<int[]>(size);
    for (int loop_number = 0; loop_number < size; loop_number++) {
        return_data[loop_number] = data[loop_number];
    }
    return return_data;
}
The measured execution times are similar to those above: 101486us, 100538us and 120620us for version 1, and 55us, 30us and 27us for version 2. The optimization setting is Optimizations (Favor Speed) (/Ox) and the IDE was updated to Version 16.4.5.
Moreover, I also considered the "both const" case (not only the std::unique_ptr but also its content), that is, const std::unique_ptr<const int[]>.
const std::unique_ptr<const int[]> get_data()
{
    // deep copy
    auto return_data = std::make_unique<int[]>(size);
    for (int loop_number = 0; loop_number < size; loop_number++) {
        return_data[loop_number] = data[loop_number];
    }
    return return_data;
}
The measured execution times are again similar to those above: 114754us, 127327us and 106122us for version 1, and 32us, 34us and 44us for version 2. The optimization setting is again Optimizations (Favor Speed) (/Ox).
A::get_data? – Soilure
A::get_data is needed or not. – Origan
std::chrono::time_point<std::chrono::steady_clock> start, stop; defines with steady_clock but this->start = high_resolution_clock::now(); assigns high_resolution_clock. Looks like high_resolution_clock is steady_clock in the MSVC Standard library. – Soilure