I want to filter some data (100s of MB to a few GBs). The user can change the filters so I have to re-filter the data frequently.
The code is rather simple:
std::vector<Event*> filteredEvents;
for (size_t i = 0; i < events.size(); i++) {
    const auto ev = events[i];
    for (const auto& filter : filters) {
        if (filter->evaluate(ev)) {
            filteredEvents.push_back(ev);
            break;
        }
    }
    if (i % 512 == 0) {
        updateProgress(i);
    }
}
I now want to add another filter. I can implement it either in a way that uses more CPU or in a way that uses more memory. To decide between the two, I would like to know what the bottleneck of the loop above is.
How do I profile the code to decide whether the bottleneck is the CPU or the memory?
In case it matters, the project is written in Qt and I use Qt Creator as the IDE. The platform is Windows. I am currently using Very Sleepy to profile my code.
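Besides a sampling profiler like Very Sleepy, one crude way to get a first read on CPU vs. memory is differential timing: time a stripped-down pass that only follows each Event pointer (no filter calls) and compare it against the full filter loop. If the memory-only pass already accounts for most of the time, pointer chasing dominates; if the filter pass is much slower, the evaluate() calls dominate. The sketch below assumes this idea; the Filter type name, the one-byte read per Event, and the qDebug reporting are assumptions, and the one-byte read understates the memory traffic if evaluate() touches many fields of each Event.

#include <chrono>
#include <cstdint>
#include <vector>
#include <QDebug>

// Rough diagnostic: time a memory-only pass (follow each Event* and read one
// byte) against the full filter loop. If the two times are close, the loop is
// dominated by pointer chasing (memory); if the filter pass is much slower,
// the evaluate() calls (CPU) dominate.
void compareMemoryVsFilterCost(const std::vector<Event*>& events,
                               const std::vector<Filter*>& filters)
{
    using clock = std::chrono::steady_clock;

    // Pass 1: memory access only.
    const auto t0 = clock::now();
    std::uintptr_t sink = 0;  // printed below so the loop is not optimized away
    for (Event* ev : events)
        sink += *reinterpret_cast<const unsigned char*>(ev);
    const auto t1 = clock::now();

    // Pass 2: same shape as the real loop, minus push_back and progress updates.
    std::size_t matches = 0;
    for (Event* ev : events)
        for (const auto& filter : filters)
            if (filter->evaluate(ev)) { ++matches; break; }
    const auto t2 = clock::now();

    const auto ms = [](auto d) {
        return std::chrono::duration_cast<std::chrono::milliseconds>(d).count();
    };
    qDebug() << "memory-only pass:" << ms(t1 - t0) << "ms,"
             << "filter pass:" << ms(t2 - t1) << "ms"
             << "(sink:" << sink << ", matches:" << matches << ")";
}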
1) std::vector::push_back will resize (copy) when it gets full. This can hurt if filteredEvents gets large. 2) For every element you need to follow the Event pointer. If the events are not located contiguously in memory, that is (very probably) a cache miss every time. 3) I'm guessing filter::evaluate is a virtual function? If you have more than a few filter types, your branch predictor will be very unhappy. – Rudbeckia

Use const auto &ev. You don't have to allocate new memory for this variable, unless you are working on the events from another thread, in which case you have no critical-section protection. – Gisellegish
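Putting those two comments together, a minimal sketch of the adjusted loop might look like this; the reserve() size is just the worst case (every event passes), so whether that extra pointer array is acceptable depends on your memory budget:

std::vector<Event*> filteredEvents;
filteredEvents.reserve(events.size());   // worst case: every event matches; avoids repeated reallocation/copying
for (size_t i = 0; i < events.size(); i++) {
    const auto& ev = events[i];          // reference to the stored pointer, as suggested above
    for (const auto& filter : filters) {
        if (filter->evaluate(ev)) {
            filteredEvents.push_back(ev);
            break;
        }
    }
    if (i % 512 == 0) {
        updateProgress(i);
    }
}
filteredEvents.shrink_to_fit();          // optional: return the unused capacity

The deeper suggestions, storing the events contiguously by value and avoiding a virtual evaluate() per element, are larger changes and only worth pursuing if the profile actually points at cache misses or branch mispredictions.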