I've recently written a CPU implementation of Huffman encoding in C++, along with a GPU version in CUDA so I can compare their times, but I've come across a problem when timing the CPU version:
When stress testing by compressing large files, for instance a 97 MB text file containing almost every letter of the alphabet and various other ASCII characters, my CPU implementation takes approximately 8.3 seconds the first time it runs. On subsequent runs, the time drops significantly to 1.7 seconds. NOTE: I'm only timing the CPU's frequency counting, not the encoding of the string or the writing to a file.
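In case it helps, the timed portion is structured roughly like this (a simplified sketch, not my actual code; the file path, buffer size, and variable names are placeholders):

```cpp
#include <array>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    auto start = std::chrono::steady_clock::now();

    // Read the file and count how often each byte value occurs.
    std::FILE* fp = std::fopen("input.txt", "rb");   // placeholder path
    if (!fp) return 1;

    std::array<std::uint64_t, 256> freq{};           // one counter per possible byte value
    std::vector<unsigned char> buf(1 << 20);         // 1 MiB read buffer
    std::size_t n;
    while ((n = std::fread(buf.data(), 1, buf.size(), fp)) > 0)
        for (std::size_t i = 0; i < n; ++i)
            ++freq[buf[i]];
    std::fclose(fp);

    auto end = std::chrono::steady_clock::now();
    std::printf("frequency count: %.3f s\n",
                std::chrono::duration<double>(end - start).count());
    return 0;
}
```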
Any ideas why this happens? I'm closing all file pointers, and as far as I know I'm not caching anything myself.
Let me know if any source code is needed, thanks.