I've recently written a CPU implementation of Huffman encoding in C++, along with a GPU version in CUDA so I can compare their times, but I've come across a problem when timing the CPU version:
When stress testing by compressing large files, for instance a 97 MB text file containing almost every letter of the alphabet and various other ASCII characters, my CPU implementation takes approximately 8.3 seconds the first time it runs. On subsequent runs, the time drops significantly to 1.7 seconds. NOTE: I'm only timing the CPU's frequency counting, not the encoding of the string or the writing to a file.
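In case it helps, the timed portion is structured roughly like this (a simplified sketch, not my actual code; the file path, buffer size, and variable names are placeholders):

```cpp
#include <array>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    auto start = std::chrono::steady_clock::now();

    // Read the file and count how often each byte value occurs.
    std::FILE* fp = std::fopen("input.txt", "rb");   // placeholder path
    if (!fp) return 1;

    std::array<std::uint64_t, 256> freq{};           // one counter per possible byte value
    std::vector<unsigned char> buf(1 << 20);         // 1 MiB read buffer
    std::size_t n;
    while ((n = std::fread(buf.data(), 1, buf.size(), fp)) > 0)
        for (std::size_t i = 0; i < n; ++i)
            ++freq[buf[i]];
    std::fclose(fp);

    auto end = std::chrono::steady_clock::now();
    std::printf("frequency count: %.3f s\n",
                std::chrono::duration<double>(end - start).count());
    return 0;
}
```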
Any ideas why this happens? I'm closing all file pointers, and as far as I know I'm not caching anything myself.
Let me know if any source code is needed, thanks.