I have sorted lists of filenames, each list concatenated into a single string, and I want to identify each such string by a unique checksum.
The strings are at least 100 bytes, at most 4000 bytes, and about 1000 bytes on average. The total number of strings could be anything, but it will most likely be around 10,000.
Is CRC-32 suited for this purpose?
For example, I need each of the following strings to have a different fixed-length (preferably short) checksum:
"/some/path/to/something/some/other/path"
"/some/path/to/something/another/path"
"/some/path"
...
# these strings can get __very__ long (very long strings are the norm)
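
To make the setup concrete, here is a minimal sketch of what I mean by a fixed-length checksum, using Python's standard `zlib.crc32`. The helper name `crc32_hex` and the choice of UTF-8 encoding are my own assumptions for illustration, not part of any existing code:

```python
import zlib

def crc32_hex(s: str) -> str:
    """Return the CRC-32 of s (encoded as UTF-8) as 8 lowercase hex digits."""
    # Mask to 32 bits so the result is an unsigned value on every Python version.
    return format(zlib.crc32(s.encode("utf-8")) & 0xFFFFFFFF, "08x")

samples = [
    "/some/path/to/something/some/other/path",
    "/some/path/to/something/another/path",
    "/some/path",
]

# Map each concatenated filename string to its checksum.
checksums = {s: crc32_hex(s) for s in samples}

for s, c in checksums.items():
    print(c, s)
```

Every checksum is 8 hex characters regardless of input length, which is the "short, fixed-length" property I am after. What I cannot tell from this is how likely two different strings are to end up with the same value.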
Is the uniqueness of CRC-32 hashes increased by input length?
Is there a better choice of checksum for this purpose?
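
For context, this is the rough estimate behind my concern. It is only a sketch using the birthday-bound approximation, and it assumes the checksum behaves like a uniformly random function of its input, which CRC-32 is not designed to guarantee. The function name `collision_probability` is hypothetical:

```python
import math

def collision_probability(n: int, bits: int) -> float:
    """Approximate chance that at least two of n uniformly random
    `bits`-bit values coincide (birthday bound: 1 - exp(-pairs / 2**bits))."""
    pairs = n * (n - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-pairs / 2.0 ** bits)

# Compare a 32-bit checksum against wider digests for about 10,000 strings.
for bits in (32, 64, 128):
    print(bits, collision_probability(10_000, bits))
```

Under that assumption, 10,000 strings and a 32-bit checksum give a collision chance of a little over 1%, while a 64-bit value brings it down to the order of 10^-12. Note that input length does not appear in this estimate at all; only the number of strings and the checksum width do.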