I'm looking for a small C library to handle utf8 strings.
Specifically, splitting based on unicode delimiters for use with stemming algorithms.
Related posts have suggested:
ICU http://www.icu-project.org/ (I found it too bulky for my purposes on embedded devices)
UTF8-CPP: http://utfcpp.sourceforge.net/ (Excellent, but C++ not C)
Has anyone found any platform independent, small codebase libraries for handling unicode strings (doesn't need to do naturalisation).