I find it hard to believe I'm the first person to run into this problem, but I searched for quite some time and didn't find a solution.
I'd like to use strncpy, but have it be UTF-8 aware so that it never writes only part of a UTF-8 code point into the destination string.
Otherwise, when the source string is longer than the maximum length, you can never be sure that the resulting string is valid UTF-8, even if you know the source is.
Validating the resulting string afterwards would work, but if the copy is called a lot it would be better to have a strncpy-like function that handles this itself.
glib has g_utf8_strncpy, but it copies a given number of Unicode characters, whereas I'm looking for a copy function that limits by byte length.
To be clear, by "UTF-8 aware" I mean that it must not exceed the size of the destination buffer and must never copy only part of a UTF-8 code point. (Valid UTF-8 input must never result in invalid UTF-8 output.)
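For illustration, here is a minimal sketch of the kind of function I mean. The name utf8_copy_bytes is made up (it is not from glib or any other library), it assumes src is valid, NUL-terminated UTF-8, and it always terminates the destination rather than padding it the way strncpy does. The idea is to find the byte limit first and, if that cuts the string short, back up past any continuation bytes (10xxxxxx) so the cut lands on a code-point boundary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of any library.
 * Copies at most dst_size - 1 bytes of src into dst, always NUL-terminates
 * dst (when dst_size > 0), and never splits a multi-byte UTF-8 sequence.
 * Assumption: src is valid, NUL-terminated UTF-8.
 * Returns the number of bytes written, not counting the terminator. */
size_t utf8_copy_bytes(char *dst, const char *src, size_t dst_size)
{
    if (dst_size == 0) {
        return 0;
    }

    size_t max = dst_size - 1;  /* leave room for the terminator */
    size_t n = 0;

    /* Advance to the end of src or to the byte limit, whichever is first. */
    while (n < max && src[n] != '\0') {
        n++;
    }

    /* If src was cut short, src[n] is the first byte NOT copied. While that
     * byte is a continuation byte, the cut is inside a code point, so back
     * up until it sits on a lead byte or an ASCII byte. */
    if (src[n] != '\0') {
        while (n > 0 && ((unsigned char)src[n] & 0xC0) == 0x80) {
            n--;
        }
    }

    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

With "héllo" (where é is the two bytes C3 A9) and a 3-byte destination, this copies only "h", because the second byte of é would not fit before the terminator. With a 4-byte destination it copies "hé".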
Note:
Some replies have pointed out that strncpy zero-fills the rest of the buffer and that it won't ensure zero termination. In retrospect I should have asked for a UTF-8 aware strlcpy; however, at the time I didn't know this function existed.
strncpy doesn't guarantee a zero-terminated C string as its result either. Contrary to widespread belief, strncpy is not a "string" function but a buffer-handling function. Its two often-forgotten side effects give a clue about that (the second being that it zero-fills the rest of the buffer up to the given size). – Turne
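The two strncpy behaviours mentioned in this comment can be checked with a small sketch like the one below. The function names are made up for the demonstration, and the buffers are pre-filled with 'x' only so that any byte strncpy leaves untouched stays visible.

```c
#include <string.h>

/* Hypothetical demo: returns 1 if strncpy left dst without any NUL byte
 * when src is longer than the buffer. */
int strncpy_can_leave_unterminated(void)
{
    char dst[4];
    memset(dst, 'x', sizeof dst);
    strncpy(dst, "abcdef", sizeof dst);  /* copies "abcd", no room for '\0' */
    return memchr(dst, '\0', sizeof dst) == NULL;
}

/* Hypothetical demo: returns 1 if strncpy zero-filled every byte after a
 * src that is shorter than the buffer. */
int strncpy_zero_fills_rest(void)
{
    char dst[8];
    memset(dst, 'x', sizeof dst);
    strncpy(dst, "ab", sizeof dst);      /* "ab" followed by six '\0' bytes */
    for (size_t i = 2; i < sizeof dst; i++) {
        if (dst[i] != '\0') {
            return 0;
        }
    }
    return 1;
}
```

Both functions should return 1 on a conforming C library. Some compilers may warn about the deliberate truncation in the first one.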