A possible alternative is to use the BSD function strsep() instead of strtok(), if available.
From the man page:
The strsep() function is intended as a replacement for the strtok() function. While the strtok() function should be preferred for portability reasons (it conforms to ISO/IEC 9899:1990 ("ISO C90")) it is unable to handle empty fields, i.e., detect fields delimited by two adjacent delimiter characters, or to be used for more than a single string at a time. The strsep() function first appeared in 4.4BSD.
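Where strsep() is not available, a stand-in is short enough to write by hand. The following is a minimal sketch, not the BSD implementation: the name fallback_strsep is hypothetical, and it relies only on strcspn() from ISO C to find the end of each field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strsep(), built only from ISO C functions.
 * Returns the next field of *stringp (possibly empty) and advances
 * *stringp past the delimiter, or sets it to NULL after the last field.
 */
static char *fallback_strsep(char **stringp, const char *delims)
{
    assert(stringp != NULL && delims != NULL);

    char *start = *stringp;
    if (start == NULL)
        return NULL;                 /* nothing left to scan */

    /* Points at the first delimiter, or at the terminating '\0'. */
    char *end = start + strcspn(start, delims);

    if (*end == '\0') {
        *stringp = NULL;             /* this was the last field */
    } else {
        *end = '\0';                 /* cut the field in place */
        *stringp = end + 1;          /* resume after the delimiter */
    }
    return start;
}
```

Like strsep(), this sketch writes into the buffer it is given, so it must be called on a modifiable copy (for example one made with strdup()), never on a string literal.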
A simple example (also copied from that man page):
char *token, *string, *tofree;
tofree = string = strdup("value;;test;etc");
while ((token = strsep(&string, ";")) != NULL)
    printf("token=%s\n", token);
free(tofree);
Output:
token=value
token=
token=test
token=etc
so empty fields are handled correctly.
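To make the difference from strtok() concrete, here is a small sketch that counts what each function reports for the same input. The helper names count_with_strtok and count_with_strsep are made up for this illustration, and the _DEFAULT_SOURCE define is an assumption about glibc, where it exposes the strsep() declaration (BSD and macOS declare it by default).

```c
#define _DEFAULT_SOURCE  /* assumption: glibc; must come before any #include */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the tokens strtok() reports; buf is modified in place. */
static size_t count_with_strtok(char *buf, const char *delims)
{
    size_t n = 0;

    assert(buf != NULL && delims != NULL);
    for (char *tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;
    return n;
}

/* Count the fields strsep() reports, empty ones included; buf is modified in place. */
static size_t count_with_strsep(char *buf, const char *delims)
{
    size_t n = 0;

    assert(buf != NULL && delims != NULL);
    while (strsep(&buf, delims) != NULL)
        n++;
    return n;
}
```

For "value;;test;etc" with ";" as the delimiter, the strtok() version counts 3 tokens because it skips the run of adjacent delimiters, while the strsep() version counts 4 fields, one of them empty. Each call needs its own writable copy of the string, since both functions overwrite delimiters with '\0'.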
Of course, as others already said, none of these simple tokenizer functions handles
delimiters inside quotation marks correctly, so if that is an issue, you should use
a proper CSV parsing library.
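To show what "handling quotation marks" involves, here is a rough sketch of a strsep()-like splitter that ignores delimiters between double quotes. The function name next_field_quoted is hypothetical, it accepts a single delimiter character only, it leaves the quotes in the returned field, and it does not unescape doubled quotes, so it illustrates the problem rather than replacing a CSV library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: like strsep() for one delimiter character, except
 * that a delimiter between double quotes does not end the field.
 * The quote characters stay in the returned field.
 */
static char *next_field_quoted(char **stringp, char delim)
{
    assert(stringp != NULL);

    char *start = *stringp;
    if (start == NULL)
        return NULL;                     /* input exhausted */

    int in_quotes = 0;
    char *p;
    for (p = start; *p != '\0'; p++) {
        if (*p == '"')
            in_quotes = !in_quotes;      /* toggle on every quote character */
        else if (*p == delim && !in_quotes)
            break;                       /* unquoted delimiter ends the field */
    }

    if (*p == '\0') {
        *stringp = NULL;                 /* last field */
    } else {
        *p = '\0';                       /* terminate this field in place */
        *stringp = p + 1;                /* continue after the delimiter */
    }
    return start;
}
```

Called repeatedly on a writable copy of aaa;bbb;"ddd;eee";fff with ';' as the delimiter, this yields four fields and keeps "ddd;eee" (quotes included) in one piece, where plain strsep() would split it in two.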
[...] strsep() available on your platform? The usage is very similar to strtok(), but it returns empty fields correctly. – Chitin
[...] aaa;bbb;"ddd;eee";fff correctly. – Chitin
[...] strsep(). – Periwig