What's the difference between glib gunichar and wchar_t and which is better for cross-platform solutions?

I'm trying to write some C code which is portable only so far as the user has gcc, and has glib installed.

From all my research, I've found that with gcc, a wchar_t is always defined as 4 bytes, and with glib a gunichar is also 4 bytes.

What I haven't figured out is whether a wchar_t, like a gunichar, is encoded as UCS-4. Is this the case? If so, I should be able to simply cast a gunichar* to a wchar_t* and use the standard C wcs* functions, right?

Nonaggression answered 24/3, 2012 at 9:24 Comment(0)

If you use GLib, don't use wchar_t. Use GLib's Unicode support instead; it's a lot better than the C standard library's.

wchar_t is 4 bytes on Linux and Mac OS (and a few others), but only 2 bytes on Windows and some other platforms, so casting a gunichar* to a wchar_t* is not safe everywhere. Portable code means avoiding wchar_t like the plague.
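To make the size dependence concrete, here is a minimal sketch that copies UCS-4 code points into a wchar_t buffer instead of casting pointers. It uses only the C standard library: `uint32_t` stands in for `gunichar` (GLib defines it as a 32-bit unsigned integer), and `ucs4_to_wcs` is a hypothetical helper name, not a GLib function. Where wchar_t is too narrow for a code point, it reports failure rather than truncating.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper (not part of GLib): copy n UCS-4 code points from src
 * into dst and append a terminator. dst must have room for n + 1 elements.
 * Returns 0 on success, or -1 if a code point does not fit in wchar_t
 * (for example U+1F600 where wchar_t is only 16 bits wide). */
static int ucs4_to_wcs(const uint32_t *src, size_t n, wchar_t *dst)
{
    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < n; i++) {
        /* WCHAR_MAX depends on the platform's wchar_t width and signedness. */
        if ((uint64_t)src[i] > (uint64_t)WCHAR_MAX)
            return -1;
        dst[i] = (wchar_t)src[i];
    }
    dst[n] = L'\0';
    return 0;
}
```

A caller would check the return value before handing `dst` to any wcs* function. This only addresses element width; whether the platform's wcs* functions then treat the values as Unicode code points is a separate question, which is one more reason to stay with GLib's own functions.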

Probe answered 24/3, 2012 at 10:12 Comment(3)
Hmm, thanks. I just noticed that almost all of the GLib Unicode functions operate on UTF-8 strings, and from what I understand (could be wrong) iterating through a multi-byte encoded char array is inefficient, as you need to use an iterator to make sure you get a full character and not just a byte (you can't simply i++ your way through the array). I just re-checked the docs and g_utf8_next_char() is implemented as a macro, so I guess it's not so much of an issue for me anymore. Thanks again. - Nonaggression
@skot Decent Unicode support is costly, any way you put it. Worry about performance once your program/library works, not before. - Probe
Good point. Things seem to be working well so far, so I'll cross that bridge if/when I need to. Thanks again. - Nonaggression
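
The iteration concern raised in the first comment can be sketched without GLib. Stepping through UTF-8 only needs the lead byte of each sequence to know how far to advance, which is roughly how g_utf8_next_char() works. The helper names `utf8_seq_len` and `utf8_count` below are hypothetical, and the code assumes the input is valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length in bytes of the UTF-8 sequence whose lead
 * byte is b. Assumes valid UTF-8; an unexpected byte is treated as length 1. */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80) return 1;            /* 0xxxxxxx: ASCII */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;                          /* stray continuation or invalid byte */
}

/* Hypothetical helper: count the code points in a NUL-terminated UTF-8 string. */
static size_t utf8_count(const char *s)
{
    assert(s != NULL);

    size_t count = 0;
    while (*s != '\0') {
        /* Advance one whole sequence, but never step past the terminator
         * if the final sequence happens to be truncated. */
        for (size_t k = utf8_seq_len((unsigned char)*s); k > 0 && *s != '\0'; k--)
            s++;
        count++;
    }
    return count;
}
```

With GLib available, the same loop would normally be written with g_utf8_next_char() and g_utf8_get_char(), or replaced by g_utf8_strlen() when only the count is needed.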
