I don't seem to understand the purpose of XMLString::transcode(XMLCh*) and XMLString::transcode(char*), because obviously I don't understand the difference between XMLCh* and char*.
Can someone please make things clearer for me?
Xerces encodes information as UTF-16 internally. The UTF-16 data is stored using the XMLCh datatype.
'C-style' strings use char, which is in the local code page (probably UTF-8, but it depends on the platform and settings). You use transcode to convert between the two.
For instance, if you want to feed some data from Xerces to another library and that library expects text in the local code page, you need to transcode it. Also, if you have char data and want to feed it to Xerces, you need to transcode it to XMLCh, because that is what Xerces understands.
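To make the difference between the two representations concrete, here is a minimal sketch that uses only the standard library, not Xerces. CodeUnit16 is a stand-in for XMLCh, on the assumption that XMLCh is a 16-bit code unit (the real typedef depends on how Xerces was built). widen_ascii and narrow_ascii are hypothetical helpers that handle ASCII only; a real transcode call also has to honour the local code page.

```cpp
#include <cassert>
#include <string>

// Stand-in for XMLCh (assumption: a 16-bit code unit; the real typedef
// is chosen when Xerces is built and may differ per platform).
using CodeUnit16 = char16_t;
using String16 = std::basic_string<CodeUnit16>;

// Convert a narrow ASCII string into 16-bit code units, one unit per character.
// This is NOT a general transcoder: it rejects anything outside ASCII.
String16 widen_ascii(const std::string& narrow) {
    String16 wide;
    wide.reserve(narrow.size());
    for (unsigned char c : narrow) {
        assert(c < 0x80 && "this sketch only handles ASCII");
        wide.push_back(static_cast<CodeUnit16>(c));
    }
    return wide;
}

// Convert 16-bit code units back into a narrow ASCII string.
std::string narrow_ascii(const String16& wide) {
    std::string narrow;
    narrow.reserve(wide.size());
    for (CodeUnit16 u : wide) {
        assert(u < 0x80 && "this sketch only handles ASCII");
        narrow.push_back(static_cast<char>(u));
    }
    return narrow;
}
```

Calling widen_ascii("test") produces four 16-bit units, each holding one character, while the original char string holds four single bytes. The two layouts differ in memory, which is why a conversion step is needed in both directions.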
For example:
// to local code page
DOMNode *node = ...;
char* temp = XMLString::transcode(node->getNodeValue());
std::string value(temp);
XMLString::release(&temp);
// from local code page
DOMElement *element = ...;
XMLCh *tag = XMLString::transcode("test");
DOMNodeList *list = element->getElementsByTagName(tag);
XMLString::release(&tag);
Do not forget to release the string! It is better to write a wrapper around it, and examples are available on the internet (just search for a class named XercesString).
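A wrapper of that kind could look roughly like the following sketch, which ties the release call to scope exit through std::unique_ptr with a custom deleter. So that it compiles without Xerces, demo_transcode and demo_release are hypothetical stand-ins that only mimic the ownership contract of XMLString::transcode and XMLString::release (the caller receives a buffer and must hand it back). In real code the deleter would call XMLString::release(&p) instead. This is not the XercesString class mentioned above, just one possible shape for the idea.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>

// Counts release calls so the behaviour can be observed (demo only).
int g_release_calls = 0;

// Hypothetical stand-in for XMLString::transcode: returns a buffer the caller owns.
char* demo_transcode(const char* src) {
    const std::size_t len = std::strlen(src) + 1;
    char* copy = new char[len];
    std::memcpy(copy, src, len);
    return copy;
}

// Hypothetical stand-in for XMLString::release: frees the buffer and nulls the pointer.
void demo_release(char** buf) {
    assert(buf != nullptr);
    delete[] *buf;
    *buf = nullptr;
    ++g_release_calls;
}

// Deleter that hands the buffer back through the release function.
struct DemoReleaser {
    void operator()(char* p) const { demo_release(&p); }
};

// Owning handle: the buffer is released when the handle goes out of scope.
using ScopedChars = std::unique_ptr<char, DemoReleaser>;

// Transcode, copy into a std::string, and let the handle release the buffer.
std::string to_std_string(const char* src) {
    ScopedChars temp(demo_transcode(src));
    return std::string(temp.get());
}
```

With this in place, a call such as to_std_string("test") needs no explicit release, and the buffer is also freed if constructing the std::string throws.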
If you want to know more about encodings, I think you should read Joel Spolsky's article The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).
Kind of never mind: I had mistakenly called transcode on a char* that had merely been cast to XMLCh*. In that case transcode failed, but the code below succeeded. Here xmlch_abc is a char* that was cast to XMLCh*. I am posting this answer in case someone else ends up in the same unusual situation, with a program crash caused by pilot error.
TranscodeToStr tts(xmlch_abc,"utf-16");
const unsigned char * chstr = tts.str();
std::cout<<chstr<<std::endl;
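For anyone wondering why the cast goes wrong: a cast changes only the pointer type, not the bytes behind it. The following sketch uses only the standard library. bytes_as_units is a hypothetical helper, and std::uint16_t stands in for XMLCh on the assumption that XMLCh is a 16-bit type. It copies the bytes of a narrow string into 16-bit units, which is effectively what code sees when it reads a cast pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Reinterpret the bytes of a narrow string as 16-bit units, two bytes per unit.
// memcpy is used instead of a pointer cast to avoid alignment and aliasing problems.
std::vector<std::uint16_t> bytes_as_units(const std::string& narrow) {
    assert(narrow.size() % 2 == 0 && "sketch expects an even byte count");
    std::vector<std::uint16_t> units(narrow.size() / 2);
    if (!units.empty()) {
        std::memcpy(units.data(), narrow.data(),
                    units.size() * sizeof(std::uint16_t));
    }
    return units;
}
```

For the input "abcd" this gives two units rather than four, and neither of them equals the code unit for 'a' (0x0061), because each unit packs two characters together. The exact values depend on the machine's byte order. A real conversion such as XMLString::transcode("abcd") is needed to get one unit per character. That is presumably also why asking for "utf-16" output appeared to work here: the original bytes pass through largely unchanged.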