I need to save content containing newlines in some XML attributes, not in text nodes. The method should be chosen so that I can decode it in XSLT 1.0/EXSLT/XSLT 2.0.
What is the best encoding method?
Please suggest/give some ideas.
In a compliant DOM API there is nothing you need to do. Simply save actual newline characters to the attribute, the API will encode them correctly on its own (see Canonical XML spec, section 5.2).
If you do your own encoding (i.e. replacing \n with "&#10;" before saving the attribute value), the API will encode your input again, resulting in "&amp;#10;" in the XML file.
Bottom line is, the string value is saved verbatim. You get out what you put in, no need to interfere.
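To illustrate both points, here is a minimal sketch in Python. It uses the standard library's xml.etree.ElementTree as a stand-in for "the API"; the helper name serialize_with_attribute and the sample values are made up for this example, and other libraries may behave differently.

```python
import xml.etree.ElementTree as ET

def serialize_with_attribute(value):
    """Serialize an element whose attribute holds `value`, letting the API escape it."""
    elem = ET.Element("xml")
    elem.set("attribute", value)
    return ET.tostring(elem, encoding="unicode")

# Store the raw newline: the serializer writes it as a character reference.
plain = serialize_with_attribute("line 1\nline 2")

# Pre-encode by hand: the ampersand is escaped again (double encoding).
double = serialize_with_attribute("line 1&#10;line 2")

# Parsing the first document returns exactly what was put in.
round_trip = ET.fromstring(plain).get("attribute")
```

In this sketch, plain contains the reference &#10;, double contains &amp;#10;, and round_trip is the original two-line string.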
However… some implementations are not compliant. For example, they will encode & characters in attribute values, but forget about newline characters or tabs. This puts you in a losing position since you can't simply replace newlines with "&#10;" beforehand.
These implementations will save newline characters unencoded, like this:
<xml attribute="line 1
line 2" />
Upon parsing such a document, literal newlines in attributes are normalized into a single space (again, in accordance with the spec), and thus they are lost.
Saving (and retaining!) newlines in attributes is impossible in these implementations.
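The loss can be reproduced with any conforming parser. A small sketch, again assuming Python's xml.etree.ElementTree as the parser and a hand-written document standing in for the output of a non-compliant serializer:

```python
import xml.etree.ElementTree as ET

# A document as a non-compliant serializer might write it:
# the attribute value contains a literal newline character.
unencoded = '<xml attribute="line 1\nline 2" />'

# The parser normalizes the literal newline to a space, so the line break is gone.
value = ET.fromstring(unencoded).get("attribute")
```

Here value comes back as a single line with a space where the newline used to be.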
a="&#13;&#10;" will not be affected by this rule - it does not contain actual CR or LF characters, only their references. After parsing, a CRLF sequence will be in the attribute value. And if you save a CRLF to an attribute value it should be serialized as &#13;&#10; again, unless I'm misinterpreting it. – Skater
< is valid in an attribute. Just not as a literal character. Read my answer again. It's all in there, really. :) – Skater
&xA; ). Preserving the newline is fine. Preserving it unencoded is not. – Skater
You can use the character reference &#10; to represent a newline in an XML attribute. &#13; can be used to represent a carriage return. A Windows-style CRLF could be represented as &#13;&#10;.
This is legal XML syntax. See XML spec for more details.
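As a quick check of how such references are read back, here is a sketch using Python's xml.etree.ElementTree (chosen only for illustration; the element and attribute names are arbitrary):

```python
import xml.etree.ElementTree as ET

# CR and LF written as character references inside the attribute value.
doc = '<xml attribute="line 1&#13;&#10;line 2" />'

# The references are decoded to real CR and LF characters and are not normalized away.
value = ET.fromstring(doc).get("attribute")
```

After parsing, value holds the two lines separated by an actual CRLF sequence.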
getAttribute() method you speak of? – Mantra
getAttribute won't decode &#10; and convert it to a newline? It should work. Did you test it? – Mantra
A crude answer can be:
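using System.IO;
using System.Xml;

XmlDocument xDoc = new XmlDocument();
xDoc.Load(@"Agenda.xml");
// work with the XML here
// set attribute values containing "\r\n" (you need both characters to make a new line)
// the serialized XML holds CR and LF in attribute values as &#xD; and &#xA;
// turn them back into literal characters and break the line between adjacent tags
string a = xDoc.InnerXml.Replace("&#xD;", "\r").Replace("&#xA;", "\n").Replace("><", ">\r \n<");
// write the resulting string back to the file
using (StreamWriter sDoc = new StreamWriter(@"Agenda.xml"))
{
    sDoc.Write(a);
}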
As you can see, this is just a string replacement on the serialized XML.
A slightly different approach that has been helpful in some situations: placeholders and find & replace.
Before parsing, use your own custom line break marker/placeholder; then, in the second half of the process, string-replace it with whatever line break character is effective, whether that's \n, a character reference such as &#10;, \u2028, or any of the various line break characters out there. In other words, set a placeholder of your own in the data initially and find & replace the line breaks back in afterwards.
This is useful when parsers like jQuery's $.parseXML() strip the unencoded line breaks. For example, you could use {LBREAK} as your line break marker, insert it while the data is still raw text, and replace it after parsing to an XML object. String.replaceAll() is a helpful prototype.
So, a rough code concept with jQuery and a replaceAll prototype (I have not tested this code, but it shows the concept):
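function onXMLHandleLineBreaks(_result){
    var marker = '{mylinebreakmarker}';
    var lineBreakCharacterThatGetsLost = '\n'; // raw line break that the parser would strip
    var lineBreakCharacterThatWorks = '\n'; // line break to put back in after parsing
    var rawXMLText = String(_result); // hold as text only until line breaks are ready
    // placemark the line breaks; assumes they only occur inside attribute values or element text
    rawXMLText = rawXMLText.replaceAll(lineBreakCharacterThatGetsLost, marker);
    var xmlObj = $.parseXML(rawXMLText); // to xml obj
    function restore(s){ return s.replaceAll(marker, lineBreakCharacterThatWorks); }
    // add the line breaks back in; .html() is not available on XML documents, so walk the nodes
    $(xmlObj).find('*').each(function(){
        $.each(this.attributes, function(){ this.value = restore(this.value); });
        $.each(this.childNodes, function(){ if (this.nodeType === 3) this.nodeValue = restore(this.nodeValue); });
    });
    console.log('xml with linebreaks that work: ' + new XMLSerializer().serializeToString(xmlObj));
    return xmlObj;
}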
And of course you could adjust which line break characters work or don't work to suit your data, and you could loop over a whole set of line break characters that don't work, replacing each of them in turn.
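The loop over several line break characters could look roughly like this in Python. This is only a sketch under stated assumptions: the names MARKER, LINE_BREAKS, protect and restore are made up for the example, the list of line break characters is just one possible set, and the XML is written by hand to mimic a serializer that does not escape newlines.

```python
import xml.etree.ElementTree as ET

MARKER = "{LBREAK}"  # placeholder; any string that never occurs in the data works
# Assumed set of line breaks to protect; "\r\n" comes first so it maps to one marker.
LINE_BREAKS = ["\r\n", "\r", "\n", "\u2028"]

def protect(text):
    """Replace every known line break with the placeholder before writing."""
    for line_break in LINE_BREAKS:
        text = text.replace(line_break, MARKER)
    return text

def restore(text, line_break="\n"):
    """Put a working line break back in after parsing."""
    return text.replace(MARKER, line_break)

# Written by hand, as a serializer that does not escape newlines would do it.
raw_xml = '<xml attribute="%s" />' % protect("line 1\r\nline 2\nline 3")
value = restore(ET.fromstring(raw_xml).get("attribute"))
```

Note that this normalizes every kind of line break to the single character passed to restore, so the distinction between CRLF and LF in the original data is not kept.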