Which XML version should I use?
V

3

22

I have an online shop where vendors can upload and import their articles in two formats.

  1. plain text (tab delimited)
  2. XML

Currently I'm using XML 1.0.

However, I see there is also a version 1.1.

Wikipedia states that 1.0 is fine for most uses: http://en.wikipedia.org/wiki/XML#Versions

It also says that the characters allowed in XML names were based on specific Unicode versions: Unicode 2.0 to Unicode 3.2.

In the fifth edition, XML names may contain characters in the Balinese, Cham, or Phoenician scripts among many others which have been added to Unicode since Unicode 3.2.

Currently I only have a couple of Latin-based languages, but this may change in the future and I want to be prepared.

Are there languages whose characters are not covered by Unicode 3.2? Is v1.0 safe for me to use?

If you need more info just let me know.

Venosity answered 30/7, 2011 at 12:24 Comment(0)
J
26

Use version 1.0.

You would only need version 1.1 if you use certain non-ASCII characters in identifiers, EBCDIC line-ending characters, or control characters (character codes 1 to 31, other than tab, LF and CR, which XML 1.0 already allows).

Rationale and list of changes for XML 1.1
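
To make the distinction concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser, which parses documents as XML 1.0. The sample documents and the helper name `is_well_formed` are made up for illustration. It shows that non-Latin text in element content is accepted, while a character reference to a control character is rejected.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the stdlib parser accepts the document (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Balinese letters (added to Unicode after 3.2) used in element *content*.
balinese = "<article><title>\u1b05\u1b13</title></article>"

# A character reference to the control character U+0001.
control = "<article><title>&#x01;</title></article>"

print(is_well_formed(balinese))  # True
print(is_well_formed(control))   # False
```

So for an article import where the exotic characters only appear in the data, not in element or attribute names, XML 1.0 should be enough.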

Josselyn answered 30/7, 2011 at 12:34 Comment(4)
I understand the benefits are minor, but what are the disadvantages of specifying <?xml version="1.1"?>? Will many things suddenly stop working? - Usa
@ycomp: Yes, it can very well stop working. Support for XML 1.1 is not widely implemented. The .NET Framework, for example, won't read it. - Josselyn
I don't understand why they had to change the version number suddenly. XML 1.0 was first defined in 1998 and has undergone several revisions since then without being given a new version number. - Underset
@Underset "A new XML version, rather than a set of errata to XML 1.0, is being created because the changes affect the definition of well-formed documents." See the rationale quoted in the answer. - Kalliekallista
C
11

XML 1.1 came out of a fanatical desire to be "inclusive" by supporting all the world's languages, including methods of writing Abyssinian that were only used for 15 years nearly a century ago. If you are one of the 99.99999% of the population who doesn't need to capture ancient manuscripts, XML 1.1 is a total waste of time.

Constructivism answered 31/7, 2011 at 0:16 Comment(12)
I think I fall in the category of the 99.99999% :) - Venosity
Note that this also only applies to identifiers. You can still use those characters in content. - Josselyn
Only 713 people need it? - Defrayal
I suspect the number of people who need to use obsolete Abyssinian characters in the names of elements and attributes is a lot lower than 713. - Constructivism
There are some constructs, like conditional elements and assertions, that don't exist in 1.0. So, for many, it isn't a total waste of time. - Etymon
@Etymon You are confusing XSD 1.1 with XML 1.1. This thread is about XML versions, not XSD versions. - Constructivism
It's not so fanatical. XML 1.0 doesn't allow many control characters in content, not even escaped: &#x01; in your XML file means it's not "well formed". Well, thank you very much. I need to talk about that character. Does that make me a "fanatic"? There is no need to exclude characters from content just because most users don't need them. - Offenbach
If you need to use x01 then you probably also need to use x00, which XML 1.1 doesn't allow; so you need to find a way of talking about control characters that doesn't involve putting them literally in your content. - Constructivism
Can you please elaborate on a workaround so that users could use characters normally not "allowed" in their XML content? (Perhaps even update your answer.) @MichaelKay I was thinking you could use HTML or another text format outside the XML structure. - Steelwork
@JonGrah Asking supplementary questions in comments isn't a good idea, especially if it's 10 years since the original post. Please raise a new question. - Constructivism
XML 1.0 forbids control characters in text content, not only in identifiers! - Stylography
@JonGrah The only solution is to introduce additional escaping, on top of XML. - Stylography
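
As an illustration of the "additional escaping on top of XML" idea from the last comment, here is one possible sketch in Python. The backslash-based scheme and the function names `escape_controls` and `unescape_controls` are assumptions made up for this example, not a standard; both the writer and the reader of the file would have to agree on them.

```python
import re
import xml.etree.ElementTree as ET

# Backslash plus the C0 controls that XML 1.0 forbids (tab, LF and CR are allowed).
_FORBIDDEN = re.compile(r"[\\\x00-\x08\x0b\x0c\x0e-\x1f]")
# Either an escaped backslash or a \uXXXX token produced by escape_controls().
_ESCAPED = re.compile(r"\\(\\|u[0-9a-fA-F]{4})")


def escape_controls(text: str) -> str:
    """Replace forbidden control characters with \\uXXXX and double backslashes."""
    def repl(match):
        ch = match.group(0)
        return "\\\\" if ch == "\\" else "\\u%04x" % ord(ch)
    return _FORBIDDEN.sub(repl, text)


def unescape_controls(text: str) -> str:
    """Reverse escape_controls()."""
    def repl(match):
        token = match.group(1)
        return "\\" if token == "\\" else chr(int(token[1:], 16))
    return _ESCAPED.sub(repl, text)


raw = "a\x01b\\u0001c\x00"  # includes U+0001, a literal backslash, and U+0000

elem = ET.Element("value")
elem.text = escape_controls(raw)
xml_text = ET.tostring(elem, encoding="unicode")

# The escaped text survives an XML 1.0 round trip and decodes to the original.
restored = unescape_controls(ET.fromstring(xml_text).text)
print(restored == raw)
```

Unlike XML 1.1 character references, a scheme like this can also carry U+0000, at the cost of every consumer needing to know about the extra layer.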
A
9

Beyond non-useful things (like silly EBCDIC linefeeds), there is unfortunately one nice feature that XML 1.1 allows: the ability to use character references for Unicode/ASCII control characters other than LF/CR/Tab. Except that you still cannot include nulls, even using character references.

So this is hardly enough to make one use 1.1, unless there is a specific need to include these characters.
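
If you want to find out whether your data actually contains such characters before committing to 1.0, a rough check could look like the sketch below. It is based on the `Char` production of the XML 1.0 specification (tab, LF, CR, U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF); the function name `find_invalid_xml10_chars` is hypothetical.

```python
import re

# Anything outside the XML 1.0 Char production cannot appear in a document,
# neither literally nor as a character reference.
_INVALID_XML10 = re.compile(
    r"[^\x09\x0a\x0d\x20-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]"
)


def find_invalid_xml10_chars(text: str):
    """Return (index, code point) pairs for characters XML 1.0 cannot represent."""
    return [(m.start(), hex(ord(m.group()))) for m in _INVALID_XML10.finditer(text)]


print(find_invalid_xml10_chars("plain\ttext"))  # []
print(find_invalid_xml10_chars("a\x01b\x00"))   # [(1, '0x1'), (3, '0x0')]
```

An empty result for all vendor uploads would suggest there is no practical reason to move to 1.1.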

Abm answered 1/8, 2011 at 3:37 Comment(0)
