Your issue has a resolution, but it will not be pretty. Here's why:
Violation of deterministic content models
You've touched on the very soul of W3C XML Schema. What you are asking for (variable order combined with arbitrary unknown elements) violates the hardest, yet most basic, principle of XSD: the rule of non-ambiguity, or, more formally, the Unique Particle Attribution constraint:
A content model must be formed such
that during validation [..] each item
in the sequence can be uniquely
determined without examining the
content or attributes of that item,
and without any information about the
items in the remainder of the
sequence.
In plain English: when an XML document is validated and the XSD processor encounters <SurName>, it must be able to validate that element without first checking whether it is followed by <GivenName>, i.e., no lookahead. In your scenario, this is not possible. The rule exists so that validators can be implemented as finite state machines, which is meant to keep implementations simple and fast.
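As a minimal sketch of the kind of model that breaks this rule, consider a wildcard that may match any element, followed by a required SurName. The target namespace, the elementFormDefault setting and the xs:string type are assumptions made for illustration. When the processor sees <SurName>, it cannot tell whether to assign it to the wildcard or to the element declaration, so an XSD 1.0 processor is expected to reject the schema:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Intentionally INVALID under XSD 1.0: violates Unique Particle Attribution. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/Chad"
           elementFormDefault="qualified">
  <xs:element name="User">
    <xs:complexType>
      <xs:sequence>
        <!-- ##any also matches SurName, so this wildcard competes with the declaration below -->
        <xs:any minOccurs="0" maxOccurs="unbounded" processContents="lax" namespace="##any" />
        <!-- type is an assumption; the question's real type may differ -->
        <xs:element name="SurName" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```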
This is one of the most-debated issues in XSD. It is a heritage of SGML and DTDs (whose content models must be deterministic) and of XML itself, which treats the order of elements as significant by default (so doing the opposite, making the order unimportant, is hard).
As Marc_s already suggested, RELAX NG is an alternative that allows non-deterministic content models. But what can you do if you're stuck with W3C XML Schema?
Non-working semi-valid solutions
You've already noticed that xs:all is very restrictive. The reason is simple: the same determinism rule applies, and that's why xs:any, min/maxOccurs larger than one, and nested sequences are not allowed inside it.
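For reference, here is a sketch of what xs:all does allow in XSD 1.0: both names in either order, each exactly once, and nothing else. The target namespace and the xs:string types are assumptions for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/Chad"
           elementFormDefault="qualified">
  <xs:element name="User">
    <xs:complexType>
      <!-- Children may appear in any order, but no wildcard and no repetition is possible here -->
      <xs:all>
        <xs:element name="GivenName" type="xs:string" />
        <xs:element name="SurName" type="xs:string" />
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Any unknown element inside User makes the instance invalid against this schema, which is exactly the restriction you ran into.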
Also, you may have tried all sorts of combinations of choice, sequence and any. The error that the Microsoft XSD processor throws when it encounters such an invalid situation is:
Error: Multiple definition of element
'http://example.com/Chad:SurName'
causes the content model to become
ambiguous. A content model must be
formed such that during validation of
an element information item sequence,
the particle contained directly,
indirectly or implicitly therein with
which to attempt to validate each item
in the sequence in turn can be
uniquely determined without examining
the content or attributes of that
item, and without any information
about the items in the remainder of
the sequence.
This is excellently explained in O'Reilly's XML Schema (yes, the book has its flaws). Fortunately, parts of the book are available online. I highly recommend you read through section 7.4.1.3 about the Unique Particle Attribution Rule; its explanations and examples are much clearer than I can ever get them.
One working solution
In most cases it is possible to go from a non-deterministic design to a deterministic one. This usually doesn't look pretty, but it's a solution if you have to stick with W3C XML Schema and/or if you absolutely must allow non-strict rules in your XML. The nightmare with your situation is that you want to enforce one thing (two predefined elements) and at the same time keep everything else very loose (order doesn't matter and anything can go between, before and after). If I skip the good advice and just take you directly to a solution, it looks as follows:
<xs:element name="User">
  <xs:complexType>
    <xs:sequence>
      <xs:any minOccurs="0" processContents="lax" namespace="##other" />
      <xs:choice>
        <xs:sequence>
          <xs:element name="GivenName" />
          <xs:any minOccurs="0" processContents="lax" namespace="##other" />
          <xs:element name="SurName" />
        </xs:sequence>
        <xs:sequence>
          <xs:element name="SurName" />
          <xs:any minOccurs="0" processContents="lax" namespace="##other" />
          <xs:element name="GivenName" />
        </xs:sequence>
      </xs:choice>
      <xs:any minOccurs="0" processContents="lax" namespace="##any" />
    </xs:sequence>
    <xs:attribute name="ID" type="xs:unsignedByte" use="required" />
  </xs:complexType>
</xs:element>
The code above actually works, but there are a few caveats. The first is the use of xs:any with ##other as its namespace. You cannot use ##any, except for the last wildcard, because that would allow elements like GivenName to match the wildcard as well, and that makes the definition of User ambiguous.
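To make the namespace caveat concrete, here is a sketch of an instance that should validate against the User declaration. It assumes the declaration sits in a schema with targetNamespace="http://example.com/Chad" and elementFormDefault="qualified"; the ext namespace and all element names other than GivenName and SurName are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumes targetNamespace="http://example.com/Chad" and elementFormDefault="qualified". -->
<User xmlns="http://example.com/Chad"
      xmlns:ext="http://example.com/extension"
      ID="1">
  <!-- leading slot: ##other, so the element must come from a foreign namespace -->
  <ext:Title>Dr.</ext:Title>
  <SurName>Smith</SurName>
  <!-- middle slot: again ##other -->
  <ext:MiddleName>Q.</ext:MiddleName>
  <GivenName>Chad</GivenName>
  <!-- trailing slot: ##any, so an element from the target namespace is accepted too -->
  <Nickname>Chaddy</Nickname>
</User>
```

Note that each xs:any as written keeps the default maxOccurs of 1, so every slot holds at most one element. Adding maxOccurs="unbounded" to the wildcards should lift that limit without reintroducing ambiguity, since the ##other wildcards still cannot match GivenName or SurName and nothing follows the trailing ##any wildcard.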
The second caveat is that if you want to use this trick with more than two or three required elements, you'll have to write out every possible ordering (six for three elements, twenty-four for four). A maintenance nightmare. That's why I suggest the following:
A suggested solution, a variant of a Variable Content Container
Change your definition. This has the advantage of being clearer to your readers or users, and it is easier to maintain. A whole string of solutions is explained on XFront (you may have already seen a less readable link to it in the post from Oleg). It's an excellent read, but most of it does not take into account that you require a minimum of two elements inside the variable content container.
The current best-practice approach for your situation (which comes up more often than you may imagine) is to split your data into required and non-required fields. You can add an element <Required>, or do the opposite and add an element <ExtendedInfo> (or call it Properties, or OptionalData). This looks as follows:
<xs:element name="User2">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="GivenName" />
      <xs:element name="SurName" />
      <xs:element name="ExtendedInfo" minOccurs="0">
        <xs:complexType>
          <xs:sequence>
            <xs:any minOccurs="0" maxOccurs="unbounded" processContents="lax" namespace="##any" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
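A sketch of an instance that should validate against User2, again assuming a schema with targetNamespace="http://example.com/Chad" and elementFormDefault="qualified". The ext namespace and the children of ExtendedInfo are hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<User2 xmlns="http://example.com/Chad"
       xmlns:ext="http://example.com/extension">
  <!-- fixed part: required, in a fixed order -->
  <GivenName>Chad</GivenName>
  <SurName>Smith</SurName>
  <!-- open part: any number of elements from any namespace, in any order -->
  <ExtendedInfo>
    <Nickname>Chaddy</Nickname>
    <ext:Department>Research</ext:Department>
    <ext:Badge number="42" />
  </ExtendedInfo>
</User2>
```

Because the wildcard is the only particle inside ExtendedInfo, it can never compete with GivenName or SurName, so ##any with maxOccurs="unbounded" is safe there.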
This may seem less than ideal at the moment, but let it grow a bit. Having an ordered set of fixed elements isn't that big a deal. You're not the only one who'll be complaining about this apparent deficiency of W3C XML Schema, but as I said earlier, if you have to use it, you'll have to live with its limitations, or accept the burden of developing around these limitations at a higher cost of ownership.
Alternative solution
I'm sure you know this already, but the order of attributes is never significant in XML. If all your content is of simple types, you can alternatively make more extensive use of attributes.
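A minimal sketch of that attribute-based variant follows. The element name User3, the target namespace and the xs:string types are assumptions for illustration only:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/Chad"
           elementFormDefault="qualified">
  <!-- User3 is a hypothetical name, chosen to avoid clashing with User and User2 -->
  <xs:element name="User3">
    <xs:complexType>
      <!-- the two required values, which may appear in any order in the instance -->
      <xs:attribute name="GivenName" type="xs:string" use="required" />
      <xs:attribute name="SurName" type="xs:string" use="required" />
      <!-- any further attributes are tolerated; lax means they are checked only if declared -->
      <xs:anyAttribute processContents="lax" namespace="##any" />
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With this schema, an instance such as <c:User3 xmlns:c="http://example.com/Chad" SurName="Smith" Nickname="Chaddy" GivenName="Chad" /> should validate regardless of the attribute order.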
A final word
Whatever approach you take, you will lose a lot of verifiability of your data. It's often better to allow content providers to add content types, but only when that content can be verified. You can do this by switching from lax to strict processing and by making the types themselves stricter. But being too strict isn't good either; the right balance depends on your ability to judge the use cases you're up against and to weigh them against the trade-offs of each implementation strategy.
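As a sketch of what tightening could look like for the User2 variant: the wildcard switches to strict processing and the fixed elements get explicit types. The xs:token type, the ##other namespace choice and the target namespace are assumptions, not requirements from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/Chad"
           elementFormDefault="qualified">
  <xs:element name="User2">
    <xs:complexType>
      <xs:sequence>
        <!-- explicitly typed instead of the implicit xs:anyType -->
        <xs:element name="GivenName" type="xs:token" />
        <xs:element name="SurName" type="xs:token" />
        <xs:element name="ExtendedInfo" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <!-- strict: each child needs a global element declaration known to the validator -->
              <xs:any minOccurs="0" maxOccurs="unbounded" processContents="strict" namespace="##other" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Content providers can then still extend the data, but only by supplying a schema for their own namespace that the validator can load.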