How to derive DTD (or other XML spec format) from XML file samples
O

9

12

Do you know of a tool that will derive a DTD (or other XML structure specification format) from a sample set of XML files?

Currently, the only automatic validation we have for an XML-encoded DSL is a legacy parser written in Perl, but for consistency reasons all Perl code must be ported to C#.

Odalisque answered 29/11, 2009 at 10:43 Comment(0)
L
7

http://www.stylusstudio.com/dtd_generator.html is actual software implementing a DTD generator.

http://www.pmg.csail.mit.edu/~chmoh/pubs/wecwis.pdf seems like a nice paper on the kind of thing you'd need, but I can't find (links to) actual code anywhere in the paper so far.

Here's another paper on this; again, no code to be found: http://www.softnet.tuc.gr/~minos/Papers/debull03.pdf.

Finally, I'd also suggest you look into using RELAX NG or Schematron to validate your XML instead. Those languages are much more expressive, making them easier to read and more powerful in the kinds of things you can validate. (Be sure to skip XML Schema, which is widely considered to be a mess.)

Lizliza answered 29/11, 2009 at 10:52 Comment(3)
Interesting document! Too bad they never released the code (in a way that I can find :-) - Odalisque
hitsw.com/xml_utilites promises to create a DTD/schema from a single XML file. - Odalisque
Re RELAX NG or Schematron: I'll see if validators exist for the .NET environment. Thanks for the tips! - Odalisque
E
8

You can use xsd.exe (part of Visual Studio) to generate an XML schema for a given XML file.

Elohist answered 1/12, 2009 at 16:22 Comment(0)
U
4

You can generate a schema online from just the XML data using the following link: http://www.xmlforasp.net/codebank/system_xml_schema/buildschema/buildxmlschema.aspx

Undermanned answered 18/9, 2013 at 6:58 Comment(0)
M
4

You can download JetBrains IDEA Community Edition, which is free. It has built-in tools for generating DTDs and schemas:

http://www.jetbrains.com/idea/webhelp/generating-dtd.html

Maybe not perfect but it is something.

Mastic answered 12/1, 2014 at 15:36 Comment(0)
M
3

Here is the program that worked for me: DTDGenerator. You need to compile it with Java, but it works well. I am surprised by the lack of free software for a language that has been around for a long time, but this one is free under the Mozilla Public License Version 1.0.

Moeller answered 6/8, 2013 at 22:56 Comment(2)
This question is quite old, but still seems to be relevant. A reply with current suggestions is welcome. - Odalisque
Thank you so much... really, there's a lack of free software! Thanks for your help!! :) - Hent
C
1

Altova's XMLSpy has a DTD/XML Schema generator.

The generated DTD/XML Schema usually requires a little tweaking. For example, the tool may enumerate a list of attributes or elements, when you "meant" for it to allow any value. You're only giving it a sample of your problem space, and it has to go from specific to general, though. For that reason, I don't get too bent out of shape when it fails to read my mind.

I consider the generated DTD or schema a starting point. It's better than rolling it by hand from zero, at least if you're starting with existing XML documents.

Even if you're not going to use the generated DTD, it's a pretty good way to get your head around the structure of a set of unfamiliar XML documents.
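
To make the specific-to-general point concrete, here is a minimal Python sketch of DTD inference from samples. It is not how XMLSpy or any other tool mentioned in this thread works; `infer_dtd` is a hypothetical helper written for illustration. It assumes no XML namespaces, declares every attribute as CDATA, and generalizes child elements to "any of these, in any order, any number of times", so the result is looser than a hand-written DTD would be.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def infer_dtd(xml_samples):
    """Return a loose DTD (as a string) that covers every sample document."""
    occurrences = Counter()             # element name -> times seen
    children = defaultdict(set)         # element name -> child names seen
    attr_counts = defaultdict(Counter)  # element name -> attribute -> times seen
    has_text = set()                    # elements that carried character data

    for sample in xml_samples:
        for elem in ET.fromstring(sample).iter():
            occurrences[elem.tag] += 1
            attr_counts[elem.tag].update(elem.attrib.keys())
            if (elem.text or "").strip():
                has_text.add(elem.tag)
            for child in elem:
                children[elem.tag].add(child.tag)
                # Text after a child still belongs to the parent (mixed content).
                if (child.tail or "").strip():
                    has_text.add(elem.tag)

    lines = []
    for name in sorted(occurrences):
        kids = sorted(children[name])
        if name in has_text and kids:
            model = "(" + " | ".join(["#PCDATA"] + kids) + ")*"
        elif name in has_text:
            model = "(#PCDATA)"
        elif kids:
            model = "(" + " | ".join(kids) + ")*"
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {name} {model}>")
        for attr in sorted(attr_counts[name]):
            # An attribute is required only if every occurrence carried it.
            always = attr_counts[name][attr] == occurrences[name]
            default = "#REQUIRED" if always else "#IMPLIED"
            lines.append(f"<!ATTLIST {name} {attr} CDATA {default}>")
    return "\n".join(lines)


samples = [
    '<order id="1"><item sku="a">Widget</item></order>',
    '<order id="2" rush="yes"><item sku="b">Gadget</item><note/></order>',
]
print(infer_dtd(samples))
```

With only two samples, the sketch marks `rush` as optional and `id` as required purely because of what it happened to see, which is exactly the kind of guess that needs tweaking afterwards.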

Clements answered 29/11, 2009 at 12:55 Comment(0)
F
1

The XMLMax editor will create an XSD from an XML file. The free trial (no registration, small download) will do this for you. If you want to do this in code, the .NET Framework has an XmlSchemaInference class that automatically creates an XSD from an XML file.

Fungible answered 30/11, 2009 at 0:29 Comment(0)
C
1

Just used http://www.freeformatter.com/xsd-generator.html to generate an XSD from an XML file. It also has a lot of other formatting options!

Cheekbone answered 29/7, 2015 at 7:39 Comment(0)
N
0

You may want to try Trang or Instance to Schema Tool (part of XMLBeans).

I tested both with a 1 GB XML file. Here are the results:

Trang:

max memory [kB] - 98,480
time [MM:SS] - 0:24

Instance to Schema Tool:

max memory [kB] - 5,993,240
time [MM:SS] - 7:36

Nidus answered 4/5, 2012 at 15:57 Comment(0)
