How to set up catalog files for xmllint?

I want to set up catalog files for xmllint so that elements in the dcterms XML namespace are validated against a local copy of the schema. I believe I have done everything right, but it doesn't seem to work.

I am running OSX.

I have created a directory /etc/xml

$ mkdir /etc/xml
$ cd /etc/xml

I have downloaded dcterms.xsd to that directory

$ ls -l
-rw-r--r--  1 ibis  wheel  12507 24 Jul 11:42 dcterms.xsd

I have created a file named "catalog"

$ xmlcatalog --create > catalog

I have added the dcterms namespace to the catalog file

$ xmlcatalog --noout --add uri http://purl.org/dc/elements/1.1/ file:///etc/xml/dc.xsd catalog
$ xmlcatalog --noout --add uri http://purl.org/dc/terms/ file:///etc/xml/dcterms.xsd catalog
$ cat catalog
<?xml version="1.0"?>
<!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN" "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <uri name="http://purl.org/dc/elements/1.1/" uri="file:///etc/xml/dc.xsd"/>
  <uri name="http://purl.org/dc/terms/" uri="file:///etc/xml/dcterms.xsd"/>
</catalog>
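
For reference, a minimal sketch of the same steps with the catalog file named explicitly on every command, followed by a lookup to check that the entry is really there. The scratch directory is an assumption made so that nothing under /etc/xml is touched; the namespace and target URI are the ones used above.

```shell
# Assumes libxml2's xmlcatalog is on PATH; stop early otherwise.
command -v xmlcatalog >/dev/null 2>&1 || { echo "xmlcatalog not installed; skipping" >&2; exit 0; }

# Hypothetical scratch directory, so nothing under /etc/xml is modified.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# With --noout, --create writes the empty catalog to the named file.
xmlcatalog --noout --create catalog

# The trailing "catalog" argument is the file that --add updates in place.
xmlcatalog --noout --add uri "http://purl.org/dc/terms/" "file:///etc/xml/dcterms.xsd" catalog

# Look the name up again to confirm the entry is there.
xmlcatalog catalog "http://purl.org/dc/terms/"
```

If the lookup reports no entry, the catalog file itself is the problem. If it prints the file URI, the catalog is fine and the question becomes whether xmllint ever asks it for that name.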

In a work directory, I have created a simple xml schema named Empty.xsd

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.example.org/Empty" xmlns:tns="http://www.example.org/Empty" elementFormDefault="qualified">
  <element name="empty">
    <complexType>
      <sequence>
        <any processContents="strict" minOccurs="0" maxOccurs="unbounded"/>
      </sequence>
      <anyAttribute></anyAttribute>
    </complexType>
  </element>
</schema>

Note that the processContents is "strict".

I have created an XML file which should trigger all the validation:

<?xml version="1.0" encoding="UTF-8"?>
<empty xmlns="http://www.example.org/Empty" 
          xmlns:dcterms="http://purl.org/dc/terms/">
    <dcterms:title>A title</dcterms:title>
</empty>

Then I attempt to validate it.

$ xmllint --noout --valid --schema Empty.xsd Empty.xml
Empty.xml:2: validity error : Validation failed: no DTD found !
y xmlns="http://www.example.org/Empty" xmlns:dcterms="http://purl.org/dc/terms/"
                                                                               ^
Empty.xml:3: element title: Schemas validity error : Element '{http://purl.org/dc/terms/}title': No matching global element declaration available, but demanded by the strict wildcard.
Empty.xml fails to validate

I have set up a catalog as specified in the docs and pointed it at the local dcterms schema file. Why does xmllint fail to find it?

Tupelo answered 24/7, 2012 at 2:41 Comment(0)

xmllint does not auto-load XSD files based on the xmlns="something" attributes it finds in the XML file being parsed. It only uses the XSD given in the --schema parameter (and any schemas imported or included from it).

For the test, you could create a NonEmpty.xsd like this:

<?xml version="1.0" encoding="UTF-8"?>
<schema 
    xmlns="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.example.org/Empty"
    elementFormDefault="qualified">
  <include schemaLocation="Empty.xsd"/>
  <import schemaLocation="dcterms.xsd" namespace="http://purl.org/dc/terms/"/>
</schema>

Usage:

$ xmllint -debugent -noout -schema NonEmpty.xsd Empty.xml
new input from file: NonEmpty.xsd
new input from file: Empty.xsd
new input from file: dcterms.xsd
new input from file: http://www.w3.org/2001/03/xml.xsd
new input from file: dc.xsd
new input from file: dcmitype.xsd
new input from file: Empty.xml
Empty.xml validates

Now with catalog file:

<?xml version="1.0"?>
<!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN" "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <uri name="http://www.w3.org/2001/03/xml.xsd"          uri="file:///home/zsiga/proba/dcterms/2001_03_xml.xsd"/>
  <uri name="http://dublincore.org/schemas/xmls/qdc/dcterms.xsd" uri="file:///home/zsiga/proba/dcterms/dcterms.xsd"/>
</catalog>

Here's the NonEmpty2.xsd file:

<?xml version="1.0" encoding="UTF-8"?>
<schema 
    xmlns="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.example.org/Empty"
    elementFormDefault="qualified">
  <include schemaLocation="Empty.xsd"/>
  <import schemaLocation="http://dublincore.org/schemas/xmls/qdc/dcterms.xsd" namespace="http://purl.org/dc/terms/"/>
</schema>

And its usage:

$ XML_CATALOG_FILES=./catalog xmllint -debugent -noout \
    -schema NonEmpty2.xsd Empty.xml
new input from file: NonEmpty2.xsd
new input from file: Empty.xsd
new input from file: file:///home/zsiga/proba/dcterms/dcterms.xsd
new input from file: file:///home/zsiga/proba/dcterms/2001_03_xml.xsd
new input from file: file:///home/zsiga/proba/dcterms/dc.xsd
new input from file: file:///home/zsiga/proba/dcterms/dcmitype.xsd
new input from file: Empty.xml
Empty.xml validates

--- Edit 2020.11.02. ---

I would like to suggest using the <system> element (with its systemId attribute) in the catalog, together with relative path names:

<?xml version="1.0"?>
<!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN" "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <system systemId="http://www.w3.org/2001/03/xml.xsd"                  uri="2001_03_xml.xsd"/>
  <system systemId="http://dublincore.org/schemas/xmls/qdc/dcterms.xsd" uri="dcterms.xsd"/>
</catalog>

The result is the same, but some programs prefer <system> over <uri>. Relative path names (relative to the location of the catalog file) might also be easier to handle.
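
A rough way to check how a relative uri is resolved is to ask xmlcatalog for the mapping from a different working directory. This is only a sketch: the scratch directory is hypothetical, the catalog is cut down to one entry, and the DOCTYPE line is left out for brevity.

```shell
# Assumes libxml2's xmlcatalog is on PATH; stop early otherwise.
command -v xmlcatalog >/dev/null 2>&1 || { echo "xmlcatalog not installed; skipping" >&2; exit 0; }

# Hypothetical scratch directory standing in for the directory that holds
# the catalog and the downloaded schema files.
workdir=$(mktemp -d)

# One-entry catalog with a relative uri.
cat > "$workdir/catalog" <<'EOF'
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <system systemId="http://dublincore.org/schemas/xmls/qdc/dcterms.xsd" uri="dcterms.xsd"/>
</catalog>
EOF

# Run the lookup from another directory. The printed location should sit
# next to the catalog file rather than under the current directory.
resolved=$(cd / && xmlcatalog "$workdir/catalog" "http://dublincore.org/schemas/xmls/qdc/dcterms.xsd")
echo "$resolved"
```

The lookup only shows what the catalog would answer; it does not check that dcterms.xsd actually exists at that location.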

Iq answered 11/10, 2018 at 13:28 Comment(0)
  1. I don't have a title element in dcterms, so I replaced it with abstract.
  2. I can't find any confirmation, but other people also report problems with using catalog files for XSD schemas in libxml. Catalogs worked fine for DTDs, though.
  3. There is a workaround. Insert <import namespace="http://purl.org/dc/terms/" schemaLocation="dcterms.xsd" /> into Empty.xsd. After that, the "No matching global element declaration" message went away.
  4. The "No DTD found" message is still printed, but the return code changed from 3 to 4, which I take to mean a more or less successful parse.
  5. EDIT: the --sax switch seems to help with the "No DTD found" message.
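
On points 4 and 5: the "no DTD found" message comes from the --valid option, which asks xmllint for DTD validation in addition to the schema check. A minimal sketch of the difference, using a hypothetical scratch directory and made-up Tiny.xsd / Tiny.xml files rather than the files from the question:

```shell
# Assumes libxml2's xmllint is on PATH; stop early otherwise.
command -v xmllint >/dev/null 2>&1 || { echo "xmllint not installed; skipping" >&2; exit 0; }

# Hypothetical scratch directory with a minimal schema and instance.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > Tiny.xsd <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.example.org/Empty"
        elementFormDefault="qualified">
  <element name="empty"/>
</schema>
EOF

cat > Tiny.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<empty xmlns="http://www.example.org/Empty"/>
EOF

# With --valid, xmllint also attempts DTD validation and complains that
# the document has no DTD, even though the schema check itself passes.
xmllint --noout --valid --schema Tiny.xsd Tiny.xml
echo "with --valid: exit status $?"

# Without --valid, only the schema named by --schema is checked.
xmllint --noout --schema Tiny.xsd Tiny.xml
status_schema_only=$?
echo "without --valid: exit status $status_schema_only"
```

So for a document without a DOCTYPE, simply leaving out --valid is probably the cleaner fix.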

Related question (unanswered): "Xml validation with schema header and Catalog lookup". It concerns point 2.

Flavourful answered 5/9, 2012 at 16:57 Comment(1)
This answer is a mess. I had to read it like 10 times in order to understand the essence, and I might still have missed something. - Infuscate
