tag0 namespace added for elements in default namespace

3

18

I'm trying to parse and modify a Maven pom.xml using Groovy's XmlSlurper. My pom.xml declares a default namespace and the xsi namespace.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" 
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
 xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
     http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>a-group-id</groupId>
<artifactId>an-artifact-id</artifactId>

My Groovy source is as follows:

import groovy.xml.XmlUtil
def pom = new XmlSlurper().parse('pom.xml')
   .declareNamespace('': 'http://maven.apache.org/POM/4.0.0',
      xsi: 'http://www.w3.org/2001/XMLSchema-instance')
//manipulate the pom
println XmlUtil.serialize(pom)

As you can see, I've declared the first namespace with an empty prefix. However, in the output the prefix tag0 is added everywhere.

<?xml version="1.0" encoding="UTF-8"?>
<tag0:project xmlns:tag0="http://maven.apache.org/POM/4.0.0"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
      http://maven.apache.org/maven-v4_0_0.xsd">
<tag0:modelVersion>4.0.0</tag0:modelVersion>
<tag0:groupId>a-group-id</tag0:groupId>
<tag0:artifactId>an-artifact-id</tag0:artifactId>

How can I avoid that?

For the moment, my workaround is to strip the prefix manually:

println XmlUtil.serialize(pom).replaceAll('tag0:', '').replaceAll(':tag0', '')
Genera answered 8/2, 2012 at 16:39 Comment(5)
Is constructing the XmlSlurper with no namespace support good enough? i.e.: println XmlUtil.serialize( new XmlSlurper( false, false ).parse( 'pom.xml' ) )? Pamplona
Wow, yes, that was already enough, thank you Tim. Can you post it as an answer? I've also noticed that all comments in the XML are lost; do you know of a workaround for that? BTW, here are the two utilities I wrote: pomRm and pomVersions. Genera
Can't see how to keep comments at the moment... :-( I'll have a think if I get a free moment this afternoon... Pamplona
It could be a limitation of XmlSlurper; the book "Groovy and Grails Recipes" says "XmlSlurper is mainly intended for read-only operations". Maybe I should try XmlParser. However, you've already answered my original question, so when you have time, post the reply you gave in the comments as an answer and I will accept it. Genera
Added it as an answer... I found something on the mailing list that extends XmlParser and seems to be heading in the direction we want, but so far no joy getting it to work :-/ Not sure if it's a parsing problem or a serialization one... Pamplona
25

You can construct the XmlSlurper with no namespace awareness like so:

import groovy.xml.XmlUtil

def pom = new XmlSlurper( false, false ).parse( 'pom.xml' )
println XmlUtil.serialize(pom)

This should give you the output you want. I currently have no idea how to preserve comments during the slurp/serialize cycle :-(

As you say, it might be possible with XmlParser, but my attempts so far have failed :-( There's some code here which might get you close, but as yet I've had no success :-(
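
To make the round trip concrete, here is a minimal sketch of the non-namespace-aware approach with one modification applied before serializing. The inlined POM text and the new artifact id are made up for illustration, and the `groovy.xml.XmlSlurper` import assumes Groovy 3 or later (on Groovy 2.x the class lives in `groovy.util`, which is imported by default, so that line can be dropped).

```groovy
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; drop this line on Groovy 2.x
import groovy.xml.XmlUtil

// Hypothetical minimal POM, inlined so the sketch is self-contained
def pomText = '''<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>a-group-id</groupId>
  <artifactId>an-artifact-id</artifactId>
</project>'''

// validating = false, namespaceAware = false:
// the xmlns declarations are treated as ordinary attributes
def pom = new XmlSlurper(false, false).parseText(pomText)

// XmlSlurper applies edits lazily; they show up when the tree is serialized
pom.artifactId = 'renamed-artifact'

def output = XmlUtil.serialize(pom)
println output
```

Because the parser never resolves namespaces, the serializer has no namespace URI to invent a prefix for, which is why no tag0 should appear. The trade-off is that namespace-qualified lookups are not available on the parsed tree.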

Pamplona answered 9/2, 2012 at 16:6 Comment(2)
Thank you Tim, this works. I will look into the comments issue at the weekend. Genera
CommentCollectingParser works for finding the comments preceding a node. For instance, to print a comment: def parser = new CommentCollectingParser(); def root = parser.parse(new File('plan.xml')); println parser.commentsFor(root.week[0]); However, if I try to print the whole XML, the comments are not included: def writer = new StringWriter(); new XmlNodePrinter(new PrintWriter(writer)).print(root); println writer.toString(); In fact, the XmlParser JavaDoc says: "This parser ignores comments and processing instructions". Genera
4

I had the same issue with "tag0" getting added to elements that didn't define a namespace (i.e. they were in the "no namespace" namespace). I fixed this by adding

declareNamespace('': '')

which resets elements from being in the default namespace to being in the "no namespace" namespace.

Stereophonic answered 14/2, 2014 at 17:12 Comment(0)
1

I found that it is better to use XmlParser rather than XmlSlurper if you are dealing with namespaces and running into the tag0 problem. Syntactically they look the same, e.g.:

import groovy.xml.XmlUtil

def root = new XmlParser().parse(new File('example.xml'))
println XmlUtil.serialize(root)

The above code outputs example.xml with its namespaces intact, without the tag0 prefix.

If you want to process the root in some way, e.g. find a specific node using the Groovy API and output the result, you can do this:

def root = new XmlParser().parse(new File('example.xml'))
// look up the first child element named with the "ns" prefix
def result = root."ns:Element"[0]
println XmlUtil.serialize(result)
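
For elements in a default namespace there is no prefix to put in the quoted name, so the lookup can instead be keyed on the namespace URI. The following is a sketch of that variant using `groovy.xml.Namespace`; the inlined POM and the new artifact id are made up for illustration, and the `groovy.xml.XmlParser` import assumes Groovy 3 or later (drop it on Groovy 2.x).

```groovy
import groovy.xml.Namespace
import groovy.xml.XmlParser    // assumption: Groovy 3+; drop this line on Groovy 2.x
import groovy.xml.XmlUtil

// Hypothetical minimal POM, inlined so the sketch is self-contained
def pomText = '''<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>a-group-id</groupId>
  <artifactId>an-artifact-id</artifactId>
</project>'''

// XmlParser is namespace-aware by default, so element names are QNames
def root = new XmlParser().parseText(pomText)

// Namespace builds QNames for the given URI: pomNs.artifactId is a QName
def pomNs = new Namespace('http://maven.apache.org/POM/4.0.0')

// Read a value through a namespace-qualified lookup
def groupId = root[pomNs.groupId].text()
println "groupId = ${groupId}"

// Modify the first matching node in place, then serialize the whole tree
def artifact = root[pomNs.artifactId][0]
artifact.setValue('renamed-artifact')

def output = XmlUtil.serialize(root)
println output
```

Unlike XmlSlurper, XmlParser builds a mutable tree of Node objects, so the change is applied immediately rather than at serialization time.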
Mithras answered 19/5, 2012 at 10:17 Comment(0)
