I have an XML file (e.g. jerry.xml) which contains some data as given below.
<data>
<country name="Peru">
<rank updated="yes">2</rank>
<language>english</language>
<currency>1.21$/kg</currency>
<gdppc month="06">141100</gdppc>
<gdpnp month="10">2.304e+0150</gdpnp>
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
<country name="Singapore">
<rank updated="yes">5</rank>
<language>english</language>
<currency>4.1$/kg</currency>
<gdppc month="05">59900</gdppc>
<gdpnp month="08">1.9e-015</gdpnp>
<neighbor name="Malaysia" direction="N"/>
</country>
I extracted the full paths of some selected texts from the xml above using the code below. The reasons are given in this post.
def extractNumbers(path, node):
nums = []
if 'month' in node.attrib:
if node.attrib['month'] in ['05', '06']:
return nums
path += '/' + node.tag
if 'name' in node.keys():
path += '=' + node.attrib['name']
elif 'year' in node.keys():
path += ' ' + 'month' + '=' + node.attrib['month']
try:
num = float(node.text)
nums.append( (path, num) )
except (ValueError, TypeError):
pass
for e in list(node):
nums.extend( extractNumbers(path, e) )
return nums
tree = ET.parse('jerry.xml')
nums = extractNumbers('', tree.getroot())
print len(nums)
print nums
This gives me the location of the elements I need to change as shown in colomn 1 of the csv below (e.g. hrong.csv).
Path Text1 Text2 Text3 Text4 Text5
'/data/country name=singapore/gdpnp month=08'; 5.2e-015; 2e-05; 8e-06; 9e-04; 0.4e-05;
'/data/country name=peru/gdppc month=06'; 0.04; 0.02; 0.15; 3.24; 0.98;
I would like to replace the text of the elements of the original XML file (jerry.xml) by those in column 2 of the hrong.csv above, based on the location of the elements in column 1.
I am a newbie to python and realize I might not be using the best approach. I would appreciate any help regards direction wrt this. I basically need to parse only some selected texts nodes of an xml file, modify the selected text nodes and save each file.
Thanks