How to replace XML node contents using Nokogiri

I'm using Ruby to read an XML document and update a single node, if it exists, with a new value.

The tutorial at http://www.nokogiri.org/tutorials/modifying_an_html_xml_document.html does not make it obvious to me how to change the node data, let alone how to save it back to the file.

def ammend_parent_xml(folder, target_file, new_file)
  # open parent XML file that contains file reference
  get_xml_files = Dir.glob("#{@target_folder}/#{folder}/*.xml").sort.select {|f| !File.directory? f}
  get_xml_files.each { |xml|

    f       = File.open(xml)

    # Use Nokogiri to read the file into an XML object
    doc     = Nokogiri::XML(f)
    # XML names are case-sensitive: the sample document uses <Filename>
    filename  = doc.xpath('//Route//To//Node//Filename')

    filename.each_with_index {
      |fl, i|
      if target_file == fl.text
        # we found the file, now rename it to new_file
        # ???????
      end

    }

  }

end

This is some example XML:

<?xml version="1.0" encoding="utf-8">
    <my_id>123</my_id>
    <Route>
      <To>
        <Node>
          <Filename>file1.txt</Filename>
          <Filename>file2.mp3</Filename>
          <Filename>file3.doc</Filename>
          <Filename>file4.php</Filename>
          <Filename>file5.jpg</Filename>
        </Node>
      </To>
    </Route>
</xml>

I want to change "file3.doc" to "file3_new.html".

I would call:

ammend_parent_xml("folder_location", "file3.doc", "file3_new.html")
Inositol answered 20/2, 2015 at 10:42 Comment(0)
def amend_parent_xml(folder, target_file, new_file)
  Dir["#{@target_folder}/#{folder}/*.xml"]
  .sort.select{|f| !File.directory? f }
  .each do |xml_file|
    doc = Nokogiri.XML( File.read(xml_file) )
    if file = doc.at("//Route//To//Node//Filename[.='#{target_file}']")
      file.content = new_file # set the text of the node
      File.open(xml_file,'w'){ |f| f<<doc }
      break
    end
  end
end

Improvements:

  • Uses File.read instead of File.open so that you don't leave a file handle open.
  • Uses an XPath expression to find the SINGLE matching node by looking for a node with the correct text value.
    • Alternatively, you could find all the Filename nodes and then use if file = files.find{ |f| f.text == target_file }
  • Shows how to serialize a Nokogiri::XML::Document back to disk.
  • Breaks out of processing the files as soon as it finds a matching XML file.
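
One thing to watch with the interpolated XPath above: XPath 1.0 string literals have no escape character, so a target_file containing a single quote would break the expression. A minimal sketch of a workaround follows. The helper name xpath_literal is hypothetical (it is not part of Nokogiri); it only builds the quoted literal, falling back to XPath's concat() when the value contains both kinds of quote.

```ruby
# Build an XPath 1.0 string literal for an arbitrary Ruby string.
# xpath_literal is a hypothetical helper name, not a Nokogiri API.
def xpath_literal(value)
  # No single quote in the value: wrap it in single quotes.
  return "'#{value}'" unless value.include?("'")
  # Single quotes but no double quotes: wrap it in double quotes.
  return "\"#{value}\"" unless value.include?('"')

  # Both quote types: split on single quotes and glue the pieces back
  # together with concat(), quoting each lone ' with double quotes.
  parts = value.split("'", -1).map { |part| "'#{part}'" }
  "concat(#{parts.join(%q{, "'", })})"
end

plain = xpath_literal("file3.doc")
mixed = xpath_literal("a'b\"c")
```

It would then be used as doc.at("//Route//To//Node//Filename[.=#{xpath_literal(target_file)}]"). The files.find alternative in the list above avoids the problem entirely, since the comparison happens in Ruby rather than in XPath.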
Corso answered 20/2, 2015 at 20:25 Comment(1)
OMG, what a great way to come into work on a Monday.. this is great and now I'm a little more clued up on how Nokogiri works!! (and I know how to spell Amend now too :p ) - Inositol

To change an element in the XML:

@doc = Nokogiri::XML::DocumentFragment.parse <<-EOXML
<body>
  <h1>OLD_CONTENT</h1>
  <div>blah</div>
</body>
EOXML


h1 = @doc.at_xpath "body/h1"
h1.content = "NEW_CONTENT"

puts @doc.to_xml   #h1 will be NEW_CONTENT

To save the XML:

file = File.new("xml_file.xml", "wb")
file.write(@doc)
file.close
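
The same save can be written in block form, which closes the handle even if the write raises. This is only a sketch: xml_text is a stand-in for @doc.to_xml, and a temporary directory is used so that nothing real gets overwritten.

```ruby
require "tmpdir"

# Stand-in for @doc.to_xml; any string of serialized XML would do.
xml_text = "<body>\n  <h1>NEW_CONTENT</h1>\n</body>\n"

saved = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, "xml_file.xml")
  # Text mode ("w"); the block form closes the file automatically.
  File.open(path, "w") { |f| f.write(xml_text) }
  # Read the file back to confirm what was written.
  saved = File.read(path)
end
```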

There are a few things wrong with your sample XML.

  • There are two root elements, my_id and Route.
  • The ? is missing from the end of the XML declaration on the first line.
  • Do you need the last line </xml>?

After fixing the sample I was able to get the element by using the example by Phrogz:

element = @doc.xpath("Route//To//Node//Filename[.='#{target_file}']").first

Note the .first, since xpath returns a NodeSet.

Then I would update the content with:

element.content = "foobar"
Surrey answered 20/2, 2015 at 14:24 Comment(4)
OK, but in my example, how does your code know which //Route//To//Node//FileName to update? There are several in my XML, which is why I'm looping through them all. - Inositol
If you are looking for a particular element in an XML document, I would use XPath. Can you post an example XML document? - Surrey
No need for binary mode when saving XML text files. Unless you wanted incorrect line endings on Windows? - Corso
I added some more info to my answer. - Surrey