Parsing XML Attributes with Boost

I'm having an issue processing the attributes of XML elements in C++ with the Boost libraries (version 1.52.0). Given the following code:

#define ATTR_SET ".<xmlattr>"
#define XML_PATH1 "./pets.xml"

#include <iostream>
#include <string>
#include <boost/foreach.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>

using namespace std;
using namespace boost;
using namespace boost::property_tree;

const ptree& empty_ptree(){
    static ptree t;
    return t;
}

int main() {
    ptree tree;
    read_xml(XML_PATH1, tree);
    const ptree & formats = tree.get_child("pets", empty_ptree());
    BOOST_FOREACH(const ptree::value_type & f, formats){
        string at = f.first + ATTR_SET;
        const ptree & attributes = formats.get_child(at, empty_ptree());
        cout << "Extracting attributes from " << at << ":" << endl;
        BOOST_FOREACH(const ptree::value_type &v, attributes){
            cout << "First: " << v.first.data() << " Second: " << v.second.data() << endl;
        }
    }
}

Let's say I have the following XML structure:

<?xml version="1.0" encoding="utf-8"?>
<pets>
    <cat name="Garfield" weight="4Kg">
        <somestuff/>
    </cat>
    <dog name="Milu" weight="7Kg">
        <somestuff/>
    </dog>
    <bird name="Tweety" weight="0.1Kg">
        <somestuff/>
    </bird>
</pets>

The console output is then:

Extracting attributes from cat.<xmlattr>:
First: name Second: Garfield
First: weight Second: 4Kg
Extracting attributes from dog.<xmlattr>:
First: name Second: Milu
First: weight Second: 7Kg
Extracting attributes from bird.<xmlattr>:
First: name Second: Tweety
First: weight Second: 0.1Kg

However, if I use a common element name for every child of the root node (and identify each one by its attributes instead), the result changes completely. The XML file in that case might be:

<?xml version="1.0" encoding="utf-8"?>
<pets>
    <pet type="cat" name="Garfield" weight="4Kg">
        <somestuff/>
    </pet>
    <pet type="dog" name="Milu" weight="7Kg">
        <somestuff/>
    </pet>
    <pet type="bird" name="Tweety" weight="0.1Kg">
        <somestuff/>
    </pet>
</pets>

And the output would be the following:

Extracting attributes from pet.<xmlattr>:
First: type Second: cat
First: name Second: Garfield
First: weight Second: 4Kg
Extracting attributes from pet.<xmlattr>:
First: type Second: cat
First: name Second: Garfield
First: weight Second: 4Kg
Extracting attributes from pet.<xmlattr>:
First: type Second: cat
First: name Second: Garfield
First: weight Second: 4Kg

It seems the number of elements hanging from the root node is being properly recognized since three sets of attributes have been printed. Nevertheless, all of them refer to the attributes of the very first element...

I'm not an expert in C++ and I'm really new to Boost, so I may be missing something about how the tree maps keys to nodes. Any advice will be much appreciated.

Culdesac answered 23/12, 2012 at 11:9 Comment(1)
This is an old question, but I have one query: do you know how to get the encoding value from the ptree for the above XML? - Diffident

The problem with your program is located in this line:

const ptree & attributes = formats.get_child(at, empty_ptree());

With this line you ask pets for the child at the path pet.<xmlattr>, and you do so three times regardless of which f you are currently visiting. The path is resolved from pets each time, so it lands on the same pet node on every iteration. Following this article, I'd guess that what you need to use is:

const ptree & attributes = f.second.get_child("<xmlattr>", empty_ptree());

The full code, which works with both of your XML files, is:

#define ATTR_SET ".<xmlattr>"
#define XML_PATH1 "./pets.xml"

#include <iostream>
#include <string>
#include <boost/foreach.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>

using namespace std;
using namespace boost;
using namespace boost::property_tree;

const ptree& empty_ptree(){
    static ptree t;
    return t;
}

int main() {
    ptree tree;
    read_xml(XML_PATH1, tree);
    const ptree & formats = tree.get_child("pets", empty_ptree());
    BOOST_FOREACH(const ptree::value_type & f, formats){
        string at = f.first + ATTR_SET;
        const ptree & attributes = f.second.get_child("<xmlattr>", empty_ptree());
        cout << "Extracting attributes from " << at << ":" << endl;
        BOOST_FOREACH(const ptree::value_type &v, attributes){
            cout << "First: " << v.first.data() << " Second: " << v.second.data() << endl;
        }
    }
}
Sadonia answered 23/12, 2012 at 12:2 Comment(1)
Thanks a lot llonesmiz, that worked perfectly for me. Best regards! - Culdesac

Without ever having used this feature, I suspect that the boost::property_tree XML parser isn't a general-purpose XML parser, but expects a certain schema in which each property has exactly one specific tag.

If you want to work with XML beyond the capabilities of boost::property_tree, you might prefer another XML parser that can handle arbitrary XML schemas. Have a look at, e.g., Xerces C++ or Poco XML.

Segarra answered 23/12, 2012 at 11:49 Comment(2)
Both libraries sound interesting. I'll probably give them a try at some point in the next few weeks. Thanks g-makulik! - Culdesac
BTW, have you ever tried TinyXML? What do you think about it? - Culdesac

File to be parsed, pets.xml

<pets>
    <pet type="cat" name="Garfield" weight="4Kg">
        <something name="test" value="*"/>
        <something name="demo" value="@"/>
    </pet>
    <pet type="dog" name="Milu" weight="7Kg">
        <something name="test1" value="$"/>
    </pet>
    <birds type="parrot">
        <bird name="african grey parrot"/>
        <bird name="amazon parrot"/>
    </birds>
</pets>

code:

// DemoPropertyTree.cpp : Defines the entry point for the console application.
//Prerequisite boost library

#include "stdafx.h"
#include <boost/property_tree/xml_parser.hpp>
#include <boost/property_tree/ptree.hpp>
#include <boost/foreach.hpp>
#include<iostream>
using namespace std;
using namespace boost;
using namespace boost::property_tree;

void processPet(const ptree& subtree) // take the subtree by const reference to avoid copying it
{
    BOOST_FOREACH(ptree::value_type petChild,subtree.get_child(""))
    {
        //processing attributes of element pet
        if(petChild.first=="<xmlattr>")
        {
            BOOST_FOREACH(ptree::value_type petAttr,petChild.second.get_child(""))
            {
                cout<<petAttr.first<<"="<<petAttr.second.data()<<endl;
            }
        }
        //processing child element of pet(something)
        else if(petChild.first=="something")
        {
            BOOST_FOREACH(ptree::value_type somethingChild,petChild.second.get_child(""))
            {
                //processing attributes of element something
                if(somethingChild.first=="<xmlattr>")
                {
                    BOOST_FOREACH(ptree::value_type somethingAttr,somethingChild.second.get_child(""))
                    {
                        cout<<somethingAttr.first<<"="<<somethingAttr.second.data()<<endl;
                    }
                }
            }
        }
    }
}
void processBirds(const ptree& subtree) // take the subtree by const reference to avoid copying it
{
    BOOST_FOREACH(ptree::value_type birdsChild,subtree.get_child(""))
    {
        //processing attributes of element birds
        if(birdsChild.first=="<xmlattr>")
        {
            BOOST_FOREACH(ptree::value_type birdsAttr,birdsChild.second.get_child(""))
            {
                cout<<birdsAttr.first<<"="<<birdsAttr.second.data()<<endl;
            }
        }
        //processing child element of birds(bird)
        else if(birdsChild.first=="bird")
        {
            BOOST_FOREACH(ptree::value_type birdChild,birdsChild.second.get_child(""))
            {
                //processing attributes of element bird
                if(birdChild.first=="<xmlattr>")
                {
                    BOOST_FOREACH(ptree::value_type birdAttr,birdChild.second.get_child(""))
                    {
                        cout<<birdAttr.first<<"="<<birdAttr.second.data()<<endl;
                    }
                }
            }
        }
    }
}
int _tmain(int argc, _TCHAR* argv[])
{

    const std::string XML_PATH1 = "C:/Users/10871/Desktop/pets.xml";
    ptree pt1;
    boost::property_tree::read_xml( XML_PATH1, pt1  );
     cout<<"********************************************"<<endl;
    BOOST_FOREACH( ptree::value_type const& topNodeChild, pt1.get_child( "pets" ) ) 
    {
        const ptree& subtree = topNodeChild.second; // bind a reference instead of copying the node
        if( topNodeChild.first == "pet" ) 
        {
             processPet(subtree);
             cout<<"********************************************"<<endl;
        }
        else if(topNodeChild.first=="birds")
        {
            processBirds(subtree);
             cout<<"********************************************"<<endl;
        }

    }
    getchar();
    return 0;
}

The output is shown here: output

Alloplasm answered 16/5, 2019 at 3:36 Comment(0)
