boost::property_tree XML pretty printing

I'm using boost::property_tree to read and write XML configuration files in my application. But when I write the file, the output looks ugly, with lots of empty lines. The file is supposed to be edited by humans too, so I'd like cleaner output.

As an example, I wrote a small test program:

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>

int main( void )
{
    using boost::property_tree::ptree;
    ptree pt;

    // reading file.xml
    read_xml("file.xml", pt);

    // writing the unchanged ptree in file2.xml
    boost::property_tree::xml_writer_settings<char> settings('\t', 1);
    write_xml("file2.xml", pt, std::locale(), settings);

    return 0;
}

file.xml contains:

<?xml version="1.0" ?>
<config>
    <net>
        <listenPort>10420</listenPort>
    </net>
</config>

After running the program, file2.xml contains:

<?xml version="1.0" encoding="utf-8"?>
<config>



    <net>



        <listenPort>10420</listenPort>
    </net>
</config>

Is there a way to have a better output, other than going manually through the output and deleting empty lines?

Tango answered 4/7, 2011 at 14:5 Comment(2)
boost::property_tree uses an XML parser called RapidXML (rapidxml.sourceforge.net). Both boost::property_tree and RapidXML are maintained by Marcin Kalicinski. I suggest you contact him directly. You can find his mail address on the RapidXML home page. - Unconscionable
Thanks ildjarn for the edit, but the empty lines are there for a reason! By the way, I've asked the maintainer; I'll post the answer if there is one. - Tango

The solution was to add the trim_whitespace flag to the call to read_xml:

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>

int main( void )
{
    // Create an empty property tree object
    using boost::property_tree::ptree;
    ptree pt;

    // reading file.xml
    read_xml("file.xml", pt, boost::property_tree::xml_parser::trim_whitespace );

    // writing the unchanged ptree in file2.xml
    // (newer Boost releases need xml_writer_settings<std::string> here; see the comments)
    boost::property_tree::xml_writer_settings<char> settings('\t', 1);
    write_xml("file2.xml", pt, std::locale(), settings);

    return 0;
}

The flag is documented here, and the current maintainer of the library (Sebastian Redl) was kind enough to answer and point me to it.

Tango answered 7/7, 2011 at 17:7 Comment(5)
Warning: trim_whitespace not only trims whitespace between XML elements, but also whitespace inside any element that doesn't contain other elements: <a>xx </a> is read as if it were <a>xx</a>. - Freckly
It is strange that one needs to change the read settings to get this (especially after @AndreasHaferburg's comment). Anyway, in the current version of Boost one needs to use xml_writer_settings<std::string> (not char). - Urgency
Updated "here" link: boost.org/doc/libs/1_58_0/doc/html/boost/property_tree/… - Urgency
There's a gotcha with using trim_whitespace when reading: it does more than trim leading and trailing whitespace; it also collapses multiple spaces into a single space. For example, <a>BEGINxxxxEND</a> (where x is a space character) gets collapsed to <a>BEGINxEND</a>. This is because trim_whitespace is expanded internally to parse_normalize_whitespace | parse_trim_whitespace in xml_parser_read_rapidxml.hpp. We ended up having to hack Boost to add a new flag that disables the normalization, because otherwise it broke round-tripping of data in our application. - Desk
This almost worked for me as is. I had to use different "settings" to make it work: const xml_writer_settings< typename Ptree::key_type > settings('\t', 1);. - Photograph
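
To make the two gotchas from the comments above concrete, here is a minimal standard-library sketch that mimics the described effect on element text. It is an illustration only, not RapidXML's or Boost's actual code: the function name trim_and_normalize is hypothetical, and the whitespace set (space, tab, CR, LF) is an assumption.

```cpp
#include <string>

// Assumed whitespace set: space, tab, carriage return, line feed.
static bool is_ws(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

// Hypothetical helper that imitates what the comments report for
// trim_whitespace: leading and trailing whitespace is dropped, and every
// internal run of whitespace becomes a single space.
std::string trim_and_normalize(const std::string& in)
{
    std::string out;
    bool pending_space = false;
    for (char c : in) {
        if (is_ws(c)) {
            // Remember the run, but only if some text has been emitted
            // already (this is what drops leading whitespace).
            pending_space = !out.empty();
        } else {
            // Emit one space for the whole preceding run, then the character.
            if (pending_space) out += ' ';
            pending_space = false;
            out += c;
        }
    }
    // A pending run at the end is never emitted, which drops trailing whitespace.
    return out;
}
```

Under these assumptions, "xx " comes back as "xx" and "BEGIN    END" comes back as "BEGIN END", which is why data that depends on exact spacing does not survive a round trip with the flag enabled.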

This question is quite old, but I recently investigated the problem again because it has got a lot worse: property_tree now translates newlines to

&#10;    

In my opinion this is a bug, because elements that contain only whitespace (newlines, spaces and tabs) are treated as text elements. trim_whitespace is only a band-aid, and it normalizes ALL whitespace in the property_tree.

I reported the bug here and attached a .diff that fixes this behaviour in Boost 1.59 for the case where trim_whitespace is not used: https://svn.boost.org/trac/boost/ticket/11600

Warbler answered 3/9, 2015 at 8:15 Comment(0)

For those trying:

boost::property_tree::xml_writer_settings<char> settings('\t', 1);

When compiling with boost-1.60.0 in Visual Studio 2013, you may get:

vmtknetworktest.cpp(259) : see reference to class template instantiation 'boost::property_tree::xml_parser::xml_writer_settings<char>' being compiled
install\include\boost-1_60\boost/property_tree/detail/xml_parser_writer_settings.hpp(38): error C2039: 'value_type' : is not a member of '`global namespace''
install\include\boost-1_60\boost/property_tree/detail/xml_parser_writer_settings.hpp(38): error C2146: syntax error : missing ';' before identifier 'Ch'
install\include\boost-1_60\boost/property_tree/detail/xml_parser_writer_settings.hpp(38): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
install\include\boost-1_60\boost/property_tree/detail/xml_parser_writer_settings.hpp(40): error C2061: syntax error : identifier 'Ch'
install\include\boost-1_60\boost/property_tree/detail/xml_parser_writer_settings.hpp(49): error C2146: syntax error : missing ';' before identifier 'indent_char'
install\include\boost-1_60\boost/property_tree/detail/xml_parser_writer_settings.hpp(49): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
install\include\boost-1_60\boost/property_tree/detail/xml_parser_writer_settings.hpp(50): error C2825: 'Str': must be a class or namespace when followed by '::'
install\include\boost-1_60\boost/property_tree/detail/xml_parser_writer_settings.hpp(50): error C2039: 'size_type' : is not a member of '`global namespace''
install\include\boost-1_60\boost/property_tree/detail/xml_parser_writer_settings.hpp(50): error C2146: syntax error : missing ';' before identifier 'indent_count'
install\include\boost-1_60\boost/property_tree/detail/xml_parser_writer_settings.hpp(50): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
vmtknetworktest.cpp(259): error C2661: 'boost::property_tree::xml_parser::xml_writer_settings<char>::xml_writer_settings' : no overloaded function takes 3 arguments

You may then end up here:

https://svn.boost.org/trac/boost/ticket/10272

The solution I found to work is to use std::string as the template argument (pt here is a namespace alias for boost::property_tree):

pt::write_xml(file_name, params, std::locale(), pt::xml_writer_make_settings< std::string >(' ', 4));

as described here:

https://mcmap.net/q/453082/-no-end-of-line-in-boost-property-tree-xml-writer-output

Occupy answered 6/12, 2016 at 17:1 Comment(1)
In my case (Boost 1.81.0) I had to write boost::property_tree::xml_writer_settings<std::string> settings(...); that is, use std::string as the template parameter. - Bijouterie

Setting 'trim_whitespace' is not the right answer here. Trimming whitespace on read is of no use if you can't afford to have your data items trimmed, which is what happens when you read with that flag.

I think the problem is in the writer, in: https://www.boost.org/doc/libs/1_81_0/boost/property_tree/detail/xml_parser_write.hpp

Instead of:

            // Write data text, if present
            if (!pt.data().empty())
                write_xml_text(stream,
                    pt.template get_value<Str>(),
                    indent + 1, has_elements && want_pretty, settings);

there could perhaps be something like:

            // Write data text, if not empty/white-space only
            auto d = pt.data();
            bool is_empty = d.erase(d.find_last_not_of(" \n\r\t")+1).empty();
            if (!is_empty)
                write_xml_text(stream,
                    pt.template get_value<Str>(),
                    indent + 1, has_elements && want_pretty, settings);

That seems to fix the 'weird' behaviour seen above, IMO: the writer no longer adds empty or whitespace-only lines.
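
The whitespace-only test in the proposed change copies and then erases the data string. As a side note, the same check can be sketched without mutating anything, using only the standard library. The helper name is_blank is hypothetical, and it uses the same four whitespace characters as the snippet above.

```cpp
#include <string>

// Hypothetical helper: true if s is empty or contains nothing but
// spaces, tabs, carriage returns and line feeds.
bool is_blank(const std::string& s)
{
    // find_first_not_of returns npos when every character is in the set,
    // which also covers the empty string.
    return s.find_first_not_of(" \t\r\n") == std::string::npos;
}
```

With such a helper, the condition in the writer would read if (!is_blank(pt.data())), assuming the tree's data type is std::string.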

Alternatively - the reader could skip this empty 'text' when reading nodes, e.g.:

       // Parse contents of the node - children, data etc.
        template<int Flags>
        void parse_node_contents(Ch *&text, xml_node<Ch> *node)
        {
            // For all children and text
            while (1)
            {
                // Skip whitespace between > and node contents
                Ch *contents_start = text;      // Store start of node contents before whitespace is skipped
                // ****--***-> here - unconditionally (always) skip the WS
                // if (Flags & parse_trim_whitespace)
                    skip<whitespace_pred, Flags>(text);
                Ch next_char = *text;
Millstream answered 4/1, 2023 at 17:2 Comment(0)
