Parsing xml with Boost

Everyone would agree that xml is a rather clumsy format for transferring data. Still, it has become a de facto standard for passing information, and lots of platforms and libraries provide tools for parsing it. One such tool is available in the Boost libraries. It is easy to overlook because it is not advertised under the name “xml”, so I thought it deserved a post on my blog.

The library that we will need is Boost.PropertyTree, developed by Marcin Kalicinski. Its main focus is not parsing xml but the observation that many data formats share the same general tree structure, which, to quote the library author's description, can conceptually be described as follows.

struct ptree
{
   string data;                          // data associated with the node
   list< pair<string, ptree> > children; // ordered list of named children
};

Xml just happens to fall into this category of data structures. Boost.PropertyTree can build a ptree from an xml file; you can then inspect the structure manually and collect the information you need. The library also lets you build and modify a ptree and store it as an xml file. Technically, the library is not 100% compatible with the W3C specifications, but it is sufficient to correctly process data stored in the form of an xml file. While the library itself is very good, its documentation does not fully cover its capabilities, so we will work through an example to see how such xml processing can be performed.

Reading xml

First, we will want to parse the following document.

<?xml version="1.0"?>
<sked>
  <version>2</version>
  <flight>
    <carrier>BA</carrier>
    <number>4001</number>
    <date>2011-07-21</date>
  </flight>
  <flight cancelled="true">
    <carrier>BA</carrier>
    <number>4002</number>
    <date>2011-07-21</date>
  </flight>
</sked>

The goal is to put the information from the file into the following data structure Sked.

typedef boost::gregorian::date Date;

struct Flight
{
    std::string  carrier;
    unsigned     number;
    Date         date;
    bool         cancelled;
};

typedef std::vector<Flight> Sked;

We start with the following code. It is explained in detail further below.

#include <boost/property_tree/xml_parser.hpp>
#include <boost/property_tree/ptree.hpp>

Sked read( std::istream & is )
{
    // populate tree structure pt
    using boost::property_tree::ptree;
    ptree pt;
    read_xml(is, pt);

    // traverse pt
    Sked ans;
    BOOST_FOREACH( ptree::value_type const& v, pt.get_child("sked") ) {
        if( v.first == "flight" ) {
            Flight f;
            f.carrier = v.second.get<std::string>("carrier");
            f.number = v.second.get<unsigned>("number");
            f.date = v.second.get<Date>("date");
            f.cancelled = v.second.get("<xmlattr>.cancelled", false);
            ans.push_back(f);
        }
    }

    return ans;
}

We did not include all the necessary header files, only those dealing with the property tree (BOOST_FOREACH, for instance, comes from <boost/foreach.hpp>). First, we create an empty ptree and populate it using function read_xml. It is defined in namespace boost::property_tree, but thanks to argument-dependent lookup it can be used without any namespace qualification. We pass std::istream to our function rather than std::string or std::ifstream because it is a more generic interface: it can accommodate both strings (in case we get the xml from some networking library) and files.

Once the xml file has been read, we can begin querying pt for contents. We get the top-level child sked and iterate over its children. We pick only those with key flight, thus skipping the child version. We read data using function template get, specifying the type we want to convert our data to. PropertyTree uses streaming operators << and >> to convert any type to and from string, much like boost::lexical_cast. If the data element we read is optional, we can use get_optional, which returns boost::optional rather than our target type directly. Attributes are a bit more complicated. Structure ptree does not have a concept of an attribute, so xml attributes are just sub-elements of a special element <xmlattr>. For our attribute cancelled we use a third way of getting the data: we provide a default value, false, that should be returned if the attribute is missing. This is similar to boost::optional’s get_value_or. Note that we didn’t have to provide the template parameter, because it could be deduced from the default value.

PropertyTree customizations

The above example has one significant fault. If we try to parse the xml file listed above, our function fails and throws an exception. This is because type Date, although it provides streaming operators, uses a different string format than our xml file. The default string format for Date is "2011-Jul-30". What we need is the extended ISO date format: "2011-07-30". If in a hurry, we could solve the problem by setting the global locale object to one that uses our date format. However, this may break other parts of our program that rely on a different date format. We need a more localized solution that changes the date format only for the xml parser. Boost.PropertyTree allows easy per-type customizations. We will need to define the following type.

class DateTranslator 
{
    std::locale locale_; // implementation detail

public:
    typedef std::string  internal_type;
    typedef Date         external_type;

    DateTranslator();
    boost::optional<external_type> get_value( internal_type const &v );
    boost::optional<internal_type> put_value( external_type const& v );
};

Our ‘translator’ is capable of encoding Date into std::string and vice-versa. boost::optional in return value is used for signalling conversion errors. You need to define two types internal_type and external_type, even if they are spelled longer than the original types. It is based on those two types that the framework determines (using boost::enable_if) that our class is a translator. Now, we need to define the two conversions. It is up to you how you want to implement the conversions. Here, we chose to do that by using a temporary stream for which we will customize the locale. First a function that returns the desired locale:

std::locale isoDateLocale() {
    typedef boost::date_time::date_facet<Date, char> tOFacet;
    typedef boost::date_time::date_input_facet<Date, char> tIFacet;

    std::locale loc;
    loc = std::locale( loc, new tIFacet("%Y-%m-%d") );
    loc = std::locale( loc, new tOFacet("%Y-%m-%d") );
    return loc;
}

We need to add two facets to our locale: one for reading dates, the other for writing. First we default-create a locale; it is a copy of the global locale. Then, we add the two facets that come with Boost.Date_Time library. In the default constructor we use this function to initialize our private object locale_. Next, we define the two conversion functions by creating a string-based stream, imbuing our locale, and just doing the regular streaming:

boost::optional<DateTranslator::external_type> 
DateTranslator::get_value(internal_type const& v)
{ 
    std::istringstream stream(v);
    stream.imbue(locale_);
    external_type vAns;
    if( stream >> vAns ) {
        return vAns;
    }
    else {
        return boost::none;
    }
}

boost::optional<DateTranslator::internal_type> 
DateTranslator::put_value(external_type const& v)
{ 
    std::ostringstream ans;
    ans.imbue(locale_);
    ans << v;
    return ans.str();
}

Now, in order to use our new translator in PropertyTree we have two options. Either we create this object and pass it to function get as an additional parameter:

DateTranslator tr;
f.date = v.second.get<Date>( "date", tr );

Or, we can teach the framework that it should always use our translator internally without providing the additional argument each time. We do it by specializing class template translator_between:

namespace boost{ namespace property_tree{
    template<> 
    struct translator_between<std::string, Date>
    {
        typedef DateTranslator type;
    };
}}

Now our original call will work correctly:

f.date = v.second.get<Date>("date");

Generating xml

Now, for the last step, we will write a function that given the schedule of type Sked, will write an xml file into a stream.

void write( Sked sked, std::ostream & os )
{
    using boost::property_tree::ptree;
    ptree pt;

    pt.add("sked.version", 3);

    BOOST_FOREACH( Flight f, sked ) {
        ptree & node = pt.add("sked.flight", "");

        node.put("carrier", f.carrier);
        node.put("number", f.number);
        node.put("date", f.date);
        if( f.cancelled ) node.put("<xmlattr>.cancelled", true);
    }

    write_xml( os, pt );
}

The full working example of the xml parser is here. Or if you prefer a PDF file: parse_xml.pdf.

49 Responses to Parsing xml with Boost

  1. Very nice article!

    When I read the headline, I assumed you would do what I would do: reach for something heavy like boost.spirit. I’ll have to take a deeper look into the property_tree library, maybe I can use it for my current project: parsing a database dump of network traffic into a data structure and then generating a gnuplot output to make some pretty graphs!

  2. Demetrio says:

    Hello. I am getting this error: “terminate called after throwing an instance of ‘boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector >’
    what(): : read error”
    I copied and pasted the pdf code, included the boost and I copied the xml file to debug and src Folder. It sounds to me that the code can’t find the input.xml.
    My knowledge is very limited about C++ and even more about Boost. Could you give me some idea about this error? I am using Eclipse Indigo.
    Less important, I am ignoring a warning “Multiple markers at this line…” on left of BOOST_FOREACH but I believe that this is something about Eclipse editor because I can build the application and I am able to debug.

    • Hi Demetrio!
      Your problem is indeed caused by the program not being able to open the input file for reading. Perhaps the file is in the wrong location, or not readable. Try specifying a full path to the file (in the example the path is relative). If you do not want to play with the filesystem at the moment, but only want to check the included program, you can read the xml from a string rather than from a file. In order to do that, replace function main() from the example with the following one:

      # include <sstream>
      int main()
      {
          std::string text = 
              "<?xml version=\'1.0\'?>"
              "<sked>"
              "<version>2</version>"
              "<flight>"
              "<carrier>BA</carrier>"
              "<number>4001</number>"
              "<date>2011-07-21</date>"
              "</flight>"
              "<flight cancelled=\'true\'>"
              "<carrier>BA</carrier>"
              "<number>4002</number>"
              "<date>2011-07-21</date>"
              "</flight>"
              "</sked>";
      
          std::istringstream input( text );
          Sked sked = read( input );
          std::ofstream output("output.xml");
          write( sked, output );
          return 0;
      }
      
  3. westfork says:

    Hi
    Nice article. I’m implementing some examples with Boost property tree. I’m not able to read nested items in my simple xml file:

    void test ()
    {
      using boost::property_tree::ptree;
      ptree pt;
    
      std::string filename ("test.xml");
      try {
        boost::property_tree::xml_parser::read_xml (filename, pt);
    
        BOOST_FOREACH (ptree::value_type const& vContainer, pt.get_child ("root"))
        {
          print_containerData (vContainer);
          BOOST_FOREACH (ptree::value_type const& vItem, pt.get_child ("root.container"))
          {
            print_itemData (vItem);
          }
          std::cout << "\t----------------------\n";
        }
        std::cout << "----------------------\n";
      }
      catch (std::exception& e)
      {
        std::cout << "[test] Error: " << e.what () << "\n";
      }
      return;
    }
    

    I read OK the containers, but not the items (always the same first 3).
    Thank you
    Bye westfork

    • Hi Westfork,
      It is difficult to see what your problem is without seeing the file you want to parse, and effect you want to achieve. I can only guess…

      Perhaps changing the two loops to the following would help:

      BOOST_FOREACH (ptree::value_type const& vContainer, pt.get_child ("root"))
      {
          if (vContainer.first == "container")
          {
              print_containerData (vContainer);
              // iterate over the children of this particular container
              BOOST_FOREACH (ptree::value_type const& vItem, vContainer.second)
              {
                  if (vItem.first == "item") // do you use elem <item>?
                  {
                      print_itemData (vItem);
                  }
              }
          }
      }

      Regards,
      &rzej

      • fduff says:
        BOOST_FOREACH (ptree::value_type const& vItem, vContainer.second)

        did the trick for accessing nested tags in a loop, thanks a lot!

  4. westfork says:

    Hi Andrzej
    Thank you for your answer. My xml is the following:

    Kind regards
    westfork

  5. westfork says:

    I’m sorry, but I’m not able to post the xml
    westfork

  6. westfork says:

    Hi Andrzej
    Thank you! Your answer led me to the solution. Probably last night it was too late to work!!!
    Here the code

    BOOST_FOREACH (ptree::value_type& vContainer, pt.get_child ("root"))
    {
        dump_containerData (vContainer);
        int i=0;
        BOOST_FOREACH (ptree::value_type& item, vContainer.second)
        {
            std::cout << "i: " << i << "\n";
            if (item.first == "item")
            {
                std::cout << "\tfound item! " << item.second.get (".id") << "\n";
            }
            ++i;
        }
        std::cout << "--------------------\n";
    }

    Kind regards
    westfork

  7. vkumar says:

    Any idea how to handle special chars like & in attributes? I know we can escape them with &amp; etc., but do you know if the Boost ptree xml parser handles them in attributes?

    In my test they work fine if I add them as non-attributes.

  8. @vkumar: I understand that, for some reason, you would like to use and parse attribute names like “mary&john”. To my knowledge, characters like the ampersand and angle brackets have a special meaning in XML, and using them unescaped in contexts like attribute names or values is simply illegal; i.e., text files created this way are not xml files. Browsers and parsers may parse them or not (because they are usually not aimed at checking 100% conformance), but you are getting yourself into trouble sooner or later when you do this.

  9. Steve says:

    I’ve got an xml file that looks something like this:

    <datum id='0' msb='31' lsb='29' type='enum'>
    <key>IF SRCMSDP::SMR 1+<?sub value='smr_num'?> Type</key>
    </datum>

    I'm having trouble figuring out how to use Boost to parse out the <?sub value='smr_num'?> from within the key tag.

    Any thoughts?

    -steve

    • Just without trying it out, I notice that this looks like an invalid XML format. According to W3C Recommendation (paragraph 2.3) a question mark is not a valid name for a tag. Was your goal to extract this piece of data:

      IF SRCMSDP::SMR 1+<?sub value='smr_num'?> Type
      

      Perhaps the parser got confused with non-xml syntax. I will try it out soon myself.

      • Steve says:

        Yeah, the intent is to do a text replace of the (?sub value='smr_num'?) with what is contained in 'smr_num', which is defined in an attribute elsewhere in the file. The same source xml file I’m using is being parsed by another application, I just can’t figure out how it’s doing it. When I read in the “key” node, using your algorithm above, it just ignores what’s in the ?sub part.

        So, for example, in an earlier node the attribute “smr_num” is defined to be equal to 0. The key I want to end up with would be “IF SRCMSDP::SMR 1+0 Type.”

        -steve

    • @Steve:
      I believe that what you need is not doable with Boost.PropertyTree. The problem is that the above is not a valid XML file. When the library parses text in element <key>, it ignores <?sub value='smr_num'?> because it looks like an element (due to angle brackets). Then, when the library parses sub-elements of <key>, it ignores <?sub value='smr_num'?> again, as this is not a valid element (it has neither a name nor a closing tag).

      It looks like your job is to change some ‘xml template’ into a valid xml. Perhaps what you need is some string transformation tool like Regular Expressions library?

      • Steve says:

        I was rapidly coming to that conclusion as well, Andrzej, but I thought I’d ask and see if you or anyone else had any ideas.

        I’ll take a look at Regular Expressions and see if it’ll get me where I want to go.

        Thanks for the input, I really appreciate it!

        -steve

  10. Boris Grinac says:

    Hi, everyone! I have a question about ptree:
    I have xml with duplicate keys, so I need to walk the tree to get my values. For this purpose I have written a non-recursive program to walk the tree. Everything is fine, but when debugging, I found that mytree->second.get_value("") returns a non-empty value for keys that have no value. There must be something that I missed: how do I find out that my current key has no value?
    Can you help?

    • Hi Boris,
      It would be easier to understand your question, if you showed us an example of xml file you are parsing, the effect you want to achieve (ideally as a short snippet of code), and the effect that you get (which you find surprising).

  11. Boris Grinac says:

    The XML (please note that there are many Icon tags):

    <Icon>
      <LogIcon>
        <UserDiagramIcon>
          <IonIcon resource_id="71">
            <Caption display_method="none" caption_position="8"></Caption>
            <Location left="519" top="184" bottom="214" right="549"></Location>
            <UniqueId>9907934c-a5f1-4f52-9f21-2542aba3bbd6</UniqueId>
          </IonIcon>
        </UserDiagramIcon>
        <EvtLogTable priority_threshold="128">
          <DataLogTable>
            <Location left="117" top="48" bottom="671" right="1503"></Location>
            <WindowSetting>0</WindowSetting>
            <UniqueId>b2089446-4cae-4be8-afe1-4193d049e978</UniqueId>
            <UpdateQuery>0</UpdateQuery>
            <Columns>
              <ColumnWidth>140</ColumnWidth>
              <ColumnWidth>222</ColumnWidth>
            </Columns>
            <DeleteOnClose>1</DeleteOnClose>
            <XParam>0</XParam>
            <SQLQuery>SELECT  timestamp
            , cause_ion
    
            FROM   "--(*vendor(PML),product(LogServer) Global Event Log @Global *)--"
    
            ORDER BY timestamp DESC
            ,cause_ion ASC
            --  SQL generated with ION Enterprise version 7.0 build 12059</SQLQuery>
            <timestamp>-1</timestamp>
            <RecordId>-1</RecordId>
            <RecordsUploadedPerScroll>9999</RecordsUploadedPerScroll>
            <TrendedRegion left="0" top="0" bottom="0" right="0"></TrendedRegion>
          </DataLogTable>
          <Alarm_Stages>
            <Stage priority="128" min_security="" act_beep="" act_flash="">
            <Annunciation>
              <CommandLine></CommandLine>
              <MessageBox></MessageBox>
            </Annunciation>
            </Stage>
          </Alarm_Stages>
        </EvtLogTable>
      </LogIcon>
    </Icon>
    

    My code looks for a UUID and needs to get the sql query out, which it does perfectly. The only thing I noticed is that for RTGGroupLcon, which is a tag with an empty value, I get some characters when I try to print debug messages. I use this code to get the value: current.it->second.get_value("")
    My whole function to walk the tree is here:

    namespace pml_ud {
    
    	using namespace boost;
    	using namespace boost::property_tree;
    	
      std::string ud_tree::get_query(std::string &id_string) {
           // convert string to uuid 
    	   uuid id,current_id;
    	   std::stringstream s(id_string);
    	   s >> id;
    	   // s << id;
    
    	   std::string sql;
    
           // walk the tree
    	   typedef ptree::const_iterator tree_iterator;
    	   typedef struct {
    		   tree_iterator it;
    		   tree_iterator end;
    	   } stack_item;
           std::vector<stack_item> stack; // my stack to eliminate recursion
    	   stack_item current;
    	   
           current.it=  pt.begin();
    	   current.end= pt.end();
    	   bool found=false;
    	   bool sql_level=false; // I am on correct level for sql
    
    	   // Diag message
    	   std::cerr << std::endl <<"**** Loop start *****"<< std::endl;
    
    	   while ( (!found) && (current.it!=current.end)) {
    
    		   /****** Debug print **********************
    		   // indent the level
    		   for (unsigned int i=0; i<stack.size(); i++) std::cerr << "-";
    		   std::cerr << "key: " << current.it->first << " value: " << current.it->second.get_value("") << std::endl;
    		   ******************************************/
    		   if (current.it->first=="UniqueId"){
    			   std::stringstream( current.it->second.get_value("")) >> current_id;
    			   if (id==current_id) sql_level=true;
    		   };
    
    		   // Now when I have found my UUID, I am looking for my sql, which is located at the same level
    	       if (sql_level && (current.it->first=="SQLQuery")) {
    			   sql=current.it->second.get_value("");
    			   found=true;
    			   break;
    		   };
    
    		   // any children?
    		   if ( ! current.it->second.empty() )
    		   {
    			   stack.push_back(current);
    			   current.end=current.it->second.end();
    			   current.it= current.it->second.begin(); // we are one level down
    			   //std::cerr << "Changing to child: " << current.it->first << std::endl;
    			   
    		   } else 
    			   current.it++;  // go to next item
    
    			// we are at the end of current level, so go up. We may need to jump over many levels here
    			while ((current.it==current.end) && (!stack.empty())) {
    				current=stack.back();
    				stack.pop_back();
                    current.it++;  // go to next item one level higher 
    			}; 
    	   }; // while
    	   return sql;
    	}
    } // namespace
    
    • “RTGGroupLcon”? I cannot find it anywhere in the XML. Is it a non-existent tag whose value you still try to obtain?

      Perhaps it would be easier to reduce the XML and the code to the minimal one that still displays the problem? This way it would be easier to analyze the problem.

      • Boris Grinac says:

        Sorry, I have already cut a piece of my whole XML. To reproduce the problem, please use any empty tag in my XML, like LogIcon tag:

        <Icon>
        							<LogIcon>
        								<UserDiagramIcon>
        									<IonIcon resource_id="71">
        										<Caption display_method="none" caption_position="8"></Caption>
        										<Location left="519" top="184" bottom="214" right="549"></Location>
        										<UniqueId>9907934c-a5f1-4f52-9f21-2542aba3bbd6</UniqueId>
        									</IonIcon>
        								</UserDiagramIcon>
        								<EvtLogTable priority_threshold="128">
        									<DataLogTable>
        										<Location left="117" top="48" bottom="671" right="1503"></Location>
        										<WindowSetting>0</WindowSetting>
        										<UniqueId>b2089446-4cae-4be8-afe1-4193d049e978</UniqueId>
        										<UpdateQuery>0</UpdateQuery>
        										<Columns>
        											<ColumnWidth>140</ColumnWidth>
        											<ColumnWidth>222</ColumnWidth>
        										</Columns>
        										<DeleteOnClose>1</DeleteOnClose>
        										<XParam>0</XParam>
        										<SQLQuery>SELECT	 timestamp
        	, cause_ion
        
          FROM   "--(*vendor(PML),product(LogServer) Global Event Log @Global *)--"
        
          ORDER BY timestamp DESC
        	,cause_ion ASC
         --  SQL generated with ION Enterprise version 7.0 build 12059</SQLQuery>
        										<timestamp>-1</timestamp>
        										<RecordId>-1</RecordId>
        										<RecordsUploadedPerScroll>9999</RecordsUploadedPerScroll>
        										<TrendedRegion left="0" top="0" bottom="0" right="0"></TrendedRegion>
        									</DataLogTable>
        									<Alarm_Stages>
        										<Stage priority="128" min_security="" act_beep="" act_flash="">
        											<Annunciation>
        												<CommandLine></CommandLine>
        												<MessageBox></MessageBox>
        											</Annunciation>
        										</Stage>
        									</Alarm_Stages>
        								</EvtLogTable>
        							</LogIcon>
        						</Icon>
        
        • Boris, here is how I understand your problem. Given a simple XML like this:

          <a>
            Y
            <b>X</b>
          </a>
          

          You want to check whether the data “Y” exists, or whether there is no string inside tag <a>.

          If so, you can use the following code to check it:

          #include <string>
          #include <sstream>   // std::istringstream
          #include <iostream>  // std::cout
          #include <boost/property_tree/xml_parser.hpp>
          #include <boost/property_tree/ptree.hpp>
          
          int main()
          {
              std::string text = "<a>Y<b>X</b></a>";
              std::istringstream input( text );
          
              using boost::property_tree::ptree;
              ptree pt;
              read_xml(input, pt);
          
              std::string value_of_a = pt.get<std::string>("a");
              std::cout << value_of_a << std::endl;
          }
          
  12. Boris Grinac says:

    Andrzej, thank you. It looks like this simple solution does not work for my XML, because I have duplicate tags on the same level. There are many Icon tags. So I will not find my path, get() does not work. This is the reason I did this crazy program to walk the tree. Your xml may be like this:

    <a>
      bla bla
      <b>X</b>
    </a>
    <a>
      la la la 
      <b>X</b>
    </a>
    <a>
      <uuid>76536854-dd66-4297-8606-066238b91c44</uuid>
      <b>my data</b>
    </a>
    

    Here the last tag a is empty, with no value. And the questions are:
    How do I know if the last tag a is empty?
    Is there a better way to walk this kind of tree and find my data?
    I am looking for specific UUID to get my data, as I tried to demonstrate in XML above.

  13. Ok, I see what you mean. Perhaps the following example helps. It looks for tags <a> with empty value and prints the value of the corresponding tag <b>.

    #include <string>
    #include <sstream>
    #include <iostream>
    #include <boost/foreach.hpp>
    #include <boost/property_tree/xml_parser.hpp>
    #include <boost/property_tree/ptree.hpp>
    
    int main()
    {
        std::string text = "<root>"
                           "  <a>Y1<b>X1</b></a>"
                           "  <a>Y2<b>X2</b></a>"
                           "  <a><b>X3</b></a>"
                           "</root>";
    
        std::istringstream input( text );
    
        using boost::property_tree::ptree;
        ptree pt;
        read_xml(input, pt);
    
        BOOST_FOREACH (ptree::value_type const&v, pt.get_child("root")) {
            if (v.first == "a") {
                std::string a = v.second.get<std::string>("");
                if (a.empty()) {
                    std::string b = v.second.get<std::string>("b");
                    std::cout << b << std::endl;
                }
            }
        }
    }
    
  14. MJT says:

    Andrzej. how do you handle the following case.

  15. Adriano says:

    Useful post. Thanks for the code!

  16. san says:

    Hi,
    How do i throw errors if a particular element is not found?

    for ex: in the above examples
    if (vContainer.first == "container")

    if not even one “container” found, i want to throw a error.

    also how to read a exact element in a tree. ex:

    value

    i want to read x directly. something like.
    BOOST_FOREACH( ptree::value_type & currPTreeValue, pt.get_child("root.root1.root2") )
    {
        if( currPTreeValue.first == "x" )
            currPTreeValue.second.get("logicalName")
    }

    BUT with out that “if” statement

    • It is difficult to figure out what you mean. If you want your program to have some custom behavior, like checking if at least one element of a given name exists under root, I guess you have to write some custom code, and throw in your code.

      If you want to iterate over a range and filter out some elements, I guess you could use Boost’s filtered range adapter.

  17. san says:

    correction:
    if( currPTreeValue.first == "x" )
        currPTreeValue.second.get("x")

    my xml is value

  18. Rajat Girotra says:

    Hi

    I am having an issue using boost property tree. I want to generate XML, something like this:

    John
    Smith

    Mark
    Twain

    However when I do:

    boost::property_tree::ptree pt;
    pt.add("organisation.employee.firstname", "John");
    pt.add("organisation.employee.lastname", "Smith");
    pt.add("organisation.employee.firstname", "Mark");
    pt.add("organisation.employee.lastname", "Twain");

    I get:

    John
    Smith
    Mark
    Twain

    Will these two XML’s be counted as similar? I tried using the put() function too, but then I get:

    Mark
    Twain

    Could someone please advise?

    Regards
    Rajat

  19. Phu Nguyen says:

    Hi Andrzej,

    I have this quasi-XML file

    5, 29, 31
    29, 30, 32
    30, 6, 33

    7, 35, 37
    35, 36, 38
    36, 8, 39

    which I would like to read. This file was created by my code and I chose this format so that I can reuse any XML parser. After reading your post, I still do not know how to retrieve the data of a NodeGroup because it is a 2D array. It would be great if you could show me some code.

    Thanks a lot.

    Phu

  20. Phu Nguyen says:

    Hi Andrzej,

    Thanks for pointing out the Boost.Spirit. I will try it.

    Bests,
    Phu

  21. Hoang Vuong says:

    very useful post, thank you very much!

  22. Matthew says:

    I spent an hour or two trying to figure out how to get libxml++ to work with Visual Studio, but the dependencies on Gnome kept growing. Imagine my surprise when I learned Boost had an xml parser! Most boost libraries are fairly well documented but as Andrzej notes above this one needed some more examples. After seeing this code in action I was up and running in ten minutes. Thanks so much!

  23. memic says:

    nice example thx

  24. alex says:

    Hi Andrzej! Thank you for this article!
    I was wondering, why would someone prefer Boost.PropertyTree over any “real” xml library?

    • For me, being a Boost library guarantees a certain level of quality. There are many libraries out there on the web, but are they good? For instance, is the author aware that if something in the library throws an exception, there is a high risk of leaking resources? If he does not know how to handle that, his library simply has resource leaks.

      Having passed the Boost review process is a quick test about the competence of the author and the quality of the library. And also, often, when you start thinking about processing XML, Boost is already there: installed on your platform.

      That said, there are other good libraries, that do the full XML processing, like http://rapidxml.sourceforge.net/.

  25. I have been using Boost ptree largely because of the convenience it provides to parse config files. The client code becomes simple and readable.

    But because ptree is just a bunch of KV pairs, it doesn’t preserve comments when writing the contents back. This is severely limiting, as config files are meant to be human readable, mostly used by the people that don’t know anything about the system and would want to know how to tune the params and stuff.

    I’m not sure what other better alternative we have.

  26. Pingback: boost 学习笔记 7:property_tree - C++|后端开发 - IT世界123

  27. Domarius says:

    I keep coming back to this page to remind myself how it works. For me this is the place to go to learn/remember how to do XML in C++. Thank you for making this.

    • Thank you. I am glad you find it useful.

      • Domarius says:

        Only problem is, now that I’m writing the XML back, I find it completely butchers the layout. It’s “valid” semantically, but oh my god is it a visual mess. Tonnes of newlines added inexplicably in some places, and tonnes of newlines removed in other places, putting random sections of it onto a single line. It’s completely unreadable. And curiously, it butchers it up in the same way every time. I think I’m going to have to find another library…
        Here’s a screenshot of what I’m experiencing;
        https://gyazo.com/873d8947761c2072778e6f04309aedfa

        • Domarius says:

          I found a solution: a combination of a “trim whitespace” flag and an “xml writer settings” thingy, both of which can be found on this page:
          https://stackoverflow.com/questions/6572550/boostproperty-tree-xml-pretty-printing

          The “trim whitespace” solves the weird gaps, and the “xml writer settings” formats it correctly, so things don’t randomly end up on one line. Apparently a downside of this is “trim whitespace” flag will remove consecutive spaces even within values (which is bad, BAD behaviour, it should not be touching content) but for what I’m doing, this will work.

          Honestly I’m surprised that the default output is so wonky. I get that the author sees ptree as a generic tree structure of data and not specifically XML, but why make write_xml produce such wacky output by default? And why is “trim whitespace” required to stop “read_xml” introducing random blank lines? If it was all on one line with no linebreaks, no tabs, no spaces, I could understand, but this is very mysterious behaviour.

        • My hypothesis (but I did not look at the implementation) is that by default read_xml will also read the white spaces, and write_xml will put them back, but it does not know how the white spaces and the nested elements interleave, so it just puts all of them in one bunch.

  28. Domarius says:

    Ok I’m tearing my hair out – nowhere can I find out how to just change the existing value of a specific node. I need to iterate through several nodes and find the one with a specific value, and change that value.

    For my loop I have;
    for (auto const& v : pt.get_child("ArcadeAttract")) {

    Within the loop, the condition to find the node with “value” in it, is;
    if (v.first == "Category" && v.second.get<std::string>("Name") == value)

    So this should successfully find the node with the matching “value”, accessible in “v.second”.

    But how on earth do I change this value?? All the examples say to use “node.put(child_name, value)”, but the line
    v.second.put("Name", value);
    gives an error that says no overload of “put” matches those parameters.

    • Domarius says:

      Ok I found out the reason. I didn’t need “const” up in the for loop declaration, that was preventing me from modifying the tree through the v variable.

      In my defence, the error message was fairly obfuscated in my opinion. The main part of the error message stated that “no overload of put matches those parameters” as I said in the last post. HOWEVER… this very long error message had an additional message tacked onto the end of it, in brackets (seemingly as an afterthought) “the object has type qualifiers that prevent a match.” THAT’S THE ACTUAL PROBLEM! The parameters were fine, it was the const modification that was preventing what I was doing. It should have led with that! Ugh. I get the logic behind the structure of the message, but geez.

  29. Pingback: Good XML parsers – HPC
