SAX parsing: how to fetch child nodes
G

4

7

I'm using SAX parsing on Android. Given the XML below:

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
    <channel>
        <title>Game Analysis</title>
        <item>
            <title>GTA</title>
            <description>ABC</description>
            <pubDate>Sat, 21 Feb 2012 05:18:23 GMT</pubDate>
            <enclosure type="audio/mpeg" url="http://URL.mp3" length="6670315"/>
        </item>
        <item>
            <title>CoD</title>
            <description>XYZ</description>
            <pubDate>Sat, 21 Feb 2011 05:18:23 GMT</pubDate>
            <enclosure type="audio/mpeg" url="http://URL.mp3" length="6670315"/>
        </item>
    </channel>
</rss>

I need to fetch the first occurrence of <title> (just below <channel>).

Then, from every <item> block, I again need to extract <title> and <enclosure>.

I can fetch the first <title> using:

public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
    if (qName.equals("title")) ...
}

But, how should I fetch the tags inside <item> block?

Germanism answered 9/1, 2013 at 13:52 Comment(3)
Take a look at this: https://mcmap.net/q/1477974/-android-saxparser-parse-into-array-and-get-child-nodes - Kitts
Thanks, but how should I differentiate between the first title tag and the one inside item? - Germanism
Did you find your answer? If yes, then please accept the answer so that it is useful for others. - Cleopatra
P
14

Here is how I've done that with SAX.

I have modified your XML file a bit.

XML file

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
    <channel>
        <title>Game Analysis</title>
        <item>
            <title>GTA</title>
            <description>ABC</description>
            <pubDate>Sat, 21 Feb 2012 05:18:23 GMT</pubDate>
            <enclosure type="audio/mpeg" url="http://URL.mp3/1" length="6670315"/>
        </item>
        <item>
            <title>CoD</title>
            <description>XYZ</description>
            <pubDate>Sat, 21 Feb 2011 05:45:10 GMT</pubDate>
            <enclosure type="audio/mpeg" url="http://URL.mp3/2" length="6670345"/>
        </item>
        <item>
            <title>AtV</title>
            <description>fgh</description>
            <pubDate>Sat, 21 Feb 2011 06:20:10 GMT</pubDate>
            <enclosure type="audio/mpeg" url="http://URL.mp3/3" length="6670364"/>
        </item>
    </channel>
    <channel>
        <title>Game Analysis 2</title>
        <item>
            <title>GTA 2</title>
            <description>ABC 2</description>
            <pubDate>Sat, 21 Feb 2012 04:18:23 GMT</pubDate>
            <enclosure type="audio/mpeg" url="http://URL.mp3/2/1" length="6670315"/>
        </item>
        <item>
            <title>CoD 2</title>
            <description>XYZ 2</description>
            <pubDate>Sat, 21 Feb 2011 04:45:10 GMT</pubDate>
            <enclosure type="audio/mpeg" url="http://URL.mp3/2/2" length="6670345"/>
        </item>
        <item>
            <title>AtV 2</title>
            <description>fgh</description>
            <pubDate>Sat, 21 Feb 2011 05:20:10 GMT</pubDate>
            <enclosure type="audio/mpeg" url="http://URL.mp3/2/3" length="6670364"/>
        </item>
    </channel>
</rss>

Entities

Channel

public class Channel {

    private String title;
    private ArrayList<Item> alItems;

    public Channel(){}

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public ArrayList<Item> getAlItems() {
        return alItems;
    }

    public void setAlItems(ArrayList<Item> alItems) {
        this.alItems = alItems;
    }


}

Enclosure

public class Enclosure {

    private String type;
    private URL url;
    private Integer length;


    public Enclosure(){}

    public String getType() {
        return type;
    }


    public void setType(String type) {
        this.type = type;
    }


    public URL getUrl() {
        return url;
    }


    public void setUrl(URL url) {
        this.url = url;
    }


    public Integer getLength() {
        return length;
    }


    public void setLength(Integer length) {
        this.length = length;
    }




}

Item

public class Item {

    private String title;
    private String description;
    private String pubDate;
    private Enclosure enclosure;

    public Item(){}

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    public String getPubDate() {
        return pubDate;
    }

    public void setPubDate(String pubDate) {
        this.pubDate = pubDate;
    }

    public Enclosure getEnclosure() {
        return enclosure;
    }

    public void setEnclosure(Enclosure enclosure) {
        this.enclosure = enclosure;
    }



}

Handler

ChannelHandler

import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ChannelHandler extends DefaultHandler{

    private ArrayList<Channel> alChannels;
    private Channel channel;
    private ArrayList<Item> alItems;
    private Item item;
    private Enclosure enclosure;
    // Accumulates text, because characters() may be called several times per element.
    private final StringBuilder reading = new StringBuilder();

    public ChannelHandler(){
        super();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {

        // Discard text collected before this tag (e.g. indentation whitespace).
        reading.setLength(0);

        if(qName.equals("rss")){
            alChannels = new ArrayList<>();
        }
        else if(qName.equals("channel")){
            channel = new Channel();
            alItems = new ArrayList<>();
        }
        else if(qName.equals("item")){
            item = new Item();
        }
        else if(qName.equals("enclosure")){
            // <enclosure> carries its data in attributes, so read it here.
            enclosure = new Enclosure();
            enclosure.setType(attributes.getValue("type"));
            try {
                enclosure.setUrl(new URL(attributes.getValue("url")));
            } catch (MalformedURLException e) {
                // Leave the URL unset if it cannot be parsed.
                e.printStackTrace();
            }
            enclosure.setLength(Integer.parseInt(attributes.getValue("length")));
        }

    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {

        String text = reading.toString();

        if(qName.equals("channel")){
            channel.setAlItems(alItems);
            alChannels.add(channel);
            channel = null;
            alItems = null;
        }
        else if(qName.equals("title")){
            // item is non-null only between <item> and </item>, which is
            // what separates an item title from the channel title.
            if(item != null){
                item.setTitle(text);
            }
            else if(channel != null){
                channel.setTitle(text);
            }
        }
        else if(qName.equals("item")){
            if(alItems != null){
                alItems.add(item);
            }
            item = null;
        }
        else if(qName.equals("description") && item != null){
            item.setDescription(text);
        }
        else if(qName.equals("pubDate") && item != null){
            item.setPubDate(text);
        }
        else if(qName.equals("enclosure") && item != null){
            item.setEnclosure(enclosure);
        }

    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // Append instead of overwrite: the parser may deliver text in chunks.
        reading.append(ch, start, length);
    }

    public ArrayList<Channel> getAlChannels() {
        return alChannels;
    }

}

Manager

XMLManager

public final class XMLManager {


    public static ArrayList<Channel> getAlChannels(){
        ArrayList<Channel> alChannels = null;
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            SAXParser parser = factory.newSAXParser();
            File file = new File("D:\\Loic_Workspace\\TestSAX2\\res\\test.xml");
            ChannelHandler channelHandler = new ChannelHandler();
            parser.parse(file, channelHandler);
            alChannels = channelHandler.getAlChannels();
        } catch (ParserConfigurationException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (SAXException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        return alChannels;
    }

}

The main

MyMain

public class MyMain {

    /**
     * @param args
     */
    public static void main(String[] args) {

        Enclosure enclosure = null;
        for(Channel channel : XMLManager.getAlChannels()){
            System.out.println("Channel title : "+channel.getTitle());
            System.out.println("------------------------");
            for(Item i:channel.getAlItems()){
                System.out.println(i.getTitle());
                System.out.println(i.getPubDate());
                System.out.println("Enclosure : ");
                enclosure = i.getEnclosure();
                System.out.println(enclosure.getType());
                System.out.println(enclosure.getUrl());
                System.out.println(enclosure.getLength());
                System.out.println("------------------------");
            }
        }




    }

}

Output in the console

Channel title : Game Analysis
------------------------
GTA
Sat, 21 Feb 2012 05:18:23 GMT
Enclosure : 
audio/mpeg
http://URL.mp3/1
6670315
------------------------
CoD
Sat, 21 Feb 2011 05:45:10 GMT
Enclosure : 
audio/mpeg
http://URL.mp3/2
6670345
------------------------
AtV
Sat, 21 Feb 2011 06:20:10 GMT
Enclosure : 
audio/mpeg
http://URL.mp3/3
6670364
------------------------
Channel title : Game Analysis 2
------------------------
GTA 2
Sat, 21 Feb 2012 04:18:23 GMT
Enclosure : 
audio/mpeg
http://URL.mp3/2/1
6670315
------------------------
CoD 2
Sat, 21 Feb 2011 04:45:10 GMT
Enclosure : 
audio/mpeg
http://URL.mp3/2/2
6670345
------------------------
AtV 2
Sat, 21 Feb 2011 05:20:10 GMT
Enclosure : 
audio/mpeg
http://URL.mp3/2/3
6670364
------------------------

So it works ;)

Psychology answered 6/2, 2013 at 22:48 Comment(0)
K
1

SAX is the wrong tool for this job. Your requirements would be easily solved using DOM and XPath.
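
For illustration, here is a minimal sketch of what the DOM and XPath route could look like, using only the standard `javax.xml.parsers` and `javax.xml.xpath` APIs. The class name `XPathFeedSketch`, its helper methods, and the shortened sample feed are assumptions made up for this example; they are not from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathFeedSketch {

    // Hypothetical sample input, shortened from the feed in the question.
    static final String XML =
        "<rss version=\"2.0\"><channel><title>Game Analysis</title>"
        + "<item><title>GTA</title>"
        + "<enclosure type=\"audio/mpeg\" url=\"http://URL.mp3/1\" length=\"6670315\"/></item>"
        + "<item><title>CoD</title>"
        + "<enclosure type=\"audio/mpeg\" url=\"http://URL.mp3/2\" length=\"6670345\"/></item>"
        + "</channel></rss>";

    // Builds the whole DOM tree in memory.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // The absolute path selects only the title directly below <channel>.
    static String channelTitle(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("/rss/channel/title", doc);
    }

    // For each <item>, relative paths pick out its own title and enclosure URL.
    static List<String> itemSummaries(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList items = (NodeList) xpath.evaluate(
                "/rss/channel/item", doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            String title = xpath.evaluate("title", item);
            String url = xpath.evaluate("enclosure/@url", item);
            result.add(title + " -> " + url);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println("Channel: " + channelTitle(doc));
        for (String summary : itemSummaries(doc)) {
            System.out.println(summary);
        }
    }
}
```

The path expression itself distinguishes the two kinds of title, so no parser state is needed. The trade-off is that the whole document is held in memory.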

Kellsie answered 9/1, 2013 at 14:8 Comment(8)
DOM is actually a bit heavy for my requirement. And SAX comes with Android. - Germanism
@raul8 - so does DOM. And I have no idea what "a bit heavy" means. If you can reduce the amount of code you write, that seems lighter to me. But it's your project, so do whatever makes you happy. - Kellsie
DOM generates a ton of objects at once and is usually a bad choice on any mobile platform. Depending on the XML size, it can even kill your heap space: you just can't parse a 100 MB XML file into a DOM model on Android. - Fruitage
@Fruitage - I appreciate that you explained your downvote. However, your reasoning is filled with FUD. First off, I would question any design that attempts to parse any 100 MB file on a mobile device. Have you actually had to do that? Second, while I agree that DOM creates a lot of objects, have you ever considered just how many objects SAX creates? The underlying data is the same in either form; a DOM, in fact, is built using the SAX parser. It simply holds onto all the objects, rather than immediately giving them to the garbage collector. - Kellsie
Yes, I have done that a million times, and you need to do it a lot more than you might think. Maybe not 100 MB, but you're really quickly at 20-30 MB and that's already too much. SAX does not create unnecessary objects at all. It only creates the objects that you will need later on, because you yourself create them. If you blow up the heap space using SAX, you need to redo your design or use a database in between. If you just use a SAX parser and feed it the XML, you won't be allocating a lot of objects per se. Compare that to DOM. - Fruitage
@Fruitage - I don't think you understand how a SAX parser works. When your startElement() method gets called, it is passed 4 objects that are created by the SAX parser. And that's assuming that you don't actually have any attributes on the element; if you do, that's an additional 3 objects per attribute (not counting "hidden" objects such as the array behind every String). If you don't use those objects, they're immediately eligible for garbage collection. The difference with DOM is that DOM holds onto the objects, and puts them in a linked list. - Kellsie
Obviously it does create those objects. But that's it: once they fall out of scope, those objects are gone. I did not say SAX does not create any objects, I said it does not create unnecessary ones. Most of the time when you parse XML, you will not want all the content but only parts of it, and you cannot skip that with DOM. Check what the OP wanted: only the title. And it's an RSS feed; nobody stops the RSS author from serving 20 MB of RSS. - Fruitage
Parsifal is correct. It can be done a lot more efficiently with XPath. It's like buying a train when you want to take a trip. - Theorbo
F
0

You use a stack (or similar) and remember whatever you need. SAX is event based and therefore you have to manage information about where you are on your own. Consider something like this:

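import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class Parser extends DefaultHandler {
    private Item item;
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        buffer.setLength(0); // start collecting text for the new element
        if (qName.equals("item")) {
            item = new Item(); // from here on we are inside an <item>
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("item")) {
            handleOldItem();
            item = null; // we have left the <item>
        } else if (qName.equals("title") && item != null) {
            item.setTitle(buffer.toString());
        }
    }

    private void handleOldItem() {
        // store or process the finished item here
    }
}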
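
The stack idea mentioned above can be sketched as follows. This is a minimal, self-contained example rather than a definitive implementation: the class name `StackHandlerSketch`, its fields, and the `parse` helper are all hypothetical, and it assumes the parser reports element names through `qName` (the default for a non-namespace-aware `SAXParserFactory`).

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class StackHandlerSketch extends DefaultHandler {

    // Names of the currently open elements, innermost on top.
    private final Deque<String> path = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();

    String channelTitle;
    final List<String> itemTitles = new ArrayList<>();
    final List<String> enclosureUrls = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        String parent = path.peek(); // null at the document root
        path.push(qName);
        text.setLength(0);
        if (qName.equals("enclosure") && "item".equals(parent)) {
            enclosureUrls.add(attributes.getValue("url"));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        path.pop();                  // remove the element that just closed
        String parent = path.peek(); // what remains on top is its parent
        if (qName.equals("title")) {
            if ("channel".equals(parent)) {
                channelTitle = text.toString().trim();
            } else if ("item".equals(parent)) {
                itemTitles.add(text.toString().trim());
            }
        }
    }

    // Hypothetical convenience method: parse an XML string with this handler.
    static StackHandlerSketch parse(String xml) throws Exception {
        StackHandlerSketch handler = new StackHandlerSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<rss><channel><title>Game Analysis</title>"
                + "<item><title>GTA</title><enclosure url=\"http://URL.mp3\"/></item>"
                + "</channel></rss>";
        StackHandlerSketch h = parse(xml);
        System.out.println(h.channelTitle + " " + h.itemTitles + " " + h.enclosureUrls);
    }
}
```

Because the parent element is always on top of the stack, the same `title` tag can be handled differently depending on whether it sits below `channel` or below `item`.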
Fruitage answered 9/1, 2013 at 14:5 Comment(0)
C
0

Instead of a SAX parser, you can use a DOM parser. Here is a complete answer, assuming is is the InputStream of the feed and ItemModel is your own model class:

        DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory
                .newInstance();
        documentBuilderFactory.setCoalescing(true);
        DocumentBuilder documentBuilder = documentBuilderFactory
                .newDocumentBuilder();
        Document document = documentBuilder.parse(new InputSource(is));
        // The first <title> in document order is the channel title.
        String title = document.getElementsByTagName("title").item(0).getFirstChild().getNodeValue().trim();
        NodeList itemList = document.getElementsByTagName("item");
        ArrayList<ItemModel> arrListItemModel = new ArrayList<ItemModel>();
        for (int i = 0; i < itemList.getLength(); i++) {
            ItemModel itemModel = new ItemModel();
            Element itemItem = (Element) itemList.item(i);
            itemModel.setTitle(itemItem.getElementsByTagName("title").item(0).getFirstChild().getNodeValue().trim());
            itemModel.setDescription(itemItem.getElementsByTagName("description").item(0).getFirstChild().getNodeValue().trim());
            itemModel.setPubDate(itemItem.getElementsByTagName("pubDate").item(0).getFirstChild().getNodeValue().trim());
            // <enclosure> is an empty element, so read its url attribute
            // instead of a (non-existent) child text node.
            Element enclosure = (Element) itemItem.getElementsByTagName("enclosure").item(0);
            itemModel.setEnclosure(enclosure.getAttribute("url"));
            arrListItemModel.add(itemModel);
        }
Cana answered 9/1, 2013 at 14:29 Comment(0)
