How to parse an RSS feed with XmlPullParser?

I would like to parse an RSS feed. My question is how I can parse all tags between the <item> and </item> tags.

Given this very simple XML:

<?xml version="1.0" ?>
<rss version="2.0">
<channel>
  <title>MyRSSPage</title>
  <link>http://www.example.com</link>
  <item>
  <link>www.example.com/example1</link>
  <title>Example title 1</title>
  </item>
  <item>
  <link>www.example.com/example2</link>
  <title>Example title 2</title>
  </item>
</channel>
</rss>

I would like to parse just the stuff between the <item>...</item> tags.

List<RssMessage> messages = new ArrayList<RssMessage>();

// parser is an XmlPullParser instance
while (parser.next() != XmlPullParser.END_DOCUMENT) {
    if (parser.getEventType() != XmlPullParser.START_TAG) {
        continue;
    }
    String name = parser.getName();
    // START OF HEADER
    if (name.equals("title")) {
        title = parser.nextText();
    }
    else if (name.equals("link")) {
        link = parser.nextText();
    }
    else if (name.equals("description")) {
        description = parser.nextText();
    }
    else if (name.equals("language")) {
        language = parser.nextText();
    }
    else if (name.equals("copyright")) {
        copyright = parser.nextText();
    }
    else if (name.equals("pubDate")) {
        pubdate = parser.nextText();
    }
    // END OF HEADER

    else if (name.equals("item")) {
        RssMessage rssMessage = processItem(parser);
        messages.add(rssMessage);
    }
}

In the method below I would like to parse only the tags within the <item>...</item> tags. How do I construct a loop that goes through only the elements between <item> and </item>?

EDIT
This is almost working, but sometimes not all fields are set even though the corresponding elements do exist in the RSS XML. Is something wrong with the code below?

private RssMessage processItem(XmlPullParser parser) throws IOException, XmlPullParserException {
    RssMessage rssMessage = new RssMessage();
    parser.require(XmlPullParser.START_TAG, ns, "item");
    while (parser.next() != XmlPullParser.END_TAG) {
        if (parser.getEventType() != XmlPullParser.START_TAG) {
            continue;
        }
        String name = parser.getName();
        if (name.equals("link")) {
            rssMessage.setLink(parser.nextText());
        }
        else if (name.equals("guid")) {
            rssMessage.setGuid(parser.nextText());
        }
        else if (name.equals("category")) {
            rssMessage.setCategory(parser.nextText());
        }
        else if (name.equals("title")) {
            rssMessage.setTitle(parser.nextText());
        }
        else if (name.equals("pubDate")) {
            rssMessage.setPubDate(parser.nextText());
        }
    }
    return rssMessage;
}
Donau answered 2/7, 2013 at 19:25 Comment(2)
What's wrong with the code? Is there a problem? Coalfish
Nothing is wrong except that I don't know how to parse only the tags between <item> and </item>. Donau

Try the below.

try {
    XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    factory.setNamespaceAware(false);
    XmlPullParser xpp = factory.newPullParser();
    xpp.setInput(url.openConnection().getInputStream(), "UTF-8");
    //xpp.setInput(getInputStream(url), "UTF-8");

    boolean insideItem = false;

    // Returns the type of current event: START_TAG, END_TAG, etc..
    int eventType = xpp.getEventType();
    while (eventType != XmlPullParser.END_DOCUMENT) {
        if (eventType == XmlPullParser.START_TAG) {

            if (xpp.getName().equalsIgnoreCase("item")) {
                insideItem = true;
            } 
            else if(xpp.getName().equalsIgnoreCase("title")) 
            {

            }
        }
        eventType = xpp.next(); //move to next element
    }

} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (XmlPullParserException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}

Edit:

XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
factory.setNamespaceAware(false);
XmlPullParser xpp = factory.newPullParser();
xpp.setInput(open, null); // open is an InputStream; a null encoding lets the parser detect it
// xpp.setInput(getInputStream(url), "UTF-8");

boolean insideItem = false;

// Returns the type of current event: START_TAG, END_TAG, etc..
int eventType = xpp.getEventType();
while (eventType != XmlPullParser.END_DOCUMENT) {
    if (eventType == XmlPullParser.START_TAG) {

        if (xpp.getName().equalsIgnoreCase("item")) {
            insideItem = true;
        } else if (xpp.getName().equalsIgnoreCase("title")) {
            if (insideItem)
                Log.i("....",xpp.nextText()); // extract the headline
        } else if (xpp.getName().equalsIgnoreCase("link")) {
            if (insideItem)
                Log.i("....",xpp.nextText());  // extract the link of article
        }
    } else if (eventType == XmlPullParser.END_TAG && xpp.getName().equalsIgnoreCase("item")) {
        insideItem = false;
    }

    eventType = xpp.next(); // move to next element
}

Output

www.example.com/example1
Example title 1
www.example.com/example2
Example title 2
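
On the EDIT in the question: one likely cause of the missing fields is that processItem does not consume tags it does not handle. When the parser hits such a tag (for example <description>), the loop reaches that tag's END_TAG and exits as if it were </item>, so everything after it in the item is never read. A minimal sketch of a fix, assuming unknown tags should simply be skipped, is below. The ItemParser class, the parseSingleItem helper, and the use of a Map in place of RssMessage are illustrative assumptions, and the code needs an XmlPullParser implementation (Android, or kxml2 on a desktop JVM) on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

public class ItemParser {

    // Reads one <item>. The parser must be positioned on the <item> START_TAG.
    // A Map stands in for the question's RssMessage class (an assumption).
    static Map<String, String> processItem(XmlPullParser parser)
            throws IOException, XmlPullParserException {
        Map<String, String> fields = new LinkedHashMap<>();
        parser.require(XmlPullParser.START_TAG, null, "item");
        while (parser.next() != XmlPullParser.END_TAG) {
            if (parser.getEventType() != XmlPullParser.START_TAG) {
                continue; // e.g. whitespace text between child elements
            }
            String name = parser.getName();
            switch (name) {
                case "link":
                case "guid":
                case "category":
                case "title":
                case "pubDate":
                    // nextText() leaves the parser on the child's END_TAG
                    fields.put(name, parser.nextText());
                    break;
                default:
                    // consume the whole unknown element, nested children included
                    skip(parser);
            }
        }
        return fields;
    }

    // Advances from a START_TAG to its matching END_TAG by tracking nesting depth.
    static void skip(XmlPullParser parser) throws IOException, XmlPullParserException {
        if (parser.getEventType() != XmlPullParser.START_TAG) {
            throw new IllegalStateException("skip() must start on a START_TAG");
        }
        int depth = 1;
        while (depth != 0) {
            int event = parser.next();
            if (event == XmlPullParser.START_TAG) {
                depth++;
            } else if (event == XmlPullParser.END_TAG) {
                depth--;
            }
        }
    }

    // Hypothetical helper for trying the method on a single <item> string.
    static Map<String, String> parseSingleItem(String xml)
            throws IOException, XmlPullParserException {
        XmlPullParser parser = XmlPullParserFactory.newInstance().newPullParser();
        parser.setInput(new StringReader(xml));
        parser.nextTag(); // move from START_DOCUMENT to <item>
        return processItem(parser);
    }
}
```

Because every child element is either read with nextText() or skipped, the only END_TAG the while loop can see is </item> itself. Note that nextText() throws an XmlPullParserException if the element contains nested tags, so fields that may hold markup are better handled with skip() or a dedicated reader.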
Coalfish answered 2/7, 2013 at 19:43 Comment(9)
Have a look at my edit in my post above. My question is rather about how I may parse the elements between the <item> and </item> tags. Donau
Try the edit. It should work. developer.android.com/reference/org/xmlpull/v1/… Coalfish
@Coalfish Can you send me a link to an example of parsing between the <item> and </item> tags? Photoperiod
@Coalfish Can you explain this with a bit more of an example? Photoperiod
@Photoperiod Where do you need an explanation? You can also copy the XML to the assets folder and use the same XML that the OP has in the question. Coalfish
@Coalfish Please take a look at #24734812. Photoperiod
@Photoperiod And the problem is? Coalfish
@Photoperiod I answered your post. The rest is up to you. Good luck. I could not reply sooner. Coalfish
For something like a LINK that comes without a closing tag and carries its URL in the HREF attribute, here's how to get it: String linkUrl = xpp.getAttributeValue(null, HREF); Cagle
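
The attribute lookup from the last comment can be sketched as follows. This is a minimal example assuming the attribute name is the literal "href" and that the first <link> carrying it is the one wanted. The LinkHref class and the firstLinkHref method are hypothetical names, and an XmlPullParser implementation must be on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

public class LinkHref {

    // Returns the href attribute of the first <link> that has one, or null if none does.
    static String firstLinkHref(String xml) throws IOException, XmlPullParserException {
        XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware(false);
        XmlPullParser xpp = factory.newPullParser();
        xpp.setInput(new StringReader(xml));

        int eventType = xpp.getEventType();
        while (eventType != XmlPullParser.END_DOCUMENT) {
            if (eventType == XmlPullParser.START_TAG
                    && xpp.getName().equalsIgnoreCase("link")) {
                // With namespaces disabled, the namespace argument must be null.
                String href = xpp.getAttributeValue(null, "href");
                if (href != null) {
                    return href;
                }
            }
            eventType = xpp.next(); // move to next event
        }
        return null;
    }
}
```

getAttributeValue returns null when the attribute is absent, so a <link> that holds its URL as element text (as in the question's XML) is passed over here and would still need nextText().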
