How to get an attribute from an XMLReader
I have some HTML that I'm converting to a Spanned using Html.fromHtml(...), and I have a custom tag that I'm using in it:

<customtag id="1234">

So I've implemented a TagHandler to handle this custom tag, like so:

public void handleTag( boolean opening, String tag, Editable output, XMLReader xmlReader ) {

    if ( tag.equalsIgnoreCase( "customtag" ) ) {

        String id = xmlReader.getProperty( "id" ).toString();
    }
}

In this case I get a SAX exception, as I believe the "id" field is actually an attribute, not a property. However, there isn't a getAttribute() method for XMLReader. So my question is, how do I get the value of the "id" field using this XMLReader? Thanks.

Maybellmaybelle answered 5/8, 2011 at 6:22 Comment(8)
Where is TagHandler? The usual way to do SAX2 is to use ContentHandlers, no?Ushas
TagHandler is used when converting HTML text to Spannable text via Html.fromHtml(String, ImageGetter, TagHandler). It's for handling unknown tags (tags not recognized by TagSoup).Maybellmaybelle
I see. I just tagged the question with TagSoup so those familiar with this parser can find the question. I do know that in the regular SAX2 parser in the standard Java libraries you just setup ContentHandlers, not TagHandlers, and the startElement callback has the attributes already present.Ushas
I had the same problem and when I looked at the Android source code, I saw that the attributes are intentionally not passed. So I replace tags with attributes with other tags which have a specific name. Like <customtag1234> in your case.Gertiegertrud
@rekire No, I didn't. I ended up doing what vorrtex suggested.Maybellmaybelle
I found a solution by using reflection on the xmlReader. Inside it is a field, theElement, where I found the attributes. I can post the code next week.Jandy
@Jandy Sure, feel free. I'll try it out once you post it and accept it if it works.Maybellmaybelle
Does anyone have experience with replacements like github.com/NightWhistler/HtmlSpanner or github.com/commonsguy/cwac-richedit?Actinozoan
J
9

Here is my code to get the private attributes of the xmlReader by reflection:

Field elementField = xmlReader.getClass().getDeclaredField("theNewElement");
elementField.setAccessible(true);
Object element = elementField.get(xmlReader);
Field attsField = element.getClass().getDeclaredField("theAtts");
attsField.setAccessible(true);
Object atts = attsField.get(element);
Field dataField = atts.getClass().getDeclaredField("data");
dataField.setAccessible(true);
String[] data = (String[])dataField.get(atts);
Field lengthField = atts.getClass().getDeclaredField("length");
lengthField.setAccessible(true);
int len = (Integer)lengthField.get(atts);

String myAttributeA = null;
String myAttributeB = null;

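// each attribute occupies five consecutive slots in data:
// offset 1 holds the name and offset 4 holds the value
for(int i = 0; i < len; i++) {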
    if("attrA".equals(data[i * 5 + 1])) {
        myAttributeA = data[i * 5 + 4];
    } else if("attrB".equals(data[i * 5 + 1])) {
        myAttributeB = data[i * 5 + 4];
    }
}

Note you could put the values into a map but for my usage that's too much overhead.
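For readers who do want the map, here is a minimal sketch of that step in plain Java. It assumes only what the loop above already relies on: five slots per attribute, with the name at offset 1 and the value at offset 4. The class name FlatAttributes, the method toMap, and the hand-built array are hypothetical stand-ins for the reflected data and length fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FlatAttributes {
    // Assumed layout, taken from the indices used in the reflection loop:
    // five slots per attribute, name at offset 1, value at offset 4.
    static final int STRIDE = 5;
    static final int NAME_OFFSET = 1;
    static final int VALUE_OFFSET = 4;

    /** Copies the first {@code count} attributes of the flat array into a map. */
    static Map<String, String> toMap(String[] data, int count) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            result.put(data[i * STRIDE + NAME_OFFSET], data[i * STRIDE + VALUE_OFFSET]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the reflected "data" array; the other three
        // slots per attribute are placeholders that toMap() never reads.
        String[] data = {
            "", "id", "id", "CDATA", "1234",
            "", "value", "value", "CDATA", "a"
        };
        System.out.println(toMap(data, 2));
    }
}
```

Passing the reflected length as count matters, because the backing array may be larger than the number of attributes actually in use.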

Jandy answered 4/3, 2013 at 7:8 Comment(3)
Hmmm, this keeps throwing java.lang.NoSuchFieldException - what am I doing wrong?Nanceynanchang
@Nanceynanchang there are three possibilities: the field does not exist anymore (it might be an incompatible version); you are accessing a private field and forgot to call setAccessible(true); or the field is declared in a base class, in which case you need to inspect the superclass.Jandy
I got it working - trick was not to call as the first thing inside the handleTag method :) Thanx for help though - always helps when someone pushes you through the rough patches of life...Nanceynanchang
V
11

It is possible to use the XMLReader provided to the TagHandler and get access to tag attribute values without reflection, but that method is even less straightforward than reflection. The trick is to replace the ContentHandler used by the XMLReader with a custom object. Replacing the ContentHandler can only be done in the call to handleTag(), which is only invoked for tags that are not handled by default. That makes it a problem to get attribute values for the first tag, which can be solved by adding a custom tag at the start of the html.

import android.text.Editable;
import android.text.Html;
import android.text.Spanned;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

import java.util.ArrayDeque;

public class HtmlParser implements Html.TagHandler, ContentHandler
{
    public interface TagHandler
    {
        boolean handleTag(boolean opening, String tag, Editable output, Attributes attributes);
    }

    public static Spanned buildSpannedText(String html, TagHandler handler)
    {
        // add a tag at the start that is not handled by default,
        // allowing custom tag handler to replace xmlReader contentHandler
        return Html.fromHtml("<inject/>" + html, null, new HtmlParser(handler));
    }

    public static String getValue(Attributes attributes, String name)
    {
        for (int i = 0, n = attributes.getLength(); i < n; i++)
        {
            if (name.equals(attributes.getLocalName(i)))
                return attributes.getValue(i);
        }
        return null;
    }

    private final TagHandler handler;
    private ContentHandler wrapped;
    private Editable text;
    private ArrayDeque<Boolean> tagStatus = new ArrayDeque<>();

    private HtmlParser(TagHandler handler)
    {
        this.handler = handler;
    }

    @Override
    public void handleTag(boolean opening, String tag, Editable output, XMLReader xmlReader)
    {
        if (wrapped == null)
        {
            // record result object
            text = output;

            // record current content handler
            wrapped = xmlReader.getContentHandler();

            // replace content handler with our own, which forwards calls to the original when needed
            xmlReader.setContentHandler(this);

            // handle endElement() callback for <inject/> tag
            tagStatus.addLast(Boolean.FALSE);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException
    {
        boolean isHandled = handler.handleTag(true, localName, text, attributes);
        tagStatus.addLast(isHandled);
        if (!isHandled)
            wrapped.startElement(uri, localName, qName, attributes);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException
    {
        if (!tagStatus.removeLast())
            wrapped.endElement(uri, localName, qName);
        handler.handleTag(false, localName, text, null);
    }

    @Override
    public void setDocumentLocator(Locator locator)
    {
        wrapped.setDocumentLocator(locator);
    }

    @Override
    public void startDocument() throws SAXException
    {
        wrapped.startDocument();
    }

    @Override
    public void endDocument() throws SAXException
    {
        wrapped.endDocument();
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException
    {
        wrapped.startPrefixMapping(prefix, uri);
    }

    @Override
    public void endPrefixMapping(String prefix) throws SAXException
    {
        wrapped.endPrefixMapping(prefix);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException
    {
        wrapped.characters(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException
    {
        wrapped.ignorableWhitespace(ch, start, length);
    }

    @Override
    public void processingInstruction(String target, String data) throws SAXException
    {
        wrapped.processingInstruction(target, data);
    }

    @Override
    public void skippedEntity(String name) throws SAXException
    {
        wrapped.skippedEntity(name);
    }
}

With this class reading attributes is easy:

    HtmlParser.buildSpannedText("<x id=1 value=a>test<x id=2 value=b>", new HtmlParser.TagHandler()
    {
        @Override
        public boolean handleTag(boolean opening, String tag, Editable output, Attributes attributes)
        {
            if (opening && tag.equals("x"))
            {
                String id = HtmlParser.getValue(attributes, "id");
                String value = HtmlParser.getValue(attributes, "value");
            }
            return false;
        }
    });

This approach has the advantage that it lets you disable processing of some tags while keeping the default processing for others, e.g. you can make sure that ImageSpan objects are not created:

    Spanned result = HtmlParser.buildSpannedText("<b><img src=nothing>test</b><img src=zilch>",
            new HtmlParser.TagHandler()
            {
                @Override
                public boolean handleTag(boolean opening, String tag, Editable output, Attributes attributes)
                {
                    // return true here to indicate that this tag was handled and
                    // should not be processed further
                    return tag.equals("img");
                }
            });
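
The forwarding logic can be tried outside Android as well. The following is only a sketch under stated assumptions: it uses the JDK's own SAX parser instead of the parser behind Html.fromHtml, so the input has to be well-formed XML, and the names FilteringHandlerDemo, FilteringHandler and parse are made up for the illustration. It shows the same idea as the class above: a stack of booleans remembers which open elements were intercepted, so that the matching endElement() call is suppressed too, and the intercepting handler sees the attributes.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class FilteringHandlerDemo {
    /** Forwards element events to a wrapped handler unless the tag is intercepted. */
    static class FilteringHandler extends DefaultHandler {
        private final ContentHandler wrapped;
        private final String interceptedTag;
        private final List<String> log;
        // One entry per open element: true if we handled it ourselves.
        private final ArrayDeque<Boolean> handled = new ArrayDeque<>();

        FilteringHandler(ContentHandler wrapped, String interceptedTag, List<String> log) {
            this.wrapped = wrapped;
            this.interceptedTag = interceptedTag;
            this.log = log;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            // The default JDK parser is not namespace-aware, so the name is in qName.
            boolean isHandled = qName.equals(interceptedTag);
            handled.addLast(isHandled);
            if (isHandled) {
                // Attributes are available here, unlike in Html.TagHandler.handleTag().
                log.add("intercepted " + qName + " src=" + atts.getValue("src"));
            } else {
                wrapped.startElement(uri, localName, qName, atts);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            // Only forward the end event if the start event was forwarded.
            if (!handled.removeLast()) {
                wrapped.endElement(uri, localName, qName);
            }
        }
    }

    /** Parses xml and returns what the wrapped handler and the filter observed, in order. */
    static List<String> parse(String xml, String interceptedTag) {
        final List<String> log = new ArrayList<>();
        DefaultHandler wrapped = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new FilteringHandler(wrapped, interceptedTag, log));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(parse("<b><img src=\"nothing\"/>test</b>", "img"));
    }
}
```

The wrapped handler never hears about the img element, while the filter still gets to read its src attribute.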
Virescent answered 10/4, 2016 at 9:39 Comment(3)
I just tried this and the handler part works really good, however the built-in parser is somehow messed up, <tag attr="a & b">text</tag> is parsed as attributes = {attr="a", _="_", b="b"}. Using &amp; is similarly bad.Addressee
@Addressee for me String testStr = "<tag attr=\"a & b\">text</tag>"; gets parsed correctly. I get the problem you described by parsing String testStr = "<tag attr=a & b>text</tag>";, which is not really a valid html.Virescent
Hmm, you're using hard-coded Java, I read the value from resources: <string name="..."><![CDATA[ ... <tag...> ... ]]></string>. Maybe something got lost in transmission while the framework read the XML. I worked around it by using Java identifiers (e.g. enum constant name) instead of plain text in attributes, so they're only a "word" long.Addressee
N
9

Based on the answer by rekire I made this slightly more robust solution that will handle any tag.

private TagHandler tagHandler = new TagHandler() {
    final HashMap<String, String> attributes = new HashMap<String, String>();

    private void processAttributes(final XMLReader xmlReader) {
        try {
            Field elementField = xmlReader.getClass().getDeclaredField("theNewElement");
            elementField.setAccessible(true);
            Object element = elementField.get(xmlReader);
            Field attsField = element.getClass().getDeclaredField("theAtts");
            attsField.setAccessible(true);
            Object atts = attsField.get(element);
            Field dataField = atts.getClass().getDeclaredField("data");
            dataField.setAccessible(true);
            String[] data = (String[])dataField.get(atts);
            Field lengthField = atts.getClass().getDeclaredField("length");
            lengthField.setAccessible(true);
            int len = (Integer)lengthField.get(atts);

            /**
             * MSH: Look for supported attributes and add to hash map.
             * This is as tight as things can get :)
             * The data index is "just" where the keys and values are stored. 
             */
            for(int i = 0; i < len; i++)
                attributes.put(data[i * 5 + 1], data[i * 5 + 4]);
        }
        catch (Exception e) {
            Log.d(TAG, "Exception: " + e);
        }
    }
...

And inside handleTag do:

    @Override
    public void handleTag(boolean opening, String tag, Editable output, XMLReader xmlReader) {

        processAttributes(xmlReader);
...

And then the attributes will be accessible as so:

attributes.get("my attribute name");

Nanceynanchang answered 2/7, 2014 at 15:8 Comment(0)
A
1

There's an alternative to the other solutions. It doesn't let you use custom tags, but it achieves the same effect with the built-in <annotation> tag in a string resource:

<string name="foobar">blah <annotation customTag="1234">inside blah</annotation> more blah</string>

Then read it like this:

CharSequence annotatedText = context.getText(R.string.foobar);
// wrap, because getText returns a SpannedString, which is not mutable
CharSequence processedText = replaceCustomTags(new SpannableStringBuilder(annotatedText));

public static <T extends Spannable> T replaceCustomTags(T text) {
    Annotation[] annotations = text.getSpans(0, text.length(), Annotation.class);
    for (Annotation a : annotations) {
        String attrName = a.getKey();
        if ("customTag".equals(attrName)) {
            String attrValue = a.getValue();
            int contentStart = text.getSpanStart(a);
            int contentEnd = text.getSpanEnd(a);
            int contentFlags = text.getSpanFlags(a);
            Object newFormat1 = new StyleSpan(Typeface.BOLD);
            Object newFormat2 = new ForegroundColorSpan(Color.RED);
            text.setSpan(newFormat1, contentStart, contentEnd, contentFlags);
            text.setSpan(newFormat2, contentStart, contentEnd, contentFlags);
            text.removeSpan(a);
        }
    }
    return text;
}

Depending on what you want to do with your custom tags, the above may help you. If you just want to read them, you don't need a SpannableStringBuilder; just cast the result of getText to the Spanned interface and inspect its spans.

Note that Annotation, which represents <annotation foo="bar">...</annotation>, has been an Android built-in since API level 1! It's one of those hidden gems again. It has the limitation of one attribute per <annotation> tag, but nothing prevents you from nesting multiple annotations to achieve multiple attributes:

<string name="gold_admin_user"><annotation user="admin"><annotation rank="gold">$$username$$</annotation></annotation></string>

If you use the Editable interface instead of Spannable you can also modify the content around each annotation. For example changing the above code:

String attrValue = a.getValue();
text.insert(text.getSpanStart(a), attrValue);
text.insert(text.getSpanStart(a) + attrValue.length(), " ");
int contentStart = text.getSpanStart(a);

will result as if you had this in the XML:

blah <b><font color="#ff0000">1234 inside blah</font></b> more blah

One caveat to look out for: when you make modifications that change the length of the text, the spans move around. Make sure you read the span start/end indices at the correct times; it is best to inline them into the method calls.

Editable also allows you to do simple search and replace substitution:

int index = TextUtils.indexOf(text, needle); // for example $$username$$ above
if (index >= 0) { // indexOf returns -1 when the needle is absent
    text.replace(index, index + needle.length(), replacement);
}
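
If the placeholder can occur more than once, the same two calls can be put in a loop. The sketch below uses a StringBuilder as a stand-in for Editable so that it runs on a plain JVM; that substitution is an assumption for illustration, and on Android you would call TextUtils.indexOf(text, needle, start) and Editable.replace instead. The class and method names are hypothetical.

```java
class PlaceholderReplace {
    /**
     * Replaces every occurrence of needle in text, in place.
     * Returns the number of replacements made.
     */
    static int replaceAll(StringBuilder text, String needle, String replacement) {
        if (needle.isEmpty()) {
            return 0; // an empty needle would match everywhere
        }
        int count = 0;
        int index = text.indexOf(needle);
        while (index >= 0) {
            text.replace(index, index + needle.length(), replacement);
            count++;
            // Resume after the inserted text, so a replacement that itself
            // contains the needle cannot cause an endless loop.
            index = text.indexOf(needle, index + replacement.length());
        }
        return count;
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder("Hi $$username$$, bye $$username$$");
        int n = replaceAll(text, "$$username$$", "Ann");
        System.out.println(n + " replaced: " + text);
    }
}
```

Searching again from the end of the inserted text, rather than from the old index, is the same "read indices at the right time" rule that applies to span positions.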
Addressee answered 29/7, 2016 at 22:29 Comment(0)
N
0

If all you need is just one attribute, the suggestion by vorrtex is actually pretty solid. To show just how simple it is to handle, take this example, where the value is encoded in the tag name:

<xml>Click on <user1>Johnni</user1> or <user2>Jenny</user2> to see...</xml>

In your custom TagHandler you then match the tag name by its prefix, using indexOf instead of equals:

final static String USER = "user";
if(tag.indexOf(USER) == 0) {
    // Extract tag postfix.
    String postfix = tag.substring(USER.length());
    Log.d(TAG, "postfix: " + postfix);
}

And you can then pass the postfix value in your onClick view parameter as a tag to keep it generic.
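
A slightly more defensive version of the same idea, as a sketch in plain Java: it rejects tags that merely start with the prefix but carry no numeric suffix. The class name TagSuffix, the method userIndex, and the convention of returning -1 for "no match" are assumptions made for the example.

```java
class TagSuffix {
    static final String USER = "user";

    /** Returns the numeric suffix of tags like "user12", or -1 if the tag does not match. */
    static int userIndex(String tag) {
        if (!tag.startsWith(USER)) {
            return -1;
        }
        String postfix = tag.substring(USER.length());
        if (postfix.isEmpty()) {
            return -1; // plain "user" carries no value
        }
        for (int i = 0; i < postfix.length(); i++) {
            char c = postfix.charAt(i);
            if (c < '0' || c > '9') {
                return -1; // e.g. "username" or "user-1"
            }
        }
        try {
            return Integer.parseInt(postfix);
        } catch (NumberFormatException e) {
            return -1; // too many digits to fit in an int
        }
    }

    public static void main(String[] args) {
        System.out.println(userIndex("user1"));
        System.out.println(userIndex("username"));
    }
}
```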

Nanceynanchang answered 2/7, 2014 at 11:35 Comment(0)
