StAX XMLStreamReader: check the next event without moving ahead

I have a large XML file that consists of many events, and I would like to unmarshal them. Because the file is large, I want to unmarshal the events one by one so that the whole file is never held in memory. This works for some events but fails for others: by the time I know which class an event should map to, the reader has already moved on to the next event.

Note: I am aware of XMLEventReader, but most sources describe it as not very memory efficient, so I am trying to accomplish this with XMLStreamReader.

Following is the sample XML file that contains the events:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<extension>
    <extension>
        <c>
            <name>CName</name>
            <age>CAge</age>
        </c>
    </extension>
</extension>
<extension>
    <b>
        <name>BName</name>
        <age>BAge</age>
    </b>
</extension>
<a>
    <name>AName</name>
    <age>AAge</age>
</a>
<extension>
    <b>
        <name>BName</name>
        <age>BAge</age>
    </b>
</extension>

I have 3 classes corresponding to them which will be used for unmarshalling:

@XmlRootElement(name = "a")
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "a", propOrder = {"name","age"})
public class A
{
    private String name;
    private String age;
    //Getter, Setter and other constructors
}

@XmlRootElement(name = "extension")
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "extension", propOrder = {"name","age"})
public class B
{
    @XmlPath("b/name/text()")
    private String name;
    @XmlPath("b/age/text()")
    private String age;
    //Getter, Setter and other constructors
}

@XmlRootElement(name = "extension")
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "extension", propOrder = {"name","age"})
public class C
{
    @XmlPath("extension/c/name/text()")
    private String name;
    @XmlPath("extension/c/age/text()")
    private String age;
    //Getter, Setter and other constructors
}

Following is my Main class which will be used for unmarshalling:

public class Main{
     
    private Unmarshaller unmarshaller = null;
    private JAXBContext jaxbContext = null;
  
    public void unmarshaller(InputStream xmlStream) throws IOException, XMLStreamException, JAXBException {
        final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        final XMLStreamReader streamReader = xmlInputFactory.createXMLStreamReader(xmlStream);
        
        //Advance from the start of the document to the first event
        streamReader.next();

        //Read Until the end of the file
        while (streamReader.hasNext()) {
            //Check if the element is "extension"; if so, it's class B or C
            if (streamReader.isStartElement() && streamReader.getLocalName().equalsIgnoreCase("extension")) {

                //Check if the next element is also "extension"; if so, it's class C
                //This is the IMPORTANT step for choosing between class B and C, and it is what confuses me
                streamReader.next();

                if (streamReader.isStartElement() && streamReader.getLocalName().equalsIgnoreCase("extension")) {
                    //If there are two nested "extension" tags, it's class C
                    classSpecifier(C.class);
                    final C cInfo = unmarshaller.unmarshal(streamReader, C.class).getValue();
                    System.out.println(cInfo);
                }else{
                    //If there is no nested "extension" tag, it's class B
                    //THIS IS WHERE IT FAILS: THE READER HAS ALREADY MOVED TO THE NEXT ELEMENT TO CHECK
                    //FOR "extension", SO IT CAN NO LONGER MAP THE WHOLE ELEMENT TO CLASS B
                    classSpecifier(B.class);
                    final B bInfo = unmarshaller.unmarshal(streamReader, B.class).getValue();
                    System.out.println(bInfo);
                }
            }else if(streamReader.isStartElement() && streamReader.getLocalName().equalsIgnoreCase("a")){
                //If the element is "a", it's class A
                classSpecifier(A.class);
                final A aInfo = unmarshaller.unmarshal(streamReader, A.class).getValue();
                System.out.println(aInfo);
            }
        }
    }
    
    //Initializes the JAXBContext and Unmarshaller for the given event type
    private void classSpecifier(Class<?> eventTypeClass) throws JAXBException {
        this.jaxbContext = JAXBContext.newInstance(eventTypeClass);
        unmarshaller = jaxbContext.createUnmarshaller();
    }
    
    public static void main(String args[]){
        try{
            InputStream xmlStream = Main.class.getClassLoader().getResourceAsStream("InputEPCISEvents.xml");
            //unmarshaller() is an instance method, so it needs a Main instance
            new Main().unmarshaller(xmlStream);
        } catch (Exception e) {
            System.out.println(e);
            e.printStackTrace();
        }
    }
}

The problem I am facing is differentiating between classes B and C.

  1. I need to check whether the incoming localName is extension.
  2. If it is extension, I need to check whether the next element's localName is also extension.
  3. If so, it's class C; if not, it's class B.
  4. Step 2 has already advanced the reader with streamReader.next(). If that element is not extension, the reader can no longer map it to class B, because the outer extension start tag has been consumed and the reader no longer sees the whole element.
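
The issue in step 4 can be reproduced without JAXB. The following is only a minimal sketch; the class name ForwardOnlyDemo and the inline XML are illustrative and not part of my real code:

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ForwardOnlyDemo {

    // Returns the local name the cursor points at after the "look ahead" step.
    static String nameAfterLookAhead(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        reader.nextTag(); // cursor is now on the outer <extension>
        reader.nextTag(); // "look ahead": the cursor itself moves to the child
        // XMLStreamReader has no previous() or reset(), so the outer
        // start tag cannot be handed to an unmarshaller any more.
        return reader.getLocalName();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<extension><b><name>BName</name><age>BAge</age></b></extension>";
        System.out.println(nameAfterLookAhead(xml)); // prints "b"
    }
}
```

After the second nextTag() the reader is positioned on b, which is why unmarshalling class B (rooted at extension) from that point fails.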

I am looking for a solution that lets me do one of the following:

  1. If the element in the second check is not extension, go back to the previous element and map the whole element to class B.
  2. Assign streamReader to a temporary reader (tempreader) before the check, so that only tempreader advances. I tried this, but it also fails.

Is there a way to go back to the previous element in a stream, or how else can I tackle this issue? I hope I was able to provide a complete explanation.

Fogbow answered 24/5, 2021 at 6:25 Comment(1)
I even tried XMLEventReader, but many articles mention that it is not very memory efficient, so I am still looking for an answer. Any suggestion would be really helpful. - Fogbow

"Going back" in a stream implies some kind of memory, so there is no point in sticking to the most memory-efficient tool.

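A lookahead of a single event is enough for this case. As a minimal sketch (the class name PeekDemo and the inline XML are made up for illustration), XMLEventReader.peek() reports the upcoming event without consuming it, so the following nextEvent() still returns the same start element:

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class PeekDemo {

    // Returns "<peeked name>,<consumed name>" for the first start element.
    static String peekThenConsume(String xml) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        // Skip everything (e.g. the start-of-document event) before the first tag.
        while (!reader.peek().isStartElement()) {
            reader.nextEvent();
        }
        XMLEvent peeked = reader.peek();        // look at the event, do not consume it
        XMLEvent consumed = reader.nextEvent(); // the same event is still available
        return peeked.asStartElement().getName().getLocalPart()
                + "," + consumed.asStartElement().getName().getLocalPart();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<extension><b><name>BName</name></b></extension>";
        System.out.println(peekThenConsume(xml)); // prints "extension,extension"
    }
}
```
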
XMLEventReader can handle this with ease, because peek() returns the next event without consuming it:

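import java.io.InputStream;
// Assumes the javax.xml.bind (JAXB 2.x) API; with Jakarta XML Binding 3+ use jakarta.xml.bind instead
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.XMLEvent;

public class Main {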

    public static void main(String args[]) throws Exception {
        Unmarshaller aUnmarshaller = JAXBContext.newInstance(A.class).createUnmarshaller();
        Unmarshaller bUnmarshaller = JAXBContext.newInstance(B.class).createUnmarshaller();
        Unmarshaller cUnmarshaller = JAXBContext.newInstance(C.class).createUnmarshaller();
        try (InputStream input = Main.class.getResourceAsStream("InputEPCISEvents.xml")) {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(input);
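            while (reader.hasNext()) {
                // peek() returns the upcoming event without consuming it
                XMLEvent event = reader.peek();
                if (event.isStartElement()) {
                    switch (event.asStartElement().getName().getLocalPart()) {
                        case "a" -> System.out.println(aUnmarshaller.unmarshal(reader));
                        case "b" -> System.out.println(bUnmarshaller.unmarshal(reader));
                        case "c" -> System.out.println(cUnmarshaller.unmarshal(reader));
                        // Any other start element (the wrapper, "extension") is skipped
                        default -> reader.nextEvent();
                    }
                } else {
                    // unmarshal() consumes the whole element itself, so only advance here
                    reader.nextEvent();
                }
            }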
        }
    }

    @XmlAccessorType(XmlAccessType.FIELD)
    static class ABC {
        String name;
        String age;

        public String toString() {
            return getClass().getSimpleName() + "{name='" + name + "', age='" + age + "'}";
        }
    }
    @XmlRootElement static class A extends ABC {}
    @XmlRootElement static class B extends ABC {}
    @XmlRootElement static class C extends ABC {}
}

Output:

C{name='CName', age='CAge'}
B{name='BName', age='BAge'}
A{name='AName', age='AAge'}
B{name='BName', age='BAge'}
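
If XMLStreamReader is still preferred, the same idea (dispatch on a, b or c instead of on extension) needs no lookahead, because the inner element name alone identifies the type. The following is a rough sketch without JAXB, under the assumption that the input has a single wrapper root; the class name InnerElementDispatch, the events wrapper and the skipElement helper are all hypothetical:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class InnerElementDispatch {

    // Collects the names of the payload elements (a, b, c) in document order.
    static List<String> payloadNames(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> found = new ArrayList<>();
        while (reader.hasNext()) {
            if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                continue;
            }
            String name = reader.getLocalName();
            if (name.equals("a") || name.equals("b") || name.equals("c")) {
                found.add(name);
                // This is the point where a JAXB unmarshal call would go.
                skipElement(reader);
            }
            // Wrapper elements such as <extension> are simply passed over.
        }
        return found;
    }

    // Hypothetical helper: moves the cursor to the END_ELEMENT that matches
    // the START_ELEMENT it is currently positioned on.
    static void skipElement(XMLStreamReader reader) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<events>"
                + "<extension><extension><c><name>CName</name><age>CAge</age></c></extension></extension>"
                + "<extension><b><name>BName</name><age>BAge</age></b></extension>"
                + "<a><name>AName</name><age>AAge</age></a>"
                + "</events>";
        System.out.println(payloadNames(xml)); // prints [c, b, a]
    }
}
```

One caveat when swapping skipElement for a real unmarshal call: according to the JAXB documentation, unmarshalling from an XMLStreamReader leaves the cursor on the token right after the end element, so the loop would have to inspect the current event before calling next() again.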

By the way, your XML needs to be wrapped in a parent element as it contains more than one root element.
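
The wrapper requirement can be checked with a small experiment. This is only a sketch: the class name SingleRootDemo and the events wrapper are invented for the example, and the inline fragments are shortened versions of the sample data.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class SingleRootDemo {

    // Counts start elements; throws XMLStreamException if the input is not well-formed.
    static int countStartElements(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        int count = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        String fragments = "<a><name>AName</name></a><b><name>BName</name></b>";
        try {
            countStartElements(fragments);
            System.out.println("parsed without error");
        } catch (XMLStreamException e) {
            // A second top-level element is not well-formed XML.
            System.out.println("rejected: more than one root element");
        }
        // Wrapped in a single parent, the same content parses completely.
        System.out.println(countStartElements("<events>" + fragments + "</events>"));
    }
}
```

Note that an XML declaration such as the one at the top of the sample file is only allowed at the very start of a document, so it has to stay before the wrapper element rather than inside it.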

Perak answered 5/4, 2022 at 0:20 Comment(0)
