Jasper Reports OutOfMemoryError on export
B

3

5

I have written a web app for managing and running Jasper reports. Lately I've been working with some reports that generate extremely large (1500+ page) outputs, and attempting to resolve the resultant memory issues. I have discovered the JRFileVirtualizer, which has allowed me to run the report successfully with a very limited memory footprint. However, one of the features of my application is that it stores output files from previously run reports, and allows them to be exported to various formats (PDF, CSV, etc.). Therefore, I find myself in the situation of having a 500+MB .jrprint file and wanting to export it to, for example, CSV on demand. Here is some simplified example code:

JRCsvExporter exporter = new JRCsvExporter();
exporter.setParameter(JRExporterParameter.INPUT_FILE_NAME, jrprintPath);
exporter.setParameter(JRExporterParameter.OUTPUT_STREAM, outputStream);
exporter.exportReport();

Unfortunately, when I attempt this on the large file I mentioned, I get an OutOfMemoryError:

Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.io.ObjectInputStream$HandleTable.grow(ObjectInputStream.java:3421)
    at java.io.ObjectInputStream$HandleTable.assign(ObjectInputStream.java:3227)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1744)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at java.util.ArrayList.readObject(ArrayList.java:593)
    at sun.reflect.GeneratedMethodAccessor184.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at net.sf.jasperreports.engine.base.JRVirtualPrintPage.readObject(JRVirtualPrintPage.java:423)
    ...

From browsing some of the Jasper internals, it looks like no matter how I attempt to set up this export (I have also tried loading and setting the JASPER_PRINT parameter directly), there will ultimately be a call to JRLoader.loadObject(...) which will attempt to load my entire 500MB report into memory (see net.sf.jasperreports.engine.JRAbstractExporter.setInput()).

My question is, is there a way around this that doesn't involve just throwing memory at the problem? 500MB is doable, but it doesn't leave my application very future-proof, and the JRVirtualizer solution for report execution leaves me hoping there will be something similar for export. I am willing to get my hands dirty and extend some of the Jasper internal classes, but the ideal solution would be one provided by Jasper itself, for obvious reasons.

Bergin answered 26/9, 2011 at 19:49 Comment(3)
A 1500+ page report is so large that it is useless. Maybe consider breaking these reports up. - Avilla
Unfortunately it's a case of using a general tool for many different purposes. This report (in CSV form) is imported by script into another database, not browsed by a human being. - Bergin
I have filed a feature request with JasperSoft, as it looks like project budget constraints are going to prevent me from pursuing more elaborate solutions like the one suggested by gpeche below. jasperforge.org/projects/jasperreports/tracker/view.php?id=5478 - Bergin
B
5

Since posting this question, I have also filed a feature request with JasperSoft. As a follow-up, I was pointed to the JRVirtualizationHelper.setThreadVirtualizer method. This method allows you to set a JRVirtualizer associated with the current thread, which will be used during JasperPrint deserialization.

I have tested this in my project with satisfactory results. It seems the feature I was hoping existed does indeed exist, although its visibility in the API could potentially be improved.

Code sample:

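// Keep at most 1000 virtualized objects in memory; the rest are paged out to a swap file.
// JRSwapFile takes the swap directory, the block size, and the minimum grow count.
JRVirtualizer virtualizer = new JRSwapFileVirtualizer(1000, new JRSwapFile(reportFilePath, 2048, 1024), true);
JRVirtualizationHelper.setThreadVirtualizer(virtualizer);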
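
To illustrate what "associated with the current thread" implies, here is a minimal sketch in plain Java. It assumes the helper keeps the virtualizer in a per-thread slot such as a ThreadLocal, which I have not confirmed in the Jasper source. ThreadSlot, runWith, and seenFromNewThread are hypothetical names, and a String stands in for the virtualizer.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a per-thread registry; this is not Jasper's code.
final class ThreadSlot {
    private static final ThreadLocal<String> SLOT = new ThreadLocal<>();

    static void set(String value) { SLOT.set(value); }

    static String get() { return SLOT.get(); }

    static void clear() { SLOT.remove(); }

    // Installs the value, runs the task on the calling thread, and always clears the slot.
    static <T> T runWith(String value, Supplier<T> task) {
        set(value);
        try {
            return task.get();
        } finally {
            clear();
        }
    }

    // Reports what a freshly started thread finds in its own slot.
    static String seenFromNewThread() {
        String[] seen = new String[1];
        Thread worker = new Thread(() -> seen[0] = get());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

If that assumption holds, the virtualizer has to be set on whichever thread performs the load and export, and a value set on one thread is invisible to others. On pooled request threads it seems prudent to clear it in a finally block, as runWith does here, so that it does not leak into a later request.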
Bergin answered 28/9, 2011 at 14:52 Comment(1)
Do you need to call clearThreadVirtualizer(), or is it automatically cleaned up somewhere? - Victoriavictorian
I
2

I think your problem is that a .jrprint is a serialized Java object that you must deserialize completely. You need to break it somehow into small files and then concatenate the outputs at export time.

My proposal is a bit involved but I think it might work, at least for some cases:

  1. Fill your report using a JRVirtualizer. Use the methods that return a JasperPrint instance, to avoid dumping everything to a huge .jrprint.
  2. Do an internal export using a JRXmlExporter. The trick would be to use the appropriate JRExporterParameter values to tell Jasper to export each page separately (you can use a ZipOutputStream as a container to avoid directories with lots of files).
  3. When you want to do your real export, use a JASPER_PRINT_LIST. It is important that the list implementation is lazy and creates JasperPrint instances one by one using JRPrintXmlLoader, so you do not need to load the whole thing at once.

Anyway, you should inspect Jasper source code to check if this approach is doable.
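
The container and lazy-list ideas in steps 2 and 3 can be sketched in plain Java, independent of Jasper. This is only an illustration under assumptions: PagedArchive, write, and lazyPages are hypothetical names, the "page-N" entry naming is invented, and each page is a plain String where the real thing would hold one page of JRXmlExporter output and rebuild a JasperPrint with JRPrintXmlLoader.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.AbstractList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: one zip entry per page, read back one page at a time.
final class PagedArchive {

    // Writes each page as its own entry named page-0, page-1, ...
    static void write(Path zipPath, List<String> pages) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(zipPath))) {
            for (int i = 0; i < pages.size(); i++) {
                zip.putNextEntry(new ZipEntry("page-" + i));
                zip.write(pages.get(i).getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
    }

    // Returns a read-only list view; get(i) reads only entry i from the archive.
    // The caller keeps the ZipFile open for as long as the list is in use.
    static List<String> lazyPages(ZipFile zip) {
        final int count = zip.size();
        return new AbstractList<String>() {
            @Override
            public int size() {
                return count;
            }

            @Override
            public String get(int index) {
                if (index < 0 || index >= count) {
                    throw new IndexOutOfBoundsException("page " + index);
                }
                ZipEntry entry = zip.getEntry("page-" + index);
                try (InputStream in = zip.getInputStream(entry)) {
                    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                    byte[] chunk = new byte[8192];
                    int read;
                    while ((read = in.read(chunk)) != -1) {
                        buffer.write(chunk, 0, read);
                    }
                    return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        };
    }
}
```

Nothing is cached, so each page becomes garbage as soon as the consumer lets go of it. Whether the exporters actually walk a JASPER_PRINT_LIST one element at a time, rather than holding on to every element, is exactly what would need checking in the Jasper source.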

Iceboat answered 26/9, 2011 at 22:19 Comment(2)
Thanks for the suggestion. I think you hit the nail on the head with the deserialization. What I'd love to see from Jasper is an option to provide a JRVirtualizer during deserialization/export in order to do this automatically. I'm going to do some experimentation along this line and see what I can come up with. - Bergin
@Bergin A JRVirtualizer during deserialization is a lot of work for Jasper: it implies serializing .jrprints using Externalizable or a custom mechanism instead of Java Serializable, which is essentially automatic. While virtualizing deserialization looks like the right option, I suspect it would be easier for them to provide built-in support for some variation of my approach, with automatic page-by-page export into a zip and a lazy list implementation. - Iceboat
M
0

Thank you for your question and your own answer.

But I have a further question on your solution:

You said you use JRVirtualizationHelper.setThreadVirtualizer to associate the JRSwapFileVirtualizer instance with the current thread. But since your requirement is to export previously generated reports to PDF/CSV files, I assume the generate and export actions run in two separate threads, because they are probably triggered by two separate user clicks.

So how can a single JRSwapFileVirtualizer instance serve both threads? Are you using a single server JVM?

Malachi answered 11/2, 2022 at 4:1 Comment(0)
