Performance issue with MultiResourcePartitioner in Spring Batch
I have a Spring Batch project that reads a huge ZIP file containing more than 100,000 XML files.

I am using MultiResourcePartitioner, and I run into a memory issue: my batch fails with

java.lang.OutOfMemoryError: GC overhead limit exceeded.

It seems as if all the XML files are loaded into memory and never garbage-collected after processing.

Is there a more performant way to do this?

Thanks.

Race answered 5/8, 2016 at 15:49. Comments (5):
What memory settings are you currently using? – Atthia
I get this error with -Xms512m -Xmx1024m. With -Xms1024m -Xmx4096m I don't get the error, but heap usage grows to 2 GB, which seems like too much for 200,000 XML files of 4 KB each. – Race
With 200,000 files, do you really need/want one partition per file? You may want to consider writing your own Partitioner that groups files together into chunks. – Atthia
I process each file individually: each XML is unmarshalled, then processed, then written to an XML file. Can you explain further what you mean by a partitioner that groups files into chunks? – Race
The MultiResourcePartitioner creates one partition (and therefore one ExecutionContext and one StepExecution) per file. With 200,000 files, you may want to group them together so that you have fewer partitions. – Atthia
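
To illustrate the grouping idea, here is a minimal sketch of such a custom Partitioner (the class name GroupingResourcePartitioner and the "fileNames" context key are illustrative, not part of Spring Batch). Instead of one partition per file, it splits the resource list into at most gridSize groups:

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.core.io.Resource;

// Illustrative grouping partitioner: unlike MultiResourcePartitioner,
// which creates one partition per resource, this splits the resource
// list into at most gridSize groups, so only gridSize ExecutionContexts
// (and StepExecutions) are created.
public class GroupingResourcePartitioner implements Partitioner {

    private Resource[] resources = new Resource[0];

    public void setResources(Resource[] resources) {
        this.resources = resources;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> partitions = new HashMap<>();
        int total = resources.length;
        int perPartition = (total + gridSize - 1) / gridSize; // ceiling division

        for (int i = 0; i < gridSize; i++) {
            int start = i * perPartition;
            if (start >= total) {
                break; // fewer resources than gridSize
            }
            int end = Math.min(start + perPartition, total);

            // Store a comma-separated list of resource URLs; the worker
            // step can feed these into a step-scoped reader. The key name
            // "fileNames" is an assumption for this sketch.
            StringBuilder urls = new StringBuilder();
            for (int j = start; j < end; j++) {
                if (j > start) {
                    urls.append(',');
                }
                try {
                    urls.append(resources[j].getURL().toExternalForm());
                } catch (IOException e) {
                    throw new IllegalStateException("Cannot resolve resource URL", e);
                }
            }

            ExecutionContext context = new ExecutionContext();
            context.putString("fileNames", urls.toString());
            partitions.put("partition" + i, context);
        }
        return partitions;
    }
}

A step-scoped MultiResourceItemReader in the worker step can then resolve the "fileNames" value back into resources, so each partition streams its files sequentially. With 200,000 files and a gridSize of, say, 20, the job holds 20 ExecutionContexts instead of 200,000. If even a comma-joined URL list is too large to persist in the job repository, storing only start/end indices per partition and resolving the resources inside the reader keeps the contexts small.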
