How do you make your Java application memory efficient?

How do you optimize the heap size usage of an application that has a lot (millions) of long-lived objects? (big cache, loading lots of records from a db)

  • Use the right data type
    • Avoid java.lang.String to represent other data types
  • Avoid duplicated objects
    • Use enums if the values are known in advance
    • Use object pools
    • String.intern() (good idea?)
  • Load/keep only the objects you need

I am looking for general programming or Java specific answers. No funky compiler switch.

Edit:

Optimize the memory representation of a POJO that can appear millions of times in the heap.

Use cases

  • Load a huge csv file in memory (converted into POJOs)
  • Use Hibernate to retrieve millions of records from a database

Summary of answers:

  • Use flyweight pattern
  • Copy on write
  • Instead of loading 10M objects with 3 properties, is it more efficient to have 3 arrays (or other data structure) of size 10M? (Could be a pain to manipulate data but if you are really short on memory...)
Interjoin answered 25/4, 2009 at 15:26 Comment(3)
by writing your program in assembly... :)Frunze
I doubt that assembler would help - development time would be significantly longer and would not be cross platform. ;)Predictor
@Predictor I suspect Desmond was making a joke.Wilburnwilburt
T
18

You don't say what sort of objects you're looking to store, so it's a little difficult to offer detailed advice. However some (not exclusive) approaches, in no particular order, are:

  • Use a flyweight pattern wherever possible.
  • Caching to disc. There are numerous cache solutions for Java.
  • There is some debate as to whether String.intern is a good idea. See here for a question re. String.intern(), and the amount of debate around its suitability.
  • Make use of soft or weak references to store data that you can recreate/reload on demand. See here for how to use soft references with caching techniques.

Knowing more about the internals and lifetime of the objects you're storing would result in a more detailed answer.
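To illustrate the soft-reference suggestion above, here is a minimal sketch of a cache whose values the garbage collector may reclaim under memory pressure. The class name `SoftCache` and the `loader` function are hypothetical, and the sketch assumes values can be recreated on demand and that the loader never returns null.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: values are held through SoftReferences, so the GC
// may clear them when memory runs low; cleared values are reloaded lazily.
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumed to recreate the value for a key

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Never loaded, or cleared by the GC: reload and cache again.
            value = loader.apply(key);
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

Entries whose referent has been cleared stay in the map until the key is requested again; a fuller version would probably use a `ReferenceQueue` to purge them.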

Tarrant answered 25/4, 2009 at 15:34 Comment(0)
B
20

I suggest you use a memory profiler, see where the memory is being consumed, and optimise that. Without quantitative information you could end up changing things that either have no effect or actually make things worse.

You could look at changing the representation of your data, especially if your objects are small. For example, you could represent a table of data as a series of columns, with one array object per column, rather than one object per row. This can save a significant amount of per-object overhead if you don't need to represent an individual row. E.g. a table with 12 columns and 10,000,000 rows could use 12 objects (one per column) rather than 10 million (one per row).
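A column-per-array layout could look roughly like the sketch below. The class `PersonColumns` and its three fields are made up for illustration; the point is that each row costs only the bytes of its primitive values, with no per-row object header or reference.

```java
// Hypothetical example: three properties stored as three parallel arrays
// instead of one object per row. A row is identified by its index.
final class PersonColumns {
    private final int[] id;
    private final short[] age;
    private final byte[] status;

    PersonColumns(int rows) {
        id = new int[rows];
        age = new short[rows];
        status = new byte[rows];
    }

    void set(int row, int idValue, short ageValue, byte statusValue) {
        id[row] = idValue;
        age[row] = ageValue;
        status[row] = statusValue;
    }

    int id(int row) { return id[row]; }

    short age(int row) { return age[row]; }

    byte status(int row) { return status[row]; }

    int rows() { return id.length; }
}
```

The trade-off is the one noted in the question: code that wants to pass a "row" around has to pass an index, or build a temporary object on demand.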

Backboard answered 25/4, 2009 at 15:41 Comment(1)
I agree that a memory profiler is a good starting point for someone who does not know which Class instances are taking all the memory. The question is more, if I know in advance I will have 10M pojo#1 in memory, how do I minimize the consumption of each instance?Interjoin
A
11

Ensure good normalization of your object model, don't duplicate values.
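One way to avoid duplicated values is to route every value through a canonicalizing pool, so that equal values share a single instance. The sketch below is an assumption about how that could be done, not a standard API: `Canonicalizer` is a hypothetical name, and it only makes sense for immutable values with correct `equals` and `hashCode`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical pool that returns one shared instance per distinct value.
// Not thread-safe; the pool itself keeps every distinct value alive.
final class Canonicalizer<T> {
    private final Map<T, T> pool = new HashMap<>();

    T canonical(T value) {
        if (value == null) {
            return null;
        }
        // putIfAbsent returns the instance already pooled, or null if new.
        T existing = pool.putIfAbsent(value, value);
        return (existing != null) ? existing : value;
    }

    int distinct() {
        return pool.size();
    }
}
```

When loading records, each field value would be passed through `canonical(...)` before being stored, so a million rows with the same city name end up referencing one String.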

Ahem, and, if it's only millions of objects I think I'd just go for a decent 64 bit VM and lots of ram ;)

Aurore answered 25/4, 2009 at 15:55 Comment(4)
Which is quite possibly the most cost-effective solution :-)Tarrant
Great answer. Using caches of data and reducing duplicate records and fields is a major saver.Predictor
How do you minimize the number of duplicated values? Original question mentions usage of Enum, String.intern, object pools. How would you ensure that values are not duplicated?Interjoin
@Interjoin There may be combinations (subsets) of values that are duplicate.Aurore
A
4

Normal "profilers" won't help you much, because you need an overview of all your "live" objects. You need a heap dump analyzer. I recommend the Eclipse Memory Analyzer.

Check for duplicated objects, starting with Strings. Check whether you can apply patterns like flyweight, copy-on-write, and lazy initialization (Google will be your friend).

Audryaudrye answered 25/4, 2009 at 21:20 Comment(0)
A
3

Take a look at this presentation linked from here. It lays out the memory use of common Java objects and primitives and helps you understand where all the extra memory goes.

Building Memory-efficient Java Applications: Practices and Challenges

Andrew answered 16/9, 2011 at 19:28 Comment(0)
S
2

You could just store fewer objects in memory. :) Use a cache that spills to disk or use Terracotta to cluster your heap (which is virtual) allowing unused parts to be flushed out of memory and transparently faulted back in.
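As a rough sketch of "store fewer objects", the class below bounds an in-memory cache and evicts the least recently used entry. It does not spill to disk; `LruCache` is a hypothetical name, and the eviction hook is simply where a real cache might write the entry out before dropping it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache built on LinkedHashMap's access-order mode.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order (LRU first)
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true drops the eldest entry.
        // A disk-backed cache could persist 'eldest' here before eviction.
        return size() > maxEntries;
    }
}
```

Like `LinkedHashMap` itself, this is not thread-safe, so concurrent use would need external synchronization.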

Stockholder answered 25/4, 2009 at 22:57 Comment(0)
M
1

I want to add something to the point Peter already made (I can't comment on his answer :( ): it's always better to use a memory profiler (search for Java memory profilers) than to go by intuition. 80% of the time the problem is in routine code that we ignore. Also, collection classes are more prone to memory leaks.

Mullein answered 25/4, 2009 at 18:46 Comment(0)
C
1

If you have millions of Integers and Floats etc. then see if your algorithms allow for representing the data in arrays of primitives. That means fewer references and lower CPU cost of each garbage collection.
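For instance, a growable list of `int` can be backed directly by an `int[]`, so each element costs 4 bytes instead of a reference plus a boxed `Integer` object. The class `IntList` below is a hypothetical minimal sketch; it assumes far fewer than 2^30 elements and does not guard against capacity overflow.

```java
import java.util.Arrays;

// Hypothetical minimal primitive list: no boxing, one backing array.
final class IntList {
    private int[] data = new int[16];
    private int size;

    void add(int value) {
        if (size == data.length) {
            // Double the capacity (at least 16) and copy existing elements.
            data = Arrays.copyOf(data, Math.max(16, data.length * 2));
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    long sum() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += data[i];
        }
        return total;
    }

    // Release unused capacity once loading is finished.
    void trimToSize() {
        data = Arrays.copyOf(data, size);
    }
}
```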

Carpophagous answered 29/4, 2009 at 9:20 Comment(0)
T
0

A fancy one: keep most data compressed in ram. Only expand the current working set. If your data has good locality that can work nicely.
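A minimal sketch of the compressed-in-RAM idea, assuming the data can be grouped into text blocks that are expanded only while in use. `CompressedBlock` is a hypothetical name and GZIP is just one possible codec from the standard library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical holder that keeps a block of text GZIP-compressed in memory.
final class CompressedBlock {
    private final byte[] compressed;

    CompressedBlock(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Read the bytes only after the GZIP stream is closed and flushed.
        compressed = buffer.toByteArray();
    }

    // Decompress on demand; the caller should drop the result when done.
    String expand() {
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int compressedSize() {
        return compressed.length;
    }
}
```

Whether this pays off depends on how compressible the data is and how often blocks are expanded, since every access to a cold block costs CPU time.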

Use better data structures. The standard collections in java are rather memory intensive.

[what is a better data structure]

  • If you take a look at the source for the collections, you'll see that if you restrict yourself in how you access the collection, you can save space per element.
  • The way the collections handle growth is no good for large collections: too much copying. For large collections, you need some block-based structure, like a B-tree.
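To make the block-based growth point concrete, here is a sketch of a chunked list of `long` values. It is a simpler structure than a B-tree (swapped in here for brevity): it grows by allocating one new fixed-size chunk at a time, so existing elements are never copied. The name `ChunkedLongList` and the chunk size are arbitrary choices for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical append-only list that grows one fixed-size chunk at a time.
final class ChunkedLongList {
    private static final int CHUNK_BITS = 16;              // 65,536 elements per chunk
    private static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    private static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private final List<long[]> chunks = new ArrayList<>();
    private long size;

    void add(long value) {
        int offset = (int) (size & CHUNK_MASK);
        if (offset == 0) {
            // Current chunk is full (or none exists yet): allocate a new one.
            chunks.add(new long[CHUNK_SIZE]);
        }
        chunks.get((int) (size >>> CHUNK_BITS))[offset] = value;
        size++;
    }

    long get(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return chunks.get((int) (index >>> CHUNK_BITS))[(int) (index & CHUNK_MASK)];
    }

    long size() {
        return size;
    }
}
```

Only the small list of chunk references is ever reallocated, and at most one partially filled chunk is wasted, compared with up to half the backing array in a doubling scheme.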
Tamishatamma answered 28/4, 2009 at 21:7 Comment(1)
How would you define better data structures? How would you implement that?Interjoin
D
0

Spend some time getting acquainted with and tuning the VM command line options, especially those concerning garbage collection. While this won't change the memory used by your objects, it can have a big impact on performance with memory-intensive apps on machines with a lot of RAM.

Diatonic answered 29/4, 2009 at 8:19 Comment(0)
O
0
  1. Assign null to variables that are no longer used, so that the objects they reference become eligible for garbage collection.
  2. Drop references to collections once you are done with them; otherwise the GC cannot reclaim them.
Omura answered 6/4, 2010 at 6:38 Comment(1)
I disagree with item 1. I would just let the GC do what it is supposed to do. There are only a few cases (arrays, collections) where this could be useful, not all variables. #449909Interjoin
P
0

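1) Use the right data types wherever possible

class Person {
    int age;
    int status;
}

Here we can use smaller field types to reduce the memory used by each Person object. The actual saving depends on the JVM's field layout, since object sizes are typically rounded up to a multiple of 8 bytes.

class Person {
    short age;
    byte status;
}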

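2) Instead of returning new ArrayList<>() from a method, you can return Collections.emptyList(). It returns an immutable empty list and does not need to create a separate list object on each call. Callers must not try to modify the result, and the method's return type has to be List rather than ArrayList.

For example:

// Element type String is only illustrative
public List<String> getResults() {
    .....
    if (failedOperation)
        return new ArrayList<>();
}
// Use this instead
public List<String> getResults() {
    .....
    if (failedOperation)
        return Collections.emptyList();
}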

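3) Prefer creating objects inside methods rather than holding them in static fields wherever possible. An object reachable only from local variables becomes eligible for garbage collection once the method returns, whereas an object referenced from a static field stays on the heap for as long as its class is loaded. (The objects themselves still live on the heap; only the local references are on the stack.)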

4) Use binary formats such as protobuf, Thrift, Avro, or MessagePack instead of JSON or XML to reduce the size of data exchanged between services.

Promethean answered 23/12, 2019 at 10:53 Comment(0)
