Hibernate associations using too much memory
I have a table "class" which is linked to the tables "student" and "teacher". A "class" is linked to multiple students and teachers via a foreign key relationship.

When I use Hibernate associations and fetch a large number of entities (I tried 5000), I see it taking 4 times more memory than if I just use foreign key placeholders. Is there something wrong with the Hibernate associations?

Can I use a memory profiler to figure out what is using so much memory?

This is how the schema is:

class(id,className) 

student(id,studentName,class_id)
teacher(id,teacherName,class_id)

class_id is a foreign key.

Case #1 - Hibernate Associations

1) In the Class entity, students and teachers are mapped as:

    @Entity
    @Table(name="class")
    public class Class {

        private Integer id;
        private String className;

        private Set<Student> students = new HashSet<Student>();
        private Set<Teacher> teachers = new HashSet<Teacher>();

        @OneToMany(fetch = FetchType.EAGER, mappedBy = "classRef")
        @Cascade({ CascadeType.ALL })
        @Fetch(FetchMode.SELECT)
        @BatchSize(size=500)
        public Set<Student> getStudents() {
            return students;
        }

        // remaining members are not shown in the post
    }

2) In Student and Teacher, the class is mapped as:

    @Entity
    @Table(name="student")
    public class Student {

        private Integer id;
        private String studentName;
        private Class classRef;

        @ManyToOne
        @JoinColumn(name = "class_id")
        public Class getClassRef() {
            return classRef;
        }

        // remaining members are not shown in the post
    }

Query used:

    sessionFactory.openSession().createQuery("from Class where id<5000");

This, however, was taking a huge amount of memory.

Case #2 - Remove associations and fetch separately

1) No mapping in the Class entity:

    @Entity
    @Table(name="class")
    public class Class {

        private Integer id;
        private String className;

        // remaining members are not shown in the post
    }
2) Only a placeholder for the foreign key in Student and Teacher:

    @Entity
    @Table(name="student")
    public class Student {

        private Integer id;
        private String studentName;
        private Integer class_id;

        // remaining members are not shown in the post
    }

Queries used:

    sessionFactory.openSession().createQuery("from Class where id<5000");
    sessionFactory.openSession().createQuery("from Student where class_id = :classId");
    sessionFactory.openSession().createQuery("from Teacher where class_id = :classId");

Note: only the important parts of the code are shown. I am measuring the memory usage of the fetched entities with the JAMM library.

I also tried marking the query as read-only in case #1, as shown below. This improves memory usage only very slightly, so it is not the solution.

    // Keep a reference to the session so that the same session is closed afterwards
    Session session = sessionFactory.openSession();
    Query query = session.createQuery("from Class where id<5000");

    query.setReadOnly(true);
    List<Class> classList = query.list();
    session.close();

Below are the heap dump snapshots sorted by size. It looks like the entities maintained by Hibernate are creating the problem.

[Image: Snapshot of heap dump for the Hibernate associations program]

[Image: Snapshot of heap dump for fetching using separate entities]

Amylolysis answered 10/2, 2016 at 19:13 Comment(3)
In case #2, I think the queries should be "from Student where class_id < 5000" instead of "from Student where class_id = :classId", to mirror case #1 with separate queries. The same applies to the Teacher select query. Can you post the memory observations with these changes? - Dyaus
Well, all the fetched classes go into a collection, and then I iterate over that collection and execute the query "from Student where class_id = :classId" for each one. So it's more or less the same. I don't see why that would explain this memory analysis. - Amylolysis
I agree. I wanted to make sure that I got the problem statement right. - Dyaus
I
7

You are doing an EAGER fetch with the annotation below. This fetches all the students even if you never call getStudents(). Make it lazy and the students will be fetched only when needed.

From

    @OneToMany(fetch = FetchType.EAGER, mappedBy = "classRef")

To

    @OneToMany(fetch = FetchType.LAZY, mappedBy = "classRef")
Impious answered 10/2, 2016 at 19:17 Comment(5)
The same is being done in case #2 without using Hibernate utilities. That takes 4 times less memory. - Amylolysis
Lazy loading is the key feature of Hibernate. Use it if you don't really need the collections (child objects) at the same time as you fetch the parent objects. - Thinking
I want "all" the collections. I don't see how lazy loading solves the problem: in both cases I fetch all the child objects. Only the Hibernate way takes too much memory, and that is the problem. So lazy loading is not going to solve it. - Amylolysis
Having eager fetching for @OneToMany is not a best practice, as this sets eager fetch as the default behavior for this association for the entire application. Make it lazy in the entity class and use FetchMode (join/select) in your Criteria if you really need eager fetching. - Externalism
@Externalism: I know lazy fetching is not going to load everything in one shot and hence it will reduce memory. But the problem is that in case #2 I do load everything, and that takes less memory than case #1. Since both of them load pretty much the same data, there should not be a noticeable difference. I hope that clears up why I did not accept this as an answer. - Amylolysis
P
4

When Hibernate loads a Class entity containing OneToMany relationships, it replaces the collections with its own custom version of them. In the case of a Set, it uses a PersistentSet. As can be seen on grepcode, this PersistentSet object contains quite a bit of stuff, much of it inherited from AbstractPersistentCollection, to help Hibernate manage and track things, particularly dirty checking.

Among other things, the PersistentSet contains a reference to the session, a boolean to track whether it's initialized, a list of queued operations, a reference to the Class object that owns it, a string describing its role (not sure what exactly that's for, just going by the variable name here), the string uuid of the session factory, and more. The biggest memory hog among the lot is probably the snapshot of the unmodified state of the set, which I would expect to approximately double memory consumption by itself.

There's nothing wrong here; Hibernate is just doing more than you realized, and in more complex ways. It shouldn't be a problem unless you are severely short on memory.

Note, incidentally, that when you save a new Class object that Hibernate previously was unaware of, Hibernate will replace the simple HashSet objects you created with new PersistentSet objects, storing the original HashSet wrapped inside the PersistentSet in its set field. All Set operations will be forwarded to the wrapped HashSet, while also triggering PersistentSet dirty tracking and queuing logic, etc. With that in mind, you should not keep and use any external references to the Set from before saving, and should instead fetch a new reference to Hibernate's PersistentSet instance and use that if you need to make any changes (to the set, not to the students or teachers within it) after the initial save.
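As a rough illustration of the wrapping described above, here is a minimal plain-Java sketch. It is not Hibernate's PersistentSet: TrackedSet and its methods are hypothetical names made up for this example, and the real class tracks far more. It only shows why a wrapper like this carries extra state (a snapshot and a dirty flag) on top of the wrapped set, and why changes made through an old reference to the wrapped HashSet go unnoticed.

```java
import java.util.HashSet;
import java.util.Set;

public class TrackedSetSketch {

    /** Toy wrapper (hypothetical): forwards to a wrapped set, keeps a snapshot, flags changes. */
    static final class TrackedSet<T> {
        private final Set<T> wrapped;   // the caller's original set, e.g. a HashSet
        private final Set<T> snapshot;  // copy of the contents taken at wrap time
        private boolean dirty;

        TrackedSet(Set<T> wrapped) {
            this.wrapped = wrapped;
            this.snapshot = new HashSet<>(wrapped);
        }

        /** Forwards to the wrapped set and remembers that something changed. */
        boolean add(T element) {
            boolean changed = wrapped.add(element);
            dirty |= changed;
            return changed;
        }

        boolean isDirty() { return dirty; }
        int size() { return wrapped.size(); }
        int snapshotSize() { return snapshot.size(); }
    }

    public static void main(String[] args) {
        Set<String> original = new HashSet<>();
        original.add("alice");
        TrackedSet<String> tracked = new TrackedSet<>(original);

        // Change made through the old reference: the contents change, the flag does not.
        original.add("carol");
        System.out.println("size=" + tracked.size() + " dirty=" + tracked.isDirty());

        // Change made through the wrapper: the flag is set.
        tracked.add("bob");
        System.out.println("size=" + tracked.size() + " dirty=" + tracked.isDirty());
        System.out.println("snapshot size=" + tracked.snapshotSize());
    }
}
```

In this toy model the snapshot is a second set holding the same element references, so the per-collection bookkeeping grows with the collection size even though the elements themselves are not copied.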

Potbellied answered 21/3, 2016 at 0:29 Comment(6)
Hmm, I really hope there's a way to disable it. I think we can work without those features in most of our cases. It's a concern because we have a case with over 3 lakh (300,000) entities and it's eating up too much memory. - Amylolysis
@Amylolysis Which "those features"? Change detection? You always need that unless you're doing read-only operations, in which case setting the read-only flag on either the session or the query might help. - Potbellied
Yeah, the unmodified state, dirty checking, etc. I will try setting read-only and see what happens. - Amylolysis
Does this happen only in one-to-many relationships or every time? If so, this would happen in case #2 as well. I did try setting readOnly in case #1; it does not help very much, just a very small improvement. I am updating the question to show the change that I made. - Amylolysis
@Amylolysis This happens for fields/relationships of collection types, so one-to-many or many-to-many. Making it a unidirectional many-to-one (Class as in case #2, Student as in case #1) should use about the same memory as case #2 as is, I think, possibly a little less due to not duplicating the id number in memory. Also, I'm not sure if it would make any difference, but have you tried making the session, rather than the query, read-only? And are you sure the read-only test query is the first time the entities are loaded, so there's no possibility of pulling from cached non-read-only copies? - Potbellied
I am sure the read-only query loads the objects for the very first time. I am using two standalone Java programs for case #1 and case #2 for the analysis. It looks like Hibernate versions before 3.5 do not support session.setDefaultReadOnly(), so I will try to upgrade my programs to a Hibernate version that does. - Amylolysis
D
2

Regarding the huge memory consumption you are noticing, one potential reason is that the Hibernate Session has to maintain the state of each entity it has loaded in the form of an EntityEntry object, i.e. one extra object, an EntityEntry, for each loaded entity. This is needed by Hibernate's automatic dirty checking mechanism, which during the flush stage compares the current state of the entity with its original state (the one stored in the EntityEntry).

Note that this EntityEntry is different from the object that we get to access in our application code when we call session.load/get/createQuery/createCriteria. This is internal to hibernate and stored in the first level cache.

Quoting from the Javadocs for EntityEntry:

We need an entry to tell us all about the current state of an object with respect to its persistent state. Implementation Warning: Hibernate needs to instantiate a high amount of instances of this class, therefore we need to take care of its impact on memory consumption.

One option, assuming the intent is only to read and iterate through the data and not to change those entities, is to use StatelessSession instead of Session.

The advantage as quoted from Javadocs for Stateless Session:

A stateless session does not implement a first-level cache nor interact with any second-level cache, nor does it implement transactional write-behind or automatic dirty checking

With no automatic dirty checking, there is no need for Hibernate to create an EntityEntry for each loaded entity as it did in the earlier case with Session. This should reduce the pressure on memory.
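To make the bookkeeping cost concrete, here is a minimal plain-Java sketch of the idea. It does not use Hibernate, and TrackingContext, StatelessLoader and Entry are hypothetical names invented for this illustration. The tracking context keeps one extra entry per loaded entity, holding a copy of the loaded field values for later comparison, while the stateless loader hands out entities and remembers nothing.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class EntryOverheadSketch {

    /** Plain entity standing in for the question's Student. */
    static final class Student {
        final int id;
        String name;
        Student(int id, String name) { this.id = id; this.name = name; }
    }

    /** Toy per-entity bookkeeping entry (hypothetical): a copy of the loaded field values. */
    static final class Entry {
        final Object[] loadedState;
        Entry(Object[] loadedState) { this.loadedState = loadedState; }
    }

    /** Toy stateful context: one Entry per loaded entity, used for dirty checking. */
    static final class TrackingContext {
        private final Map<Student, Entry> entries = new IdentityHashMap<>();

        Student load(int id, String name) {
            Student s = new Student(id, name);
            entries.put(s, new Entry(new Object[] { id, name }));
            return s;
        }

        /** Compares the current name with the value recorded at load time. */
        boolean isDirty(Student s) {
            Object[] loaded = entries.get(s).loadedState;
            return !loaded[1].equals(s.name);
        }

        int trackedCount() { return entries.size(); }
    }

    /** Toy stateless loader: no entries, so no dirty checking either. */
    static final class StatelessLoader {
        Student load(int id, String name) { return new Student(id, name); }
    }

    public static void main(String[] args) {
        TrackingContext ctx = new TrackingContext();
        List<Student> students = new ArrayList<>();
        for (int i = 0; i < 5000; i++) {
            students.add(ctx.load(i, "student-" + i));
        }
        students.get(0).name = "renamed";
        System.out.println("tracked entries: " + ctx.trackedCount());
        System.out.println("student 0 dirty: " + ctx.isDirty(students.get(0)));
        System.out.println("student 1 dirty: " + ctx.isDirty(students.get(1)));
    }
}
```

The trade-off is the same one described above: the extra entry per entity is what makes change detection possible, so dropping it only makes sense when the data is read and never modified.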

That said, it does have its own set of limitations, as mentioned in the StatelessSession Javadoc.

One limitation worth highlighting is that it doesn't support lazy loading of collections. If we are using StatelessSession and want to load the associated collections, we should either join fetch them using HQL or EAGER fetch them using Criteria.

Another is that it doesn't interact with any second-level cache.

So given that it doesn't have any overhead of first-level cache, you may want to try with Stateless Session and see if that fits your requirement and helps in reducing the memory consumption as well.

Dyaus answered 23/3, 2016 at 16:15 Comment(3)
Makes sense. But it looks like stateless sessions can't work with collections. It fails with the error below: "collections cannot be fetched by a stateless session" - Amylolysis
@Amylolysis That is one of the limitations of StatelessSession. As I mentioned in the answer, you should be able to overcome it: if we are using StatelessSession and want to load the associated collections, we should either join fetch them using HQL or EAGER fetch them using Criteria. - Dyaus
Updated my question to show heap dump data. It looks like EntityEntry is definitely playing a role. I will try your suggestion of using Criteria. - Amylolysis
A
0

Yes, you can use a memory profiler, such as VisualVM or YourKit, to see what takes so much memory. One way is to get a heap dump and then load it into one of these tools.

However, you also need to make sure that you compare apples to apples. Your queries in case #2

    sessionFactory.openSession().createQuery("from Student where class_id = :classId");
    sessionFactory.openSession().createQuery("from Teacher where class_id = :classId");

select students and teachers for only one class, while in case #1 you select far more. You need to use <= :classId instead.
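One way to check that two fetch strategies really load the same rows is to compare their results directly. The sketch below is plain Java over an in-memory list, not Hibernate; StudentRow, perClassLoop and singleRangeQuery are hypothetical names for this illustration. It models a per-class query run once for every fetched class against a single range query grouped by class id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class FetchShapeSketch {

    /** In-memory stand-in for a row of the student table. */
    static final class StudentRow {
        final int id;
        final int classId;
        StudentRow(int id, int classId) { this.id = id; this.classId = classId; }
    }

    /** Models "from Student where class_id = :classId" executed once per class id. */
    static Map<Integer, List<Integer>> perClassLoop(List<StudentRow> table, List<Integer> classIds) {
        Map<Integer, List<Integer>> result = new TreeMap<>();
        for (int classId : classIds) {
            List<Integer> ids = new ArrayList<>();
            for (StudentRow row : table) {
                if (row.classId == classId) ids.add(row.id);
            }
            if (!ids.isEmpty()) result.put(classId, ids);
        }
        return result;
    }

    /** Models a single "class_id < limit" query whose rows are grouped by class id in memory. */
    static Map<Integer, List<Integer>> singleRangeQuery(List<StudentRow> table, int limit) {
        return table.stream()
                .filter(row -> row.classId < limit)
                .collect(Collectors.groupingBy(row -> row.classId, TreeMap::new,
                        Collectors.mapping(row -> row.id, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<StudentRow> table = List.of(
                new StudentRow(1, 1), new StudentRow(2, 1),
                new StudentRow(3, 2), new StudentRow(4, 7));
        Map<Integer, List<Integer>> looped = perClassLoop(table, List.of(1, 2));
        Map<Integer, List<Integer>> ranged = singleRangeQuery(table, 3);
        System.out.println(looped);
        System.out.println(ranged);
        System.out.println("same rows: " + looped.equals(ranged));
    }
}
```

If the per-class query is executed for every class returned by the first query, both shapes end up with the same student rows, and any remaining memory difference has to come from somewhere else.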

In addition, it is a little strange that you need one student and one teacher record per one class. A teacher can teach more than one class and a student can be in more than one class. I do not know what exact problem you're solving but if indeed a student can participate in many classes and a teacher can teach more than one class, you will probably need to design your tables differently.

Anatol answered 21/3, 2016 at 18:12 Comment(2)
All the fetched classes go into a collection, and then I iterate over that collection and execute the query "from Student where class_id = :classId" for each one. So it's more or less the same. I don't see why that would explain this memory analysis. Regarding the association, I have a one-to-many between class and students/teachers. Yes, ideally one student can be in many classes, but this is just an example use case to demonstrate the actual problem we are facing (with the same association scheme). - Amylolysis
Did you try to get a heap dump and then look at it? This can shed some light on what objects are loaded. BTW, this looks like this issue: #1995580 - Anatol
E
0

Try @Fetch(FetchMode.JOIN). This generates only one query instead of multiple select queries. Also review the generated queries. I prefer using Criteria over HQL (just a thought).

For profiling, use free tools like VisualVM or JConsole. YourKit is good for advanced profiling, but it is not free. I guess there is a trial version of it.

You can take the heapdump of your application and analyze it with any memory analyzer tools to check for any memory leaks.

BTW, I am not exactly sure about the memory usage for current scenario.

Externalism answered 23/3, 2016 at 9:58 Comment(0)
M
0

It's likely the reason is the bidirectional link from Student to Class and from Class to Students. When you fetch Class A (id 4900), the Class object must be hydrated, which in turn pulls all the Student objects (and presumably teachers) associated with this class. Each Student object must then be hydrated, which causes a fetch of every class the student is part of. So although you only wanted Class A, you end up with:

Fetch Class A (id 4900)
Returns Class A with references to 3 students: Student A, B, C
Student A has references to Class A and Class B (id 5500)
Class B needs hydrating
Class B has references to Students C and D
Student C needs hydrating
Student C only has references to Class A and B
Student C hydration complete
Student D needs hydrating
Student D only has a reference to Class B
Student D hydration complete
Class B hydration complete
Student B needs hydrating (from the original load of Class A)

etc. With eager fetching, this continues until all links are hydrated. The point is that it's possible to end up with classes in memory that you didn't actually want, or whose id is not less than 5000.
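The chain above amounts to a reachability walk over the class/student links. Here is a minimal plain-Java sketch of that walk. It is not Hibernate code: reachableClasses and the two lookup maps are hypothetical, and it assumes a model in which a student can belong to several classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class EagerReachabilitySketch {

    /**
     * Returns every class id reachable from startClassId by following
     * class -> students -> classes links until nothing new turns up.
     */
    static Set<Integer> reachableClasses(int startClassId,
                                         Map<Integer, List<String>> studentsByClass,
                                         Map<String, List<Integer>> classesByStudent) {
        Set<Integer> seen = new LinkedHashSet<>();
        Deque<Integer> pending = new ArrayDeque<>();
        pending.add(startClassId);
        while (!pending.isEmpty()) {
            int classId = pending.poll();
            if (!seen.add(classId)) {
                continue; // already visited this class
            }
            for (String student : studentsByClass.getOrDefault(classId, List.of())) {
                for (Integer other : classesByStudent.getOrDefault(student, List.of())) {
                    if (!seen.contains(other)) {
                        pending.add(other);
                    }
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Class 4900 has students A, B, C; class 5500 has students C, D.
        Map<Integer, List<String>> studentsByClass = Map.of(
                4900, List.of("A", "B", "C"),
                5500, List.of("C", "D"));
        // Student C is in both classes, which links 4900 to 5500.
        Map<String, List<Integer>> classesByStudent = Map.of(
                "A", List.of(4900),
                "B", List.of(4900),
                "C", List.of(4900, 5500),
                "D", List.of(5500));
        System.out.println(reachableClasses(4900, studentsByClass, classesByStudent));
    }
}
```

Starting from class 4900, the walk also reaches class 5500 even though its id is not below 5000. If every student belongs to exactly one class, the walk stops at the starting class.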

This could get worse fast.

Also, you should make sure you are overriding the hashCode and equals methods. Otherwise you may be getting redundant objects, both in memory and in your set.

One way to improve this is either to change to LAZY loading, as others have mentioned, or to break the bidirectional links. If you know you will only ever access students per class, then don't have the link from student back to class. For the student/class example it makes sense to have the bidirectional link, but maybe it can be avoided.

Mussman answered 23/3, 2016 at 13:5 Comment(1)
Well, in my case a student is part of only one class, and the same applies to a teacher. I know that doesn't make sense in a student/teacher scenario, but this is just a representation of the actual issue we are facing. - Amylolysis
C
0

As you say, you "want 'all' the collections", so lazy loading won't help. Do you need every field of every entity? If not, use a projection to get just the bits you want. See "when to use Hibernate Projections". Alternatively, consider having minimalist Teacher-Lite and Student-Lite entities that the full-fat versions extend.

Cottar answered 23/3, 2016 at 14:23 Comment(0)
