Hibernate Subselect vs Batch Fetching
Asked Answered
C

2

15

Hibernate provides (at least) two options for getting around the N+1 query problem. The one is setting the FetchMode to Subselect, which generates a select with a IN-clause and a subselect within this IN-clause. The other is to specify a BatchSize, which generates a select with a IN-clause containing the parents' IDs.

Both work but I find that the Subselect option often runs into performance problems due to the query for the parents being complex. On the other hand, with a large BatchSize (say 1000), the number of queries and complexity of those queries are very small.

My question is thus: when would you use Hibernate's Subselect FetchMode over BatchSize? Subselect probably makes sense if you have a very large number of parent entries (thousands), but are there any other scenarios where you'd prefer a Subselect to BatchSize?

EDIT: I noticed a difference between the two when dealing with eager loading. If you have an xToMany association set to be loaded eagerly and through a subselect, it generates a subselect like it would if it was lazy. If you specify a BatchSize however, the generated query makes use of a outer join instead of a seperate query. Is there any way to force Hibernate to use a seperate batched query when loading eagerly?

Ci answered 30/8, 2011 at 5:42 Comment(0)
G
14

I don't use subselect, because it is hard to control. In a very large system which has complex business logic and a large team working on it, it is too hard to say which queries are used. Subselect may work in specific cases where you exactly know which query is performed.

Batch fetching has some big advantages. It is not always the fastest, but usually fast enough. On the other hand it is very stable, doesn't have any side effects and is completely transparent to the business logic. I never use batch values higher then 100. It is sufficient to reduce the N+1 to some reasonable amount of queries.

Gurtner answered 30/8, 2011 at 6:28 Comment(3)
I don't understand why subselect is hard to control. Can you throw some light?Oily
Subselect depends on the query that had been executed before. This query may be very complex, e.g. using a lot of other tables and filtering by non-indexed columns. It is therefore hard to say if the subquery approach is likely to enhance performance.Gurtner
Another problem with subselect can crop up with MySQL; MySQL (5.5 and below) has dreadful performance with nested queries, because it forcibly makes them correlated, and reevaluates them for every row in the parent query. I can't find any other way for Hibernate to generate a nested query for annotated relations, so avoiding subselect will prevent a nasty surprise with MySQL.Epergne
V
2

I found this article to be helpful. I believe batch-fetching can be applied on both the collection and the parent, while subselect can only be applied on a collection.

In case of a fetching strategy for collections, a subselect will be executed once (because the batch-size is effectively infinity), while with batch-fetching the SQL statement might be executed multiple times.

Versed answered 29/6, 2012 at 12:11 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.