How to Eager Load Associations without duplication in NHibernate?
I need to load a list of very large objects that have many children and children of children. What is the best approach to take?

I'm using an Oracle 11g database and I've written the method below, but it results in a Cartesian product (duplicated results):

public IList<ARNomination> GetByEventId(long eventId)
{
    var session = this._sessionFactory.Session;

    var nominationQuery = session.Query<ARNomination>().Where(n => n.Event.Id == eventId);

    using (var trans = session.Transaction)
    {
        trans.Begin();

        // this will load the Contacts in one statement
        nominationQuery
            .FetchMany(n => n.Contacts)
            .ToFuture();

        // this will load the CustomAttributes in one statement
        nominationQuery
            .FetchMany(n => n.CustomAttributes)
            .ToFuture();

        // this will load the nominations but joins those two tables in one statement which results in cartesian product
        nominationQuery
            .FetchMany(n => n.CustomAttributes)
            .FetchMany(n => n.Contacts)
            .ToFuture();

        trans.Commit();
    }

    return nominationQuery.ToList();
}
Referendum answered 7/1, 2014 at 11:39 Comment(1)
Possible duplicate of How to get a distinct result with nHibernate and QueryOver API? - Kath

Fetching collections is a difficult operation. It has many side effects (as you found out once more than one collection is fetched). But even when fetching a single collection, we load many duplicated rows.
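
To make the duplication concrete, here is a small illustration in plain LINQ to Objects (no NHibernate involved; the sample values are made up). It mimics what the database returns when one root row is joined to two child collections in a single statement, compared with loading each collection in its own query:

```csharp
using System;
using System.Linq;

public static class CartesianDemo
{
    public static void Main()
    {
        // Hypothetical children of ONE nomination.
        var contacts = new[] { "c1", "c2", "c3" };
        var attributes = new[] { "a1", "a2", "a3", "a4" };

        // One statement joining both collections: every contact row is
        // paired with every attribute row for the same root.
        var joinedRows = (from c in contacts
                          from a in attributes
                          select new { c, a }).Count();

        // Two separate statements, one per collection.
        var separateRows = contacts.Length + attributes.Length;

        Console.WriteLine(joinedRows);   // 12 (3 * 4)
        Console.WriteLine(separateRows); // 7  (3 + 4)
    }
}
```

The joined row count grows with the product of the collection sizes, while the separate queries grow with their sum.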

In general, for loading collections, I would suggest using batch fetching. This executes more SQL queries, but not many more, and, more importantly, you can do paging on the root list of ARNomination.

See 19.1.5. Using batch fetching for more details.

You have to mark your collections and/or entities with the attribute batch-size="25".

xml:

<bag name="Contacts" ... batch-size="25">
...

fluent:

HasMany(x => x.Contacts)
  ...
  .BatchSize(25)
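
A fuller sketch of the fluent variant, together with a paged query that relies on it. Only ARNomination, Event, Contacts and CustomAttributes come from the question; the Contact, CustomAttribute and AREvent classes, the map class and the NominationQueries helper are assumptions for illustration, and the maps for the child entities are left out:

```csharp
using System.Collections.Generic;
using System.Linq;
using FluentNHibernate.Mapping;
using NHibernate;
using NHibernate.Linq;

// Hypothetical entity shapes (members must be virtual for lazy loading).
public class AREvent { public virtual long Id { get; set; } }
public class Contact { public virtual long Id { get; set; } }
public class CustomAttribute { public virtual long Id { get; set; } }

public class ARNomination
{
    public virtual long Id { get; set; }
    public virtual AREvent Event { get; set; }
    public virtual IList<Contact> Contacts { get; set; }
    public virtual IList<CustomAttribute> CustomAttributes { get; set; }
}

public class ARNominationMap : ClassMap<ARNomination>
{
    public ARNominationMap()
    {
        Id(x => x.Id);
        References(x => x.Event);

        // Collections stay lazy; batch-size only changes HOW they are
        // loaded once one of them is touched.
        HasMany(x => x.Contacts).BatchSize(25);
        HasMany(x => x.CustomAttributes).BatchSize(25);
    }
}

public static class NominationQueries
{
    // No Fetch/FetchMany here: the root query stays flat, so Skip/Take
    // page over nominations rather than over joined rows.
    public static IList<ARNomination> GetPage(
        ISession session, long eventId, int pageIndex, int pageSize)
    {
        return session.Query<ARNomination>()
            .Where(n => n.Event.Id == eventId)
            .OrderBy(n => n.Id)
            .Skip(pageIndex * pageSize)
            .Take(pageSize)
            .ToList();
    }
}
```

With this mapping, touching Contacts on one loaded nomination should let NHibernate initialize that collection for up to 25 of the nominations already in the session with a single select, instead of one select per nomination.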

Please check a few of the arguments here:

Ecclesiastes answered 7/1, 2014 at 11:46 Comment(5)
I need to load a list of very large objects that have many children and children of children. What is the best approach to take? - Referendum
I am telling you from my experience... I tried many ways... but the best is simply to use batching. Let's think about it this way: with batching, we keep all the entities light (I mean they reference lists and other entities lazily). But once we need a list of such entities, with many related collections etc., the batch-size will reduce the number of SQL queries, while we are still asking for one simple object. I tried other ways, fetching, not-lazy, etc., but in the end they break all the flexibility. We use batch-size in our mappings everywhere... and we do have flexibility ;) - Brebner
Thanks, do you have a complete example of batching with Fluent NHibernate? - Referendum
There is nothing more than setting .BatchSize(25) on any of your collections. That's it. No more mapping. From that moment NHibernate will behave as described in 19.1.5: nhforge.org/doc/nh/en/index.html#performance-fetching-batch. Using this feature is that easy. Just play with the number... for me 25 was OK, and became the standard ;) (of course, from that moment on do not fetch collections) - Brebner
Really impressed with this feature. I was struggling to load a very deep and complex object graph with eager loading and the Distinct Root Entity Transformer. I was experiencing massive duplication of child collections, which was swelling the entire object graph. I removed the eager loading and implemented batching, and the whole thing populates much, much quicker!! +1 - Scalpel

I concur with @RadimKöhler: as soon as you eager load more than one collection, a Cartesian product always occurs. As for selecting a suitable batch size, I would probably choose the same value as the page size, as it just feels right... (I have no evidence for why, though).

There is another technique that you may feel is a better fit: read this blog post by Ayende, which shows how you can send two future queries at the same time to eager load multiple collections, where the sole job of each query is to load a single collection.
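
A minimal sketch of that idea, written against the ARNomination entity from the question rather than copied from the blog post. The class name, the method signature and the way the ISession is passed in are assumptions, and it targets the older API where ToFuture() returns an IEnumerable (on NHibernate 5 and later you would call GetEnumerable() on the future instead of enumerating it directly):

```csharp
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Linq;

public static class NominationFutureQueries
{
    public static IList<ARNomination> GetByEventId(ISession session, long eventId)
    {
        using (var tx = session.BeginTransaction())
        {
            // Future 1: nominations joined to Contacts only.
            // Keep this one, because it is the result we return.
            var nominations = session.Query<ARNomination>()
                .Where(n => n.Event.Id == eventId)
                .FetchMany(n => n.Contacts)
                .ToFuture();

            // Future 2: the same nominations joined to CustomAttributes only.
            // Its result is not read; it runs so that the collection is
            // initialized on the entities already held by the session.
            session.Query<ARNomination>()
                .Where(n => n.Event.Id == eventId)
                .FetchMany(n => n.CustomAttributes)
                .ToFuture();

            // Enumerating the first future sends all pending futures together.
            // Distinct() guards against repeated root references from the join
            // and is harmless if the provider already removes them.
            var result = nominations.Distinct().ToList();

            tx.Commit();
            return result;
        }
    }
}
```

Because no single statement joins both collections, each query returns rows proportional to one collection's size, not to the product of the two.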

However, whichever route you take, I suggest throwing a profiler at the results to see which performs better for you...

Scorify answered 7/1, 2014 at 13:44 Comment(1)
Nice link to the Ayende post. The page size, I do agree, is about profiling... exactly - Brebner
