How to resolve poor NHibernate collection initialization performance

NHibernate 3; retrieving 4,000+ records out of an EAV data schema. When NHibernate (or .NET) initializes those collections for the first time, we're seeing a severe penalty. Subsequent calls appear to perform much more efficiently. Running the same queries in SQL Server Management Studio returns results in the expected quick times.

Using Fluent NHibernate with runtime mapping instead of .hbm.xml; curious whether a serialized mapping would help here?

NHibernate Profiler and log4net logging didn't seem to give me much to go on. A total of roughly 140,000 entities is hydrated in this process.

Attached a screenshot of my dotTrace performance tracing that shows the collection initialization penalty: dotTrace of slow nHibernate collection initialization

Have tried join and eager fetch types, with no apparent results, but am not 100% certain I implemented those correctly -- does just the parent need to be designated, or do the child tables also need to be flagged?

var products = ((HandleSession)_handleSession).Session.CreateCriteria(typeof(Product))
                    .SetFetchMode("Product", FetchMode.Eager)
                    .List<Product>()
                    .AsEnumerable();

With the reflection optimizer enabled (I think) via web.config (screenshot: with reflection optimizer enabled).

This is where most time is spent:

return new ProductList(products.Select(p => p.ToProductContract()));

Which is simply an extension method doing this:

public static ProductContract ToProductContract(this Product product)
{
    return new ProductContract
    {
        Name = product.ProductName,
        ProductTypeName = product.ProductType.ProductTypeName,
        UpdateTimeStamp = product.UpdateDateTime,
        ProductNumber = product.ProductNumber,
        Attributes = product.ProductAttributes.ToCommonAttribute().ToList(),
        GroupCategories = product.ProductGroups.ToGroupCategory().ToList(),
        PublicUniqueId = product.PublicUniqueId
    };
}

Mappings:

internal class ProductMapping : ClassMap<Product>
    {
        private const string _iscurrentindicator = "IsCurrentIndicator=1";

        public ProductMapping()
        {
            Table("Product");
            Id(Reveal.Member<Product>("ProductId")).GeneratedBy.Identity().Column("ProductID");
            Map(x => x.ProductNumber).Column("ProductNumber").Not.Nullable();
            Map(x => x.ProductName).Column("ProductName").Not.Nullable();
            Map(x => x.InsertDateTime).Column("InsertedDateTime").Nullable().ReadOnly();
            Map(x => x.UpdateDateTime).Column("UpdatedDateTime").Nullable();
            Map(x => x.PublicUniqueId).Column("ProductGUID").Generated.Insert();

            References(x => x.ProductType).Column("ProductTypeId").Not.Nullable();
            HasMany(x => x.ProductAttributes)
                .KeyColumn("ProductId")
                .Inverse()
                .Fetch.Subselect()
                .Where(_iscurrentindicator)
                .Cascade.SaveUpdate();

            HasMany(x => x.ProductGroups).KeyColumn("ProductId").Fetch.Subselect().Where(_iscurrentindicator);
            DynamicUpdate();
            DynamicInsert();
            BatchSize(500);
        }
    }

internal class ProductGroupMapping : ClassMap<ProductGroup>
    {
        public ProductGroupMapping()
        {
            Table("ProductGroup");
            Id(x => x.ProductGroupId).Column("ProductGroupId").GeneratedBy.Identity();
            References(x => x.Product).Column("ProductId").Not.Nullable();
            References(x => x.Group).Column("GroupId").Not.Nullable();
            //Where("IsCurrentIndicator=1");
        }
    }

internal class ProductAttributeMapping : ClassMap<ProductAttribute>
    {
        public ProductAttributeMapping()
        {
            Table("ProductAttribute");
            LazyLoad();
            Id(x => x.ProductAttributeId).GeneratedBy.Identity().Column("ProductAttributeID");
            References(x => x.Product).Column("ProductID").Not.Nullable();
            References(x => x.Attribute).Column("AttributeID").Not.Nullable().Fetch.Join();
            Map(x => x.PositionNumber).Column("PositionNumber").Nullable();
            Map(x => x.ValueText).Column("ValueText").Nullable();
            Map(x => x.ValueBinary).Column("ValueBinary").Nullable();

            Component(x => x.OperationalAuditHistory, m =>
                        {
                            Table("ProductAttribute");
                            m.Map(x => x.ExpirationDateTime).Column("ExpirationDateTime").Nullable();
                            m.Map(x => x.IsCurrent).Column("IsCurrentIndicator").Not.Nullable();
                            m.Map(x => x.OperationCode).Column("OperationCode").Nullable();
                            m.Map(x => x.OperationDateTime).Column("OperationDateTime").Nullable();
                            m.Map(x => x.OperationSystemName).Column("OperationSystemName").Nullable();
                            m.Map(x => x.OperationUserName).Column("OperationUserName").Nullable();
                            m.Map(x => x.LastUserPriority).Column("LastUserPriority").Nullable();
                        });

            DynamicInsert();
            BatchSize(50);
        }
    }

Unfortunately, with .Future I still appear to get similar results. Here's a new trace; I've switched to Release and x64 for the key projects for the moment, so the times are lower, but the proportions are still pretty much the same. The same holds with .Eager:

var products = ((HandleSession) _handleSession).Session.CreateCriteria(typeof (Product))
                    .SetFetchMode("ProductAttribute", FetchMode.Join)
                    .SetFetchMode("ProductGroup", FetchMode.Join)
                    .SetFetchMode("ProductType", FetchMode.Join)
                    .Future<Product>()
                    .AsEnumerable();

dotTrace - Release mode, targeting x64, with .Future()

Generated SQL with .Eager and .Future in place:

SELECT this_.ProductID as ProductID0_1_, this_.ProductNumber as ProductN2_0_1_, this_.ProductName as ProductN3_0_1_, this_.InsertedDateTime as Inserted4_0_1_, this_.UpdatedDateTime as UpdatedD5_0_1_, this_.ProductGUID as ProductG6_0_1_, this_.ProductTypeId as ProductT7_0_1_, producttyp2_.ProductTypeID as ProductT1_6_0_, producttyp2_.ProductTypeName as ProductT2_6_0_ FROM Product this_ inner join ProductType producttyp2_ on this_.ProductTypeId=producttyp2_.ProductTypeID;

SELECT productatt0_.ProductId as ProductId2_, productatt0_.ProductAttributeID as ProductA1_2_, productatt0_.ProductAttributeID as ProductA1_2_1_, productatt0_.PositionNumber as Position2_2_1_, productatt0_.ValueText as ValueText2_1_, productatt0_.ValueBinary as ValueBin4_2_1_, productatt0_.ProductID as ProductID2_1_, productatt0_.AttributeID as Attribut6_2_1_, productatt0_.ExpirationDateTime as Expirati7_2_1_, productatt0_.IsCurrentIndicator as IsCurren8_2_1_, productatt0_.OperationCode as Operatio9_2_1_, productatt0_.OperationDateTime as Operati10_2_1_, productatt0_.OperationSystemName as Operati11_2_1_, productatt0_.OperationUserName as Operati12_2_1_, productatt0_.LastUserPriority as LastUse13_2_1_, attribute1_.AttributeId as Attribut1_1_0_, attribute1_.AttributeName as Attribut2_1_0_, attribute1_.DisplayName as DisplayN3_1_0_, attribute1_.DataTypeName as DataType4_1_0_, attribute1_.ConstraintText as Constrai5_1_0_, attribute1_.ConstraintMin as Constrai6_1_0_, attribute1_.ConstraintMax as Constrai7_1_0_, attribute1_.ValuesMin as ValuesMin1_0_, attribute1_.ValuesMax as ValuesMax1_0_, attribute1_.Precision as Precision1_0_ FROM ProductAttribute productatt0_ inner join Attribute attribute1_ on productatt0_.AttributeID=attribute1_.AttributeId WHERE (productatt0_.IsCurrentIndicator=1) and productatt0_.ProductId in (select this_.ProductID FROM Product this_ inner join ProductType producttyp2_ on this_.ProductTypeId=producttyp2_.ProductTypeID)

SELECT productgro0_.ProductId as ProductId1_, productgro0_.ProductGroupId as ProductG1_1_, productgro0_.ProductGroupId as ProductG1_3_0_, productgro0_.ProductId as ProductId3_0_, productgro0_.GroupId as GroupId3_0_ FROM ProductGroup productgro0_ WHERE (productgro0_.IsCurrentIndicator=1) and productgro0_.ProductId in (select this_.ProductID FROM Product this_ inner join ProductType producttyp2_ on this_.ProductTypeId=producttyp2_.ProductTypeID)

Brigette answered 15/4, 2011 at 21:45 Comment(6)
try turning reflection optimizer on - Volva
was unfamiliar with that, will take a look, thanks -- do you happen to have code for it, as opposed to the xml config I seem to be finding via Google? - Brigette
Adding screenshot with reflection optimization enabled via web.config (I think). - Brigette
reflection optimization is on by default, unless you specifically turned it off. Please post your mappings for your collections. - Requisite
I don't think you can optimize it further at this point. I think you will need to drill down into which data you actually need and narrow it down using HQL in your case. 140,000 entities is a lot for NHibernate to process, and I have yet to find an optimization that will handle that. I am suffering from the same performance penalty myself, and will try to solve it by using simpler presentation entities that I populate with basic HQL-fetched fields myself. It all comes down to the usage scenario. - Requisite
Thanks jishi, that's kind of what I've concluded as well. And/or to avoid NHibernate entirely for this operation. - Brigette

1) A serialized mapping will only help to reduce the time required to build the SessionFactory. If the above query is not the first access to the database, it will not accomplish anything in that regard.
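For completeness, "serialized mapping" here usually means caching the built NHibernate Configuration to disk so that later application starts can skip the mapping-compilation step. A minimal sketch, assuming Fluent NHibernate; the cache file name and database setup are illustrative, and the cache would need to be invalidated whenever the mappings change:

// Sketch: cache the built NHibernate Configuration so later startups skip
// rebuilding it from the Fluent mappings. Requires NHibernate.Cfg,
// FluentNHibernate.Cfg, FluentNHibernate.Cfg.Db, System.IO and
// System.Runtime.Serialization.Formatters.Binary.
public static Configuration LoadOrBuildConfiguration()
{
    const string cacheFile = "nhibernate.config.cache"; // illustrative name

    if (File.Exists(cacheFile))
    {
        // Reuse the previously built configuration instead of recompiling mappings.
        using (var stream = File.OpenRead(cacheFile))
            return (Configuration)new BinaryFormatter().Deserialize(stream);
    }

    var configuration = Fluently.Configure()
        .Database(MsSqlConfiguration.MsSql2008
            .ConnectionString(c => c.FromConnectionStringWithKey("Default")))
        .Mappings(m => m.FluentMappings.AddFromAssemblyOf<ProductMapping>())
        .BuildConfiguration();

    // Configuration is serializable, so it can be cached to disk.
    using (var stream = File.OpenWrite(cacheFile))
        new BinaryFormatter().Serialize(stream, configuration);

    return configuration;
}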

2) SetFetchMode needs to be applied to the child collections, like this:

var products = ((HandleSession)_handleSession).Session.CreateCriteria(typeof(Product))
                .SetFetchMode("ProductChildren", FetchMode.Eager)
                .List<Product>()
                .AsEnumerable();

3) This looks like an N+1 select problem, if I interpret the methods in the screenshots correctly. Are you transforming the Products in your query result into a list of ProductDTOs? If so, it seems as if the child collections are lazily loaded from the DB within a loop.
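To make point 3 concrete, the N+1 pattern being described generally looks like the sketch below; whether it is actually happening here depends on the lazy/subselect settings in the mappings:

// Sketch of an N+1 select: one query for the parents, then additional
// queries per parent when its lazy collections are first touched inside
// the projection loop.
var products = session.CreateCriteria(typeof(Product)).List<Product>(); // 1 query

var contracts = products
    .Select(p => p.ToProductContract()) // touches p.ProductAttributes / p.ProductGroups
    .ToList();                          // lazy initialization happens here, per product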

Edit:

In order to combat the N+1 select, we will have to tell NHibernate to load everything beforehand, preferably with Futures. Here is a potential solution that fetches all of your data from the DB with a handful of SELECT statements. I did not include any WHERE conditions; you will have to add those accordingly.

// any where-condition will have to be applied here and in the subsequent queries
var products = session.QueryOver<Product>()
    .Future();

var products2 = session.QueryOver<Product>()
    .Fetch(p => p.ProductType).Eager
    .Future();

var products3 = session.QueryOver<Product>()
    .Fetch(p => p.ProductAttributes).Eager
    .Future();

var products4 = session.QueryOver<Product>()
    .Fetch(p => p.ProductGroups).Eager
    .Future();

// Here we execute all of the above queries in one roundtrip.
// Since we already have all the data we could possibly want, there is no need
// for a N+1 Select.
return new ProductList(products.Select(p => p.ToProductContract()));
Leah answered 16/4, 2011 at 0:18 Comment(15)
It has been the first access to the DB in my testing so far; I just got on this project, so I haven't had a chance to really flesh out testing scenarios. I'm not sure if it's a first-access issue or not; I will find out. - Brigette
Yes, I believe that is effectively what is being done in ToProductList. I will verify on Monday. - Brigette
Applying .Eager to the two children of the Product table has not improved performance. Also, it is not a first-access issue, so serialized mapping is probably not going to gain me anything. It is the conversion of entities to DTOs that is in fact the issue. Will post code above; is there a more efficient means of populating these collections, or is this an inevitable effect of the EAV schema? - Brigette
@Brigette Did you try to apply FetchMode.Eager to .ProductType, .ProductAttributes and .ProductGroups? All of these will definitely cause an N+1 select because of the loop. Oh, and .ToCommonAttribute() and .ToGroupCategory() may cause a similar issue. To get around this you will probably have to get all the data first, before you transform the objects into DTOs. It really looks like loading 4,000+ entities from the database, with child collections and grandchildren, results in 140,000 select statements. You can verify that in the log4net NHibernate.SQL output. - Leah
@Brigette If you comment out the three lines for ProductTypeName, Attributes and GroupCategories and then uncomment each one, you should be able to narrow down the issue. Also, see if .ToCommonAttribute() and .ToGroupCategory() are lazy loading anything else from the database. - Leah
Yes, as stated, 140,000 calls are made and entities are hydrated here. Initially the default select join was used; I tried .Join and .Eager with no obviously detectable change in behavior or performance. There are only two queries shown in SQL Server Profiler or NHibernate Profiler ... - Brigette
It should be at least 3 queries: one for Products, and two subsequent ones, one for each collection, in order to prevent a cartesian product. - Requisite
Commenting out those lines unfortunately results in no results being returned, period, without further editing at some as-yet undetermined location. - Brigette
@Brigette I added a solution with QueryOver. If you need the ICriteria (MultiCriteria) version and don't know how to translate that, let me know. - Leah
Unfortunately .Future doesn't seem to be doing anything to help. Code and dotTrace above. - Brigette
@Brigette Could you post the SQL that is generated? I am starting to think that my assumption regarding the N+1 select was wrong. (.Future() only helps when having more than one query; that's why I added it to my suggestion.) - Leah
@Brigette I think I may have been approaching this from the wrong angle. As jishi stated, NHibernate may be too slow for this amount of data when it has to transform the 140K+ rows in the result set into actual entities. Since you are converting the entities to DTOs anyway, it might make more sense to create the DTOs directly from the query (projections), thus freeing NHibernate from the responsibility of hydrating complex entities. If you are open to that solution, I will see if I can put something together. (I am not sure whether this will even work, but I am willing to give it a try.) - Leah
thx Florian; this was something another dev and I had discussed, but neither of us was quite familiar with what that looked like, and we hadn't had a chance to explore it yet. - Brigette
@Brigette Just to let you know, I tested a few possible solutions, but even using a StatelessSession and projections (I did not solve the issue with child collections, but never mind) you would only be able to save about 60-70% of the time, which would still take too long, I suppose. So I concur with jishi and suggest what you apparently came up with now: don't use NH in that particular case, but resort to plain SQL and put the entities together yourself. - Leah
Thanks Florian; I couldn't get stateless sessions to work due to all the attribute collections. Any chance you could share your code on that, or on projections, which I started on as well but had my priorities changed for me by external dependencies? We managed to improve it somewhat significantly by flattening or removing a component association; I'm not 100% sure what was done here, I'll review the code next week, another dev was on this. In the meanwhile I've also implemented simple AppFabric caching of materialized objects, along with a service startup priming action for this "big" set/map. - Brigette

One option is to enable batch-size on your collections. I assume those are lazy, and with batch size enabled, it would try to fetch collections for multiple entities in a single roundtrip.

It doesn't make a difference if you fetch one entity with one collection, but it can make a huge difference if you select 1,000 entities which each have one collection. Using a batch-size of 1000 would result in 2 queries instead of 1001.
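As a rough sketch against the mappings in the question (assuming Fluent NHibernate exposes BatchSize on HasMany; the value 1000 is illustrative), the batch-size would go on the collection itself, inside the ProductMapping constructor:

// Fragment of the ProductMapping constructor from the question, with a
// collection-level batch-size so that the collections of many Products are
// initialized per roundtrip instead of one Product at a time.
HasMany(x => x.ProductAttributes)
    .KeyColumn("ProductId")
    .Inverse()
    .Where(_iscurrentindicator)
    .BatchSize(1000)          // batch-size on the collection, not the class
    .Cascade.SaveUpdate();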

Tried to find some documentation, but only found this example:

nhibernate alternates batch size

Using join strategies in your case would result in gigantic result sets, so that is not a good option. A better option would be to use FetchMode.Select, which would explicitly force your collections to be loaded in a subsequent roundtrip.
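A sketch of that against the criteria query from the question (the session access and association names are taken from the question's code and mappings):

// Sketch: join only the small many-to-one, and force the collections to be
// loaded through separate (batchable) SELECTs rather than one giant join.
var products = ((HandleSession)_handleSession).Session.CreateCriteria(typeof(Product))
    .SetFetchMode("ProductType", FetchMode.Join)
    .SetFetchMode("ProductAttributes", FetchMode.Select)
    .SetFetchMode("ProductGroups", FetchMode.Select)
    .List<Product>();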

Another thing that could improve performance is setting:

Session.FlushMode = FlushMode.Never;

which disables automatic flushing of the session. This is useful if all you are actually doing is reading data, not modifying it. This only helps, however, if you are seeing calls to IsDirty, or other checks for dirty objects, in your call stack.

Requisite answered 18/4, 2011 at 15:1 Comment(7)
Batch size is already enabled; it was 50, I tried 500 also, no change in behavior. - Brigette
Also, there are only two queries here -- it's the collection initialization and population that is taking all the time. Well, that, and the MVC engine rendering the results. - Brigette
My experience is that NHibernate is pretty slow at transforming the data from a result set into entities when you reach a decent amount of objects. According to the screenshot, you deal with results as big as 100,000+ rows, is that correct? - Requisite
How many rows do you actually have in your DB for Product, CommonAttribute and GroupCategory for this specific query? - Requisite
~4,000 Products, 2 ProductTypes, 60 Attributes, ~140,000 ProductAttribute rows. - Brigette
I believe a result set that is only a few columns wide but 140,000 rows long shouldn't be as slow as indicated. But if it is part of multiple joins, then the result it needs to fetch is way bigger. I think we need to know more about your data structure in order to give you useful hints. - Requisite
I've come to suspect the issue might very much be an outcome of someone else's previous decision to use an EAV schema. I'm going to try Florian's setup above; if that doesn't work, I'll post the schema for further investigation or comment, thanks! - Brigette

If you are using this session only for reporting, you have to use Stateless Sessions: http://nhforge.org/blogs/nhibernate/archive/2008/10/30/bulk-data-operations-with-nhibernate-s-stateless-sessions.aspx
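A minimal sketch of what that could look like for this read-only path (the sessionFactory variable is an assumption; note that a stateless session has no lazy loading, so child collections such as ProductAttributes would have to be fetched eagerly or queried separately):

// Sketch: read-only retrieval through a stateless session - no first-level
// cache, no dirty checking, and no lazy loading.
using (IStatelessSession statelessSession = sessionFactory.OpenStatelessSession())
{
    IList<Product> products = statelessSession.QueryOver<Product>()
        .Fetch(p => p.ProductType).Eager
        .List();

    // Map to DTOs here, while the stateless session is still open.
}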

Brookes answered 20/4, 2011 at 5:3 Comment(1)
Thanks, this was mentioned on the nhusers list last night as well, I will give it a look! - Brigette
