Entity Framework Table Per Type Performance
Asked Answered
X

2

6

So it turns out that I am the last person to discover the fundamental floor that exists in Microsoft's Entity Framework when implementing TPT (Table Per Type) inheritance.

Having built a prototype with 3 sub classes, the base table/class consisting of 20+ columns and the child tables consisting of ~10 columns, everything worked beautifully and I continued to work on the rest of the application having proved the concept. Now the time has come to add the other 20 sub types and OMG, I've just started looking the SQL being generated on a simple select, even though I'm only interested in accessing the fields on the base class.

This page has a wonderful description of the problem.

Has anyone gone into production using TPT and EF, are there any workarounds that will mean that I won't have to: a) Convert the schema to TPH (which goes against everything I try to achieve with my DB design - urrrgghh!)? b) rewrite with another ORM?

The way I see it, I should be able to add a reference to a Stored Procedure from within EF (probably using EFExtensions) that has the the TSQL that selects only the fields I need, even using the code generated by EF for the monster UNION/JOIN inside the SP would prevent the SQL being generated every time a call is made - not something I would intend to do, but you get the idea.

The killer I've found, is that when I'm selecting a list of entities linked to the base table (but the entity I'm selecting is not a subclass table), and I want to filter by the pk of the Base table, and I do .Include("BaseClassTableName") to allow me to filter using x=>x.BaseClass.PK == 1 and access other properties, it performs the mother SQL generation here too.

I can't use EF4 as I'm limited to the .net 2.0 runtimes with 3.5 SP1 installed.

Has anyone got any experience of getting out of this mess?

Xenogenesis answered 9/1, 2011 at 23:3 Comment(6)
i can't speak for pre EF4, but in that blog the statement "The Table-Per-Type inheritance is 100% unusable." is a bit ridiculous - at least in EF4. It does have some performance problems yes - but if you have 20 sub types, are you sure you have performed correct database normalization? Can those 20 sub types not be split up into seperate parent entities?Shuffleboard
It seems that other ORMs handle the TPT design without a problem. With regards to the blog, it states that when implementing a database/class design that has a large number of types, the time it takes to generate the SQL before execution is greater than several minutes, I believe that is enough to back up the statement of it being unusable in a production environment. Think of a Vehicle catalogue, with common info stored about every vehicle (the base table/class) and then unique info stored about planes/trains/auto-mobiles (sub table/class) - I'm not dealing with vehicles, but you get the ideaXenogenesis
...and there > 20 different vehicle types being stored, before you even start going into vehicle subtypes and all the lookup values.Xenogenesis
No you are not the last. Just reached here.Chronon
A manager tried to torpedo our project citing this post, with the release of EF6 MS fixed these and related issues, both in the tooling and the runtime performance resulting in TPT actually achieving greater runtime performance than TPC or TPH. Most of it was fixed in EF4, OPs issue was more an issue with early EF, not with the TPT concept itself.Morph
I'm hoping between the .net-3.5 tag and the fact this question was asked 11 years ago, you were able to rule this post out as evidence against the more recent releases of EF for your project.Xenogenesis
D
2

This seems a bit confused. You're talking about TPH, but when you say:

The way I see it, I should be able to add a reference to a Stored Procedure from within EF (probably using EFExtensions) that has the the TSQL that selects only the fields I need, even using the code generated by EF for the monster UNION/JOIN inside the SP would prevent the SQL being generated every time a call is made - not something I would intend to do, but you get the idea.

Well, that's Table per Concrete Class mapping (using a proc rather than a table, but still, the mapping is TPC...). The EF supports TPC, but the designer doesn't. You can do it in code-first if you get the CTP.

Your preferred solution of using a proc will cause performance problems if you restrict queries, like this:

var q = from c in Context.SomeChild
        where c.SomeAssociation.Foo == foo
        select c;

The DB optimizer can't see through the proc implementation, so you get a full scan of the results.

So before you tell yourself that this will fix your results, double-check that assumption.

Note that you can always specify custom SQL for any mapping strategy with ObjectContext.ExecuteStoreQuery.

However, before you do any of this, consider that, as RPM1984 points out, your design seems to overuse inheritance. I like this quote from NHibernate in Action

[A]sk yourself whether it might be better to remodel inheritance as delegation in the object model. Complex inheritance is often best avoided for all sorts of reasons unrelated to persistence or ORM. [Your ORM] acts as a buffer between the object and relational models, but that doesn't mean you can completely ignore persistence concerns when designing your object model.

Denunciatory answered 10/1, 2011 at 16:25 Comment(2)
Thanks Craig, I was only referring to TPH as a last resort - what I'm actually trying to achieve is TPT - I guess I don't consider 1 base class/table and 1 further level of subclasses/tables to be complex inheritance. But then perhaps you're referring to the breadth of my class design being too "widely" complex (~23 subclasses) - although I can't seem to get my head around a cleaner way of persisting these classes.Xenogenesis
Oh, and I was referring to the SP method of querying as an alternative to your suggestion with the ObjectContext.ExecuteStoreQuery - Again, I was just hoping that I had missed something glaringly obvious that one of you wise folk could have pointed me at.Xenogenesis
G
2

We've hit this same problem and are considering porting our DAL from EF4 to LLBLGen because of this.

In the meantime, we've used compiled queries to alleviate some of the pain:

Compiled Queries (LINQ to Entities)

This strategy doesn't prevent the mammoth queries, but the time it takes to generate the query (which can be huge) is only done once.

You'll can use compiled queries with Includes() as such:

static readonly Func<AdventureWorksEntities, int, Subcomponent> subcomponentWithDetailsCompiledQuery = CompiledQuery.Compile<AdventureWorksEntities, int, Subcomponent>(
       (ctx, id) => ctx.Subcomponents
            .Include("SubcomponentType")
            .Include("A.B.C.D")
            .FirstOrDefault(s => s.Id == id));

    public Subcomponent GetSubcomponentWithDetails(int id)
    {
        return subcomponentWithDetailsCompiledQuery.Invoke(ObjectContext, id);
    }
Genic answered 14/1, 2011 at 6:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.