Entity Framework - Eager load two many-to-many relationships
Asked Answered
C

3

8

Sorry for this being so long, but at least I think I got all info to be able to understand and maybe help?

I would like to load data from my database using eager loading.

The data is set up in five tables, setting up two Levels of m:n relations. So there are three tables containing data (ordered in a way of hierarchy top to bottom):

CREATE TABLE [dbo].[relations](
    [relation_id] [bigint] NOT NULL
)

CREATE TABLE [dbo].[ways](
    [way_id] [bigint] NOT NULL
)

CREATE TABLE [dbo].[nodes](
    [node_id] [bigint] NOT NULL,
    [latitude] [int] NOT NULL,
    [longitude] [int] NOT NULL
)

The first two really only consist of their own ID (to hook other data not relevant here into).

In between these three data tables are two m:n tables, with a sorting hint:

CREATE TABLE [dbo].[relations_ways](
    [relation_id] [bigint] NOT NULL,
    [way_id] [bigint] NOT NULL,
    [sequence_id] [smallint] NOT NULL
)

CREATE TABLE [dbo].[ways_nodes](
    [way_id] [bigint] NOT NULL,
    [node_id] [bigint] NOT NULL,
    [sequence_id] [smallint] NOT NULL
)

This is, essentially, a part of the OpenStreetMap data structure. I let Entity Framework build it's objects from this database and it set up the classes exactly as the tables are. The m:n tables do really exist as class. (I understand in EF you can build your objects m:n relation without having the explicit in-between class - should I try to change the object model in this way?)




What I want to do: My entry point is exactly one item of relation.

I think it would be best to first eager load the middle m:n relation, and then in a loop iterate over that and eager load the lowest one. I try to do that in the following way

IQueryable<relation> query = context.relations;
query = query.Where( ... ); // filters down to exactly one
query = query.Include(r => r.relation_members);
relation rel = query.SingleOrDefault();

That loads the relation and all it's 1:n info in just one trip to the database - ok, good. But I noticed it only loads the 1:n table, not the middle data table "ways".

This does NOT change if I modify the line like so:

query = query.Include(r => r.relation_members.Select(rm => rm.way));

So I cannot get the middle level loaded here, it seems?

What I cannot get working at all is load the node level of data eagerly. I tried the following:

foreach (relation_member rm in rel.relation_members) {
    IQueryable<way_node> query = rm.way.way_nodes.AsQueryable();
    query = query.Include(wn => wn.node);
    query.Load();
}

This does work and eagerly loads the middle level way and all 1:n info of way_node in one statement for each iteration, but not the Information from node (latitude/longitude). If I access one of these values I trigger another trip to the database to load one single node object.

This last trip is deadly, since I want to load 1 relation -> 300 ways which each way -> 2000 nodes. So in the end I am hitting the server 1 + 300 + 300*2000... room for improvment, I think.

But how? I cannot get this last statement written in valid syntax AND eager loading. Out of interest; is there a way to load the whole object graph in one trip, starting with one relation?

Chavannes answered 11/1, 2014 at 17:1 Comment(0)
P
8

Loading the whole graph in one roundtrip would be:

IQueryable<relation> query = context.relations;
query = query.Where( ... ); // filters down to exactly one
query = query.Include(r => r.relation_members
    .Select(rm => rm.way.way_nodes
        .Select(wn => wn.node)));
relation rel = query.SingleOrDefault();

However, since you say that the Include up to ...Select(rm => rm.way) didn't work it is unlikely that this will work. (And if it would work the performance possibly isn't funny due to the complexity of the generated SQL and the amount of data and entities this query will return.)

The first thing you should investigate further is why .Include(r => r.relation_members.Select(rm => rm.way)) doesn't work because it seems correct. Is your model and mapping to the database correct?

The loop to get the nodes via explicit loading should look like this:

foreach (relation_member rm in rel.relation_members) {
    context.Entry(rm).Reference(r => r.way).Query()
        .Include(w => w.way_nodes.Select(wn => wn.node))
        .Load();
}
Parasiticide answered 11/1, 2014 at 18:33 Comment(3)
Both solutions, full load as well as iterative partial, work prefect as shown here, no Change needed. Thanks very much for it. I have three new properties to learn now: .Entry .Reference .Query<br />Interestingly, full load is roughly 100 times faster than step-by-step. Moving much redundant data across the network does hurt far less then issuing a lot of new small queries for small resultset. Maybe if the second query was prepared to save recompiling?Chavannes
Regarding eager loading the way entity I cannot get it to work. I think all mappings are ok, they are autogenerated from database and I did not change them. Can it be that they do not get loaded because the entity consists only of it's own ID with no other fields? The id is known when the relations other side is loaded, so there is not much sense in joining another table for a number I already know?! Doesn't really matter - all works as I'd like it to.Chavannes
@Ralf: If you use EF 5 and .NET 4.5 the query is only compiled once and then cached by EF automatically. So EF wouldn't have to recompile the queries in the loop. For .NET 4.0 however the queries are not cached. About the entity with only ID property: Well, that would be interesting if EF "thinks" it is not required because it doesn't contain additional information. I don't think it's the reason why eager loading didn't work, but who knows... :)Parasiticide
T
1

Include() for some reason sometimes gets ignored when there is sorting/grouping/joining involved.

In most cases you can rewrite an Include() as a Select() into an anonymous intermediary object:

Before:

context.Invoices
  .Include(invoice => invoice .Positions)
  .ToList();

After:

context.Invoices
  .Select(invoice  => new {invoice, invoice.Positions})
  .AsEnumerable()
  .Select(x => x.invoice)
  .ToList();

This way the query never should loose Include() information.

Ten answered 25/9, 2015 at 8:10 Comment(0)
C
-1
//get an associate book to an author
var datatable = _dataContext.Authors
                            .Where(x => authorids.Contains(x.AuthorId))
                            .SelectMany(x => x.Books)
                            .Distinct();
Conveyance answered 22/11, 2019 at 20:34 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.