LINQ Method Syntax with INNER and OUTER Join
Asked Answered
L

3

5

I have 3 classes and trying to use LINQ methods to perform an INNER JOIN and a LEFT JOIN. I'm able to perform each separately, but no luck together since I can't even figure out the syntax.

Ultimately, the SQL I'd write would be:

SELECT *
FROM [Group] AS [g]
INNER JOIN [Section] AS [s] ON [s].[GroupId] = [g].[Id]
LEFT OUTER JOIN [Course] AS [c] ON [c].[SectionId] = [s].[Id]

Classes

public class Group {
    public int Id { get; set; }
    public int Name { get; set; }
    public bool IsActive { get; set; }
    public ICollection<Section> Sections { get; set; }
}

public class Section {
    public int Id { get; set; }
    public int Name { get; set; }
    public int GroupId { get; set; }
    public Group Group { get; set; }
    public bool IsActive { get; set; }
    public ICollection<Course> Courses { get; set; }
}

public class Course {
    public int Id { get; set; }
    public int UserId { get; set; }
    public int Name { get; set; }
    public int SectionId { get; set; }
    public bool IsActive { get; set; }
}

Samples

I want the result to be of type Group. I successfully performed the LEFT JOIN between Section and Course, but then I have an object of type IQueryable<a>, which is not what I want, sinceGroup`.

var result = db.Section
               .GroupJoin(db.Course, 
                    s => s.Id,
                    c => c.SectionId,
                    (s, c) => new { s, c = c.DefaultIfEmpty() })
               .SelectMany(s => s.c.Select(c => new { s = s.s, c }));

I also tried this, but returns NULL because this performs an INNER JOIN on all tables, and the user has not entered any Courses.

var result = db.Groups
               .Where(g => g.IsActive)
               .Include(g => g.Sections)
               .Include(g => g.Sections.Select(s => s.Courses))
               .Where(g => g.Sections.Any(s => s.IsActive && s.Courses.Any(c => c.UserId == _userId && c.IsActive)))
               .ToList();

Question

How can I perform an INNER and a LEFT JOIN with the least number of calls to the database and get a result of type Group?

Desired Result

I would like to have 1 object of type Group, but only as long as a Group has a Section. I also want to return the Courses the user has for the specific Section or return NULL.

Lizarraga answered 18/7, 2018 at 20:8 Comment(0)
C
3

I think what you ask for is impossible without returning a new (anonymous) object instead of Group (as demonstrated in this answer). EF will not allow you to get a filtered Course collection inside a Section because of the way relations and entity caching works, which means you can't use navigational properties for this task.

First of all, you want to have control over which related entities are loaded, so I suggest to enable lazy loading by marking the Sections and Courses collection properties as virtual in your entities (unless you've enabled lazy loading for all entities in your application) as we don't want EF to load related Sections and Courses as it would load all courses for each user anyway.

public class Group {
    public int Id { get; set; }
    public int Name { get; set; }
    public bool IsActive { get; set; }
    public virtual ICollection<Section> Sections { get; set; }
}

public class Section {
    public int Id { get; set; }
    public int Name { get; set; }
    public int GroupId { get; set; }
    public Group Group { get; set; }
    public bool IsActive { get; set; }
    public virtual ICollection<Course> Courses { get; set; }
}

In method syntax, the query would probably look something like this:

var results = db.Group
    .Where(g => g.IsActive)
    .GroupJoin(
        db.Section.Where(s => s.IsActive),
        g => g.Id,
        s => s.GroupId,
        (g, s) => new
        {
            Group = g,
            UserSections = s
                .GroupJoin(
                    db.Course.Where(c => c.IsActive && c.UserId == _userId).DefaultIfEmpty(),
                    ss => ss.Id,
                    cc => cc.SectionId,
                    (ss, cc) => new
                    {
                        Section = ss,
                        UserCourses = cc
                    }
                )
        })
    .ToList();

And you would consume the result as:

foreach (var result in results)
{
    var group = result.Group;

    foreach (var userSection in result.UserSections)
    {
        var section = userSection.Section;

        var userCourses = userSection.UserCourses;

    }
}

Now, if you don't need additional filtering of the group results on database level, you can as well go for the INNER JOIN and LEFT OUTER JOIN approach by using this LINQ query and do the grouping in-memory:

var results = db.Group
    .Where(g => g.IsActive)
    .Join(
        db.Section.Where(s => s.IsActive),
        g => g.Id,
        s => s.GroupId,
        (g, s) => new
        {
            Group = g,
            UserSection = new
            {
                Section = s,
                UserCourses = db.Course.Where(c => c.IsActive && c.UserId == _userId && c.SectionId == s.Id).DefaultIfEmpty()
            }
        })
    .ToList() // Data gets fetched from database at this point
    .GroupBy(x => x.Group) // In-memory grouping
    .Select(x => new
    {
        Group = x.Key,
        UserSections = x.Select(us => new
        {
            Section = us.UserSection,
            UserCourses = us.UserSection.UserCourses
        })
    });

Remember, whenever you're trying to access group.Sections or section.Courses, you will trigger the lazy loading which will fetch all child section or courses, regardless of _userId.

Candelaria answered 18/7, 2018 at 22:12 Comment(5)
Thank! Looks like a promising answer. I'll work on it and let you know the results. I Actually lazy load (mark virtual on all my properties), which is why my initial sample code uses .Include(...) to fetch them right away. Does that affect your answer?Lizarraga
Just wanted to make sure you had lazy loading enabled. Also, if you only need read access, use db.Group.AsNoTracking() instead of db.Group.Candelaria
o.0 Thanks! Didn't know about .AsNoTracking(). I assume this saves some server resources? Or what else is it good for?Lizarraga
It improves query speed greatly as it tells EF not to track the returned entities for changes.Candelaria
Thanks. I have not yet tested it. I Will test it soon and let you know the results.Lizarraga
G
3

Use DefaultIfEmpty to perform an outer left join

from g in db.group
join s in db.section on g.Id equals s.GroupId 
join c in db.course on c.SectionId equals s.Id into courseGroup
from cg in courseGroup.DefaultIfEmpty()
select new { g, s, c }; 
Gaspar answered 18/7, 2018 at 20:14 Comment(3)
Is there a way to do this with LINQ Methods instead of LINQ Query?Lizarraga
This is true, but I like to learn different methods and wanted a LINQ Methods version. =) I guess I should have stated I can already do it with LINQ Query =)Lizarraga
As I understand, c can't be used in select and cg should be used: select new { g, s, cg };Sissel
C
3

I think what you ask for is impossible without returning a new (anonymous) object instead of Group (as demonstrated in this answer). EF will not allow you to get a filtered Course collection inside a Section because of the way relations and entity caching works, which means you can't use navigational properties for this task.

First of all, you want to have control over which related entities are loaded, so I suggest to enable lazy loading by marking the Sections and Courses collection properties as virtual in your entities (unless you've enabled lazy loading for all entities in your application) as we don't want EF to load related Sections and Courses as it would load all courses for each user anyway.

public class Group {
    public int Id { get; set; }
    public int Name { get; set; }
    public bool IsActive { get; set; }
    public virtual ICollection<Section> Sections { get; set; }
}

public class Section {
    public int Id { get; set; }
    public int Name { get; set; }
    public int GroupId { get; set; }
    public Group Group { get; set; }
    public bool IsActive { get; set; }
    public virtual ICollection<Course> Courses { get; set; }
}

In method syntax, the query would probably look something like this:

var results = db.Group
    .Where(g => g.IsActive)
    .GroupJoin(
        db.Section.Where(s => s.IsActive),
        g => g.Id,
        s => s.GroupId,
        (g, s) => new
        {
            Group = g,
            UserSections = s
                .GroupJoin(
                    db.Course.Where(c => c.IsActive && c.UserId == _userId).DefaultIfEmpty(),
                    ss => ss.Id,
                    cc => cc.SectionId,
                    (ss, cc) => new
                    {
                        Section = ss,
                        UserCourses = cc
                    }
                )
        })
    .ToList();

And you would consume the result as:

foreach (var result in results)
{
    var group = result.Group;

    foreach (var userSection in result.UserSections)
    {
        var section = userSection.Section;

        var userCourses = userSection.UserCourses;

    }
}

Now, if you don't need additional filtering of the group results on database level, you can as well go for the INNER JOIN and LEFT OUTER JOIN approach by using this LINQ query and do the grouping in-memory:

var results = db.Group
    .Where(g => g.IsActive)
    .Join(
        db.Section.Where(s => s.IsActive),
        g => g.Id,
        s => s.GroupId,
        (g, s) => new
        {
            Group = g,
            UserSection = new
            {
                Section = s,
                UserCourses = db.Course.Where(c => c.IsActive && c.UserId == _userId && c.SectionId == s.Id).DefaultIfEmpty()
            }
        })
    .ToList() // Data gets fetched from database at this point
    .GroupBy(x => x.Group) // In-memory grouping
    .Select(x => new
    {
        Group = x.Key,
        UserSections = x.Select(us => new
        {
            Section = us.UserSection,
            UserCourses = us.UserSection.UserCourses
        })
    });

Remember, whenever you're trying to access group.Sections or section.Courses, you will trigger the lazy loading which will fetch all child section or courses, regardless of _userId.

Candelaria answered 18/7, 2018 at 22:12 Comment(5)
Thank! Looks like a promising answer. I'll work on it and let you know the results. I Actually lazy load (mark virtual on all my properties), which is why my initial sample code uses .Include(...) to fetch them right away. Does that affect your answer?Lizarraga
Just wanted to make sure you had lazy loading enabled. Also, if you only need read access, use db.Group.AsNoTracking() instead of db.Group.Candelaria
o.0 Thanks! Didn't know about .AsNoTracking(). I assume this saves some server resources? Or what else is it good for?Lizarraga
It improves query speed greatly as it tells EF not to track the returned entities for changes.Candelaria
Thanks. I have not yet tested it. I Will test it soon and let you know the results.Lizarraga
P
1

Your SQL's type is not [Group] (Type group would be: select [Group].* from ...), anyway if you want it like that, then in its simple form it would be:

var result = db.Groups.Where( g => g.Sections.Any() );

However, if you really wanted to convert your SQL, then:

var result = from g in db.Groups
             from s in g.Sections
             from c in s.Courses.DefaultIfEmpty()
             select new {...};

Even this would do:

var result = from g in db.Groups
             select new {...};

Hint: In a well designed database with relations, you very rarely need to use join keyword. Instead use navigational properties.

Preemie answered 18/7, 2018 at 20:18 Comment(8)
Thanks. Got it. This is very well designed with proper relations IMO, however the reason I need this is because I want all the active Sections, but a student may not have Courses in all Sections. Maybe my knowledge of LINQ is not =) and I might be doing something wrong. Going to test something.Lizarraga
@RoLYroLLs, OK. I suggest you to download and use the wonderful LinqPad (linqPad.net) tool. When you add your connection to it, it would automatically deduct the model and help you in writing the complex Linq.Preemie
Yeah I thought about it. Let me ask you a side-question before I go further: If I write something like var result = db.Groups.Include(g => g.Sections) would it return Groups that do not have Sections assigned yet or would Sections be NULL?Lizarraga
@RoLYroLLs, yes it would. Linqpad is the fastest way to test the idea (and get SQL, lambda calls as well:) rereading your question not sure if I understood correctly, best test it to see yourself.Preemie
Ah!, then I have a totally different question to ask. I thought .Include(...) was like a INNER JOIN, which then would not return any Groups without a Section. doh! I'll do some tests and post a new question.Lizarraga
@RoLYroLLs, I edited my answer really. I am not sure if I understood it right. Please test it to see what it returns. You get the groups, with or without sections.Preemie
Let us continue this discussion in chat.Preemie
Ok, so Include(...) does work like an INNER JOIN, which is why I suspected I needed to do something differently. But I wanted a LINQ Method version and not a LINQ Query version.Lizarraga

© 2022 - 2024 — McMap. All rights reserved.