I am using MySQL Connector/Net 6.5.4 with LINQ to entities, and I frequently get terrible query performance because the entity framework generates queries that use derived tables.
Here is a simplified example of what I've encountered several times. In C#, I write a query like this:
var culverCustomers = from cs in db.CustomerSummaries where cs.Street == "Culver" select cs;
// later...
var sortedCustomers = culverCustomers.OrderBy(cs => cs.Name).ToList();
Instead of generating simple a query like this:
SELECT cust.id FROM customer_summary cust WHERE cust.street = "Culver" ORDER BY cust.name
The entity framework generates a query with a derived table like this:
SELECT Project1.id FROM (
SELECT cust.id, cust.name, cust.street FROM customer_summary cust
WHERE Project1.street = "Culver"
) AS Project1 -- here is where the EF generates a pointless derived table
ORDER BY Project1.name
If I explain both queries I get this for the first query:
id, select_type, table, type, possible_keys, rows
1, PRIMARY, addr, ALL, PRIMARY, 9
1, PRIMARY, cust, ref, PRIMARY, 4
... and something awful like this for the entity framework query
id, select_type, table, type, possible_keys, rows
1, PRIMARY, <derived2>, ALL, 9639
2, DERIVED, addr, ALL, PRIMARY, 9
2, DERIVED, cust, ref, PRIMARY, 4
Note the first row, where MySQL explains that it's scanning 9000+ records. Because of the derived table, MySQL is creating a temp table and loading every row. (Or so I'm deducing based on articles like this one: Derived Tables and Views Performance)
How can I prevent the Entity Framework from using a derived table, or how can I convince MySQL to do the obvious optimization for queries like this?
For completion, here is the view that is the source for this linq query:
create view customer_summary as
select cust.id, cust.name, addr.street
customers cust
join addresses addr
on addr.customer_id = cust.id