How to optimize execution plan for query with multiple outer joins to huge tables, group by and order by clauses?

I have the following database (simplified):

CREATE TABLE `tracking` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `manufacture` varchar(100) NOT NULL,
  `date_last_activity` datetime NOT NULL,
  `date_created` datetime NOT NULL,
  `date_updated` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `manufacture` (`manufacture`),
  KEY `manufacture_date_last_activity` (`manufacture`, `date_last_activity`),
  KEY `date_last_activity` (`date_last_activity`)
) ENGINE=InnoDB AUTO_INCREMENT=401353 DEFAULT CHARSET=utf8

CREATE TABLE `tracking_items` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `tracking_id` int(11) NOT NULL,
  `tracking_object_id` varchar(100) NOT NULL,
  `tracking_type` int(11) NOT NULL COMMENT 'Its used to specify the type of each item, e.g. car, bike, etc',
  `date_created` datetime NOT NULL,
  `date_updated` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `tracking_id` (`tracking_id`),
  KEY `tracking_object_id` (`tracking_object_id`),
  KEY `tracking_id_tracking_object_id` (`tracking_id`,`tracking_object_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1299995 DEFAULT CHARSET=utf8

CREATE TABLE `cars` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `car_id` varchar(255) NOT NULL COMMENT 'It must be VARCHAR, because the data is coming from external source.',
  `manufacture` varchar(255) NOT NULL,
  `car_text` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `date_order` datetime NOT NULL,
  `date_created` datetime NOT NULL,
  `date_updated` datetime NOT NULL,
  `deleted` tinyint(4) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `car_id` (`car_id`),
  KEY `sort_field` (`date_order`)
) ENGINE=InnoDB AUTO_INCREMENT=150000025 DEFAULT CHARSET=utf8

This is my "problematic" query, which runs extremely slowly.

SELECT sql_no_cache `t`.*,
       count(`t`.`id`) AS `cnt_filtered_items`
FROM `tracking` AS `t`
INNER JOIN `tracking_items` AS `ti` ON (`ti`.`tracking_id` = `t`.`id`)
LEFT JOIN `cars` AS `c` ON (`c`.`car_id` = `ti`.`tracking_object_id`
                            AND `ti`.`tracking_type` = 1)
LEFT JOIN `bikes` AS `b` ON (`b`.`bike_id` = `ti`.`tracking_object_id`
                            AND `ti`.`tracking_type` = 2)
LEFT JOIN `trucks` AS `tr` ON (`tr`.`truck_id` = `ti`.`tracking_object_id`
                            AND `ti`.`tracking_type` = 3)
WHERE (`t`.`manufacture` IN('1256703406078',
                            '9600048390403',
                            '1533405067830'))
  AND (`c`.`car_text` LIKE '%europe%'
       OR `b`.`bike_text` LIKE '%europe%'
       OR `tr`.`truck_text` LIKE '%europe%')
GROUP BY `t`.`id`
ORDER BY `t`.`date_last_activity` ASC,
         `t`.`id` ASC
LIMIT 15

This is the result of EXPLAIN for above query:

+----+-------------+-------+--------+-----------------------------------------------------------------------+-------------+---------+-----------------------------+---------+----------------------------------------------+
| id | select_type | table |  type  |                             possible_keys                             |     key     | key_len |             ref             |  rows   |                    extra                     |
+----+-------------+-------+--------+-----------------------------------------------------------------------+-------------+---------+-----------------------------+---------+----------------------------------------------+
|  1 | SIMPLE      | t     | index  | PRIMARY,manufacture,manufacture_date_last_activity,date_last_activity | PRIMARY     |       4 | NULL                        | 400,000 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | ti    | ref    | tracking_id,tracking_object_id,tracking_id_tracking_object_id         | tracking_id |       4 | table.t.id                  |       1 | NULL                                         |
|  1 | SIMPLE      | c     | eq_ref | car_id                                                                | car_id      |     767 | table.ti.tracking_object_id |       1 | Using where                                  |
|  1 | SIMPLE      | b     | eq_ref | bike_id                                                               | bike_id     |     767 | table.ti.tracking_object_id |       1 | Using where                                  |
|  1 | SIMPLE      | t     | eq_ref | truck_id                                                              | truck_id    |     767 | table.ti.tracking_object_id |       1 | Using where                                  |
+----+-------------+-------+--------+-----------------------------------------------------------------------+-------------+---------+-----------------------------+---------+----------------------------------------------+

What is the problem this query is trying to solve?

Basically, I need to find all records in the tracking table that may be associated with records in tracking_items (1:n), where each record in tracking_items may be associated with a record in one of the left-joined tables. The filtering criteria are a crucial part of the query.

What is the problem I have with the query above?

When both the ORDER BY and GROUP BY clauses are present, the query runs extremely slow, e.g. 10-15 seconds to complete for the above configuration. However, if I omit either of these clauses, the query runs pretty quickly (~0.2 seconds).

What I've already tried?

  1. I've tried to use a FULLTEXT index, but it didn't help much, as the results evaluated by the LIKE statement are already narrowed down by the JOINs using indexes.
  2. I've tried to use WHERE EXISTS (...) to check whether there are matching records in the left-joined tables, but unfortunately without any luck.

Few notes about relations between these tables:

tracking -> tracking_items (1:n)
tracking_items -> cars (1:1)
tracking_items -> bikes (1:1)
tracking_items -> trucks (1:1)

So, I'm looking for a way to optimize that query.

Miner answered 14/8, 2018 at 16:33 Comment(11)
LIKE '%John%' - that can't make use of indexes (search pattern starting with a %). Look into full-text-search instead of using likeDina
"tracking_items -> cars (1:1)" -- can't be. There are far more cars than tracking, which seems backwards.Trackless
What are the PRIMARY KEYs? I may need to see SHOW CREATE TABLE to help you.Trackless
Do your keys really need to be VARCHAR(255) CHARACTER SET utf8?Trackless
What MySQL version are you using? Is the "europe" a fixed part of the search or does that change? Is the '250_d1', '350_d2', '350_d3' a fixed part of the query? Do these three map to cars, bikes and trucks on a 1:1 basis (and if so, how)? It's a bit odd both tracking and trucks being aliased to t but seems to have coped.Mazda
@RickJames, PRIMARY KEY in all tables is the id column. Unfortunately, I need the keys to be VARCHAR(255), because it's a legacy system and I can't change that. I'll provide the result of the CREATE TABLE statement. @danblack, I'm using MySQL 5.6, these are not a fixed part of the query (I've copied them wrongly, but I've changed them now); instead they're generated in a way to display only those records from t that the current user has access to.Miner
Another interesting fact is that if I skip the joining to cars (the biggest table), the query runs much better, however it's still slow.Miner
Is tracking_type also varchar? Please provide SHOW CREATE TABLE for each table.Trackless
Yeah, removing cars will speed things up. Every run needs to scan all of cars, because of `car_text` LIKE '%europe%'.Trackless
You probably continued to use OR when you tried FULLTEXT. See my Answer.Trackless
@kav do the car_text, bike_text, truck_text columns contain 'europe' in the middle, or do they start with 'europe'? Sample data from the tables would be helpful.Dark

Bill Karwin suggests the query might perform better if it used an index with a leading column of manufacture. I second that suggestion, especially if that column is very selective.

I also note that we're doing a GROUP BY t.id, where id is the PRIMARY KEY of the table.

No columns from any tables other than tracking are referenced in the SELECT list.

This suggests we're really only interested in returning rows from t, and not on creating duplicates due to multiple outer joins.

Seems like the COUNT() aggregate has the potential to return an inflated count, if there are multiple matching rows in tracking_items and cars, bikes, trucks. If there are three matching rows from cars, and four matching rows from bikes, ... the COUNT() aggregate is going to return a value of 12, rather than 7. (Or maybe there is some guarantee in the data such that there won't ever be multiple matching rows.)

If manufacture is very selective, so that it returns a reasonably small set of rows from tracking, and if the query can make use of an index ...

And since we aren't returning any columns from any table other than tracking, apart from a count of related items ...

I would be tempted to test correlated subqueries in the SELECT list, to get the count, and filter out the zero count rows using a HAVING clause.

Something like this:

SELECT SQL_NO_CACHE `t`.*
     , ( ( SELECT COUNT(1)
             FROM `tracking_items` `tic`
             JOIN `cars` `c`
               ON `c`.`car_id`           = `tic`.`tracking_object_id`
              AND `c`.`car_text`      LIKE '%europe%'
            WHERE `tic`.`tracking_id`    = `t`.`id`
              AND `tic`.`tracking_type`  = 1
         )
       + ( SELECT COUNT(1)
             FROM `tracking_items` `tib`
             JOIN `bikes` `b`
               ON `b`.`bike_id`          = `tib`.`tracking_object_id` 
              AND `b`.`bike_text`     LIKE '%europe%'
            WHERE `tib`.`tracking_id`    = `t`.`id`
              AND `tib`.`tracking_type`  = 2
         )
       + ( SELECT COUNT(1)
             FROM `tracking_items` `tit`
             JOIN `trucks` `tr`
               ON `tr`.`truck_id`        = `tit`.`tracking_object_id`
              AND `tr`.`truck_text`   LIKE '%europe%'
            WHERE `tit`.`tracking_id`    = `t`.`id`
              AND `tit`.`tracking_type`  = 3
         ) 
       ) AS cnt_filtered_items
  FROM `tracking` `t`
 WHERE `t`.`manufacture` IN ('1256703406078', '9600048390403', '1533405067830')
HAVING cnt_filtered_items > 0
 ORDER
    BY `t`.`date_last_activity` ASC
     , `t`.`id` ASC

We'd expect that the query could make effective use of an index on tracking with leading column of manufacture.

And on the tracking_items table, we want an index with leading columns of tracking_id and tracking_type. Including tracking_object_id in that index as well would mean the query could be satisfied from the index, without visiting the underlying pages.

For the cars, bikes and trucks tables the query should make use of an index with leading column of car_id, bike_id, and truck_id respectively. There's no getting around a scan of the car_text, bike_text, truck_text columns for the matching string... the best we can do is narrow down the number of rows that need to have that check performed.

This approach (just the tracking table in the outer query) should eliminate the need for the GROUP BY, the work required to identify and collapse duplicate rows.

BUT this approach, replacing joins with correlated subqueries, is best suited to queries where there is a SMALL number of rows returned by the outer query. Those subqueries get executed for every row processed by the outer query. It's imperative that those subqueries to have suitable indexes available. Even with those tuned, there is still potential for horrible performance for large sets.

This does still leave us with a "Using filesort" operation for the ORDER BY.


If the count of related items should be the product of a multiplication, rather than addition, we could tweak the query to achieve that. (We'd have to muck with the return of zeros, and the condition in the HAVING clause would need to be changed.)

If there wasn't a requirement to return a COUNT() of related items, then I would be tempted to move the correlated subqueries from the SELECT list down into EXISTS predicates in the WHERE clause.
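
In that case the query might look roughly like this (untested sketch; the same indexes apply):

SELECT SQL_NO_CACHE `t`.*
  FROM `tracking` `t`
 WHERE `t`.`manufacture` IN ('1256703406078', '9600048390403', '1533405067830')
   AND ( EXISTS ( SELECT 1
                    FROM `tracking_items` `tic`
                    JOIN `cars` `c`
                      ON `c`.`car_id`           = `tic`.`tracking_object_id`
                     AND `c`.`car_text`      LIKE '%europe%'
                   WHERE `tic`.`tracking_id`    = `t`.`id`
                     AND `tic`.`tracking_type`  = 1
                )
      OR EXISTS ( SELECT 1
                    FROM `tracking_items` `tib`
                    JOIN `bikes` `b`
                      ON `b`.`bike_id`          = `tib`.`tracking_object_id`
                     AND `b`.`bike_text`     LIKE '%europe%'
                   WHERE `tib`.`tracking_id`    = `t`.`id`
                     AND `tib`.`tracking_type`  = 2
                )
      OR EXISTS ( SELECT 1
                    FROM `tracking_items` `tit`
                    JOIN `trucks` `tr`
                      ON `tr`.`truck_id`        = `tit`.`tracking_object_id`
                     AND `tr`.`truck_text`   LIKE '%europe%'
                   WHERE `tit`.`tracking_id`    = `t`.`id`
                     AND `tit`.`tracking_type`  = 3
                )
       )
 ORDER
    BY `t`.`date_last_activity` ASC
     , `t`.`id` ASC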


Additional notes: seconding the comments from Rick James regarding indexing... there appear to be redundant indexes defined, i.e.

KEY `manufacture` (`manufacture`)
KEY `manufacture_date_last_activity` (`manufacture`, `date_last_activity`)

The index on the singleton column isn't necessary, since there is another index that has the column as the leading column.

Any query that can make effective use of the manufacture index will be able to make effective use of the manufacture_date_last_activity index. That is to say, the manufacture index could be dropped.

The same applies for the tracking_items table, and these two indexes:

KEY `tracking_id` (`tracking_id`)
KEY `tracking_id_tracking_object_id` (`tracking_id`,`tracking_object_id`)

The tracking_id index could be dropped, since it's redundant.
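
If you do decide to drop them (and nothing references them by name, e.g. in index hints), that would look something like:

ALTER TABLE `tracking` DROP INDEX `manufacture`;
ALTER TABLE `tracking_items` DROP INDEX `tracking_id`;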

For the query above, I would suggest adding a covering index:

KEY `tracking_items_IX3` (`tracking_id`,`tracking_type`,`tracking_object_id`)

-or- at a minimum, a non-covering index with those two columns leading:

KEY `tracking_items_IX3` (`tracking_id`,`tracking_type`)
Octo answered 21/8, 2018 at 21:16 Comment(1)
Regarding "redundant" indices: Don't forget the implicit primary key part of an InnoDB index. With the primary key (id) - select * from t where a = 1 order by id limit 1 can utilize index(a) for ORDER BY but not index(a, b). This is because index(a) is in fact index(a, id). The same applies to many GROUP BY queries. So it's not correct to say "Any query that can make effective use of the manufacture index will be able to make effective use of the manufacture_date_last_activity index."Gift

The EXPLAIN shows you are doing an index-scan ("index" in the type column) on the tracking table. An index-scan is pretty much as costly as a table-scan, especially when the index scanned is the PRIMARY index.

The rows column also shows that this index-scan is examining > 355K rows (since this figure is only a rough estimate, it's in fact examining all 400K rows).

Do you have an index on t.manufacture? I see two indexes named in the possible keys that might include that column (I can't be sure solely based on the name of the index), but for some reason the optimizer isn't using them. Maybe the set of values you search for is matched by every row in the table anyway.

If the list of manufacture values is intended to match a subset of the table, then you might need to give a hint to the optimizer to make it use the best index. https://dev.mysql.com/doc/refman/5.6/en/index-hints.html
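
For example, the hint syntax looks roughly like this (shown on a cut-down query; the same FORCE INDEX clause would go right after `tracking` AS `t` in the full query, and you should verify with EXPLAIN that the plan actually changes):

SELECT `t`.`id`
FROM `tracking` AS `t` FORCE INDEX (`manufacture_date_last_activity`)
WHERE `t`.`manufacture` IN ('1256703406078', '9600048390403', '1533405067830')
ORDER BY `t`.`date_last_activity`, `t`.`id`;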

Using LIKE '%word%' pattern-matching can never utilize an index, and must evaluate the pattern-match on every row. See my presentation, Full Text Search Throwdown.

How many items are in your IN(...) list? MySQL sometimes has problems with very long lists. See https://dev.mysql.com/doc/refman/5.6/en/range-optimization.html#equality-range-optimization

P.S.: When you ask a query optimization question, you should always include the SHOW CREATE TABLE output for each table referenced in the query, so folks who answer don't have to guess at what indexes, data types, constraints you currently have.

Humbug answered 14/8, 2018 at 17:53 Comment(3)
Thanks! Actually, the query above (without the ORDER BY clause) runs pretty fast, so I'm wondering - is the LIKE evaluated over all rows of the three tables, or only on the result set after the JOIN? I'm pretty sure it's the second case, so maybe a FULLTEXT index won't help that much, as the indexes for the JOINs seem to be efficient. However, in most cases I won't have more than 5-10 items in the IN clause.Miner
The joins will be applied using indexes (the EXPLAIN shows that). The LIKE predicates apply only to the rows affected by the joins. So there could be thousands of rows looked up by the join, only to be eliminated because they don't match the LIKE patterns.Humbug
Thanks for your reply! After many tests, it turned out that the query is slowed down because of the combination of ORDER BY and GROUP BY. If I remove one of them, the query runs pretty quickly. I'll keep searching for how to optimize the query; I think I'll have to "rewrite" it in a way that avoids all these LEFT JOINs.Miner

First of all: your query makes assumptions about string contents, which it shouldn't. What may car_text like '%europe%' indicate? Something like 'Sold in Europe only' maybe? Or 'Sold outside Europe only'? Two possible strings with contradictory meanings. So if you assume a certain meaning once you find 'europe' in the string, then you should be able to introduce this knowledge into the database - with a Europe flag or a region code, for instance.

Anyway, you are showing certain trackings with their Europe transportation count. So select trackings, select transportation counts. You can either have the aggregation subquery for transportation counts in your SELECT clause or in your FROM clause.

Subquery in SELECT clause:

select
  t.*,
  (
    select count(*)
    from tracking_items ti
    where ti.tracking_id = t.id
    and (tracking_type, tracking_object_id) in
    (
      select 1, car_id from cars where car_text like '%europe%'
      union all
      select 2, bike_id from bikes where bike_text like '%europe%'
      union all
      select 3, truck_id from trucks where truck_text like '%europe%'
    )
  ) as cnt_filtered_items
from tracking t
where manufacture in ('1256703406078', '9600048390403', '1533405067830')
order by date_last_activity, id;

Subquery in FROM clause:

select
  t.*, agg.total
from tracking t
left join
(
  select tracking_id, count(*) as total
  from tracking_items ti
  where (tracking_type, tracking_object_id) in
  (
    select 1, car_id from cars where car_text like '%europe%'
    union all
    select 2, bike_id from bikes where bike_text like '%europe%'
    union all
    select 3, truck_id from trucks where truck_text like '%europe%'
  )
  group by tracking_id
) agg on agg.tracking_id = t.id
where manufacture in ('1256703406078', '9600048390403', '1533405067830')
order by date_last_activity, id;

Indexes:

  • tracking(manufacture, date_last_activity, id)
  • tracking_items(tracking_id, tracking_type, tracking_object_id)
  • cars(car_text, car_id)
  • bikes(bike_text, bike_id)
  • trucks(truck_text, truck_id)
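
One caveat on the last three: car_text is a TEXT column (and bike_text/truck_text presumably are too), so MySQL will only accept it in an index with an explicit prefix length, for example (prefix length chosen arbitrarily):

ALTER TABLE cars ADD INDEX cars_text_id (car_text(100), car_id);

and a prefix index cannot act as a covering index for the LIKE '%europe%' scan, so its benefit here is questionable.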

Sometimes MySQL is stronger on simple joins than on anything else, so it may be worth a try to blindly join transportation records and only later see whether it's car, bike or truck:

select
  t.*, agg.total
from tracking t
left join
(
  select
    tracking_id,
    sum((ti.tracking_type = 1 and c.car_text like '%europe%')
        or
        (ti.tracking_type = 2 and b.bike_text like '%europe%')
        or
        (ti.tracking_type = 3 and tr.truck_text like '%europe%')
       ) as total
  from tracking_items ti
  left join cars c on c.car_id = ti.tracking_object_id
  left join bikes b on b.bike_id = ti.tracking_object_id
  left join trucks tr on tr.truck_id = ti.tracking_object_id
  group by tracking_id
) agg on agg.tracking_id = t.id
where manufacture in ('1256703406078', '9600048390403', '1533405067830')
order by date_last_activity, id;
Giddy answered 21/8, 2018 at 7:56 Comment(4)
Thank you! I have a question re. the first example (subquery in FROM clause). How will MySQL use the appropriate indexes for reading records from the cars table? IMO, it will do a full table scan?Miner
Well, that depends on what the optimizer considers appropriate. Yes, it is very likely the DBMS just reads the table sequentially, which is not per se bad; it may be the fastest approach. The DBMS may use the index, as it contains all the information, but it would have to read the whole index. That only makes sense when the table in question is very wide, i.e. reading the table would mean reading loads of physical blocks from disk. That is rarely the case, so a full table scan is much more likely.Giddy
It's quite interesting for me why the optimizer is executing the query slow when there's GROUP BY clause. However, if I omit that part from the query, it runs really fast in both sorting directions, joining to all tables, etc. Any thoughts on that?Miner
No, not really. GROUP BY is costly; the DBMS must sort all data in order to then recognize rows of the same group. If you then sort again, this becomes even more costly. If there is only one sort operation to do, the DBMS may be able to make some execution plan where it collects the result rows in the final order. I don't know if this is true for your query, though.Giddy

If my guess is correct and cars, bikes, and trucks are independent from each other (i.e. a particular pre-aggregate result would only have data from one of them). You might be better off UNIONing three simpler sub-queries (one for each).

While you cannot do much index-wise about LIKEs involving leading wildcards, splitting the query into UNIONed sub-queries could avoid evaluating the truck conditions for all the cars and bikes matches, the c conditions for all the b and tr matches, and so on (see the sketch below).
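
Roughly (untested sketch, assuming each tracking_items row matches at most one of the three tables, per the 1:1 relations described in the question):

select t2.*, sum(u.cnt) as cnt_filtered_items
from (
    select t.id, count(*) as cnt
    from tracking t
    join tracking_items ti on ti.tracking_id = t.id and ti.tracking_type = 1
    join cars c on c.car_id = ti.tracking_object_id
    where t.manufacture in ('1256703406078', '9600048390403', '1533405067830')
      and c.car_text like '%europe%'
    group by t.id
  union all
    select t.id, count(*)
    from tracking t
    join tracking_items ti on ti.tracking_id = t.id and ti.tracking_type = 2
    join bikes b on b.bike_id = ti.tracking_object_id
    where t.manufacture in ('1256703406078', '9600048390403', '1533405067830')
      and b.bike_text like '%europe%'
    group by t.id
  union all
    select t.id, count(*)
    from tracking t
    join tracking_items ti on ti.tracking_id = t.id and ti.tracking_type = 3
    join trucks tr on tr.truck_id = ti.tracking_object_id
    where t.manufacture in ('1256703406078', '9600048390403', '1533405067830')
      and tr.truck_text like '%europe%'
    group by t.id
) u
join tracking t2 on t2.id = u.id
group by t2.id
order by t2.date_last_activity, t2.id
limit 15;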

Inexpugnable answered 14/8, 2018 at 18:17 Comment(2)
Thanks! I've tried this, but the query is still slow. However, I've changed the original question a bit (added more details) and started a bounty. Hopefully will find a quick query to satisfy my DB needs :)Miner
@kav - Uueerdo has several good points; let's see the query you used to implement them. Maybe you missed a step.Trackless
ALTER TABLE cars ADD FULLTEXT(car_text)

then try

select  sql_no_cache
        `t`.*,  -- If you are not using all, spell out the list
        count(`t`.`id`) as `cnt_filtered_items`  -- This does not make sense
                         -- and is possibly delivering an inflated value
    from  `tracking` as `t`
    inner join  `tracking_items` as `ti`  ON (`ti`.`tracking_id` = `t`.`id`)
    join   -- not LEFT JOIN
         `cars` as `c`  ON `c`.`car_id` = `ti`.`tracking_object_id`
                                     AND  `ti`.`tracking_type` = 1 
    where  `t`.`manufacture` in('1256703406078', '9600048390403', '1533405067830')
      AND  MATCH(c.car_text)  AGAINST('+europe' IN BOOLEAN MODE)
    group by  `t`.`id`    -- I don't know if this is necessary
    order by  `t`.`date_last_activity` asc, `t`.`id` asc
    limit  15;

to see if it will correctly give you a suitable 15 cars.

If that looks OK, then combine the three together:

SELECT  sql_no_cache
        t2.*
        -- , COUNT(*)  -- this is probably broken
    FROM (
        ( SELECT t.id FROM ... cars ... )  -- the query above
        UNION ALL     -- unless you need UNION DISTINCT
        ( SELECT t.id FROM ... bikes ... )
        UNION ALL
        ( SELECT t.id FROM ... trucks ... )
         ) AS u
    JOIN tracking AS t2  ON t2.id = u.id
    ORDER BY t2.date_last_activity, t2.id
    LIMIT 15;

Note that the inner SELECTs only deliver t.id, not t.*.
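
Presumably the bikes and trucks SELECTs would need their own FULLTEXT indexes (and the corresponding MATCH ... AGAINST predicates) along the same lines:

ALTER TABLE bikes ADD FULLTEXT(bike_text);
ALTER TABLE trucks ADD FULLTEXT(truck_text);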

Another index needed:

ti:  (tracking_type, tracking_object_id)   -- in either order
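
Spelled out as DDL, that would be something like (the index name is arbitrary):

ALTER TABLE tracking_items ADD INDEX type_object (tracking_type, tracking_object_id);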

Indexes

When you have INDEX(a,b), you don't also need INDEX(a). (This won't help the query in question, but it will help disk space and INSERT performance.)

When I see PRIMARY KEY(id), UNIQUE(x), I look for any good reason not to get rid of id and change to PRIMARY KEY(x). Unless there is something significant in the 'simplification' of the schema, such a change would help. Yeah, car_id is bulky, etc, but it is a big table and the extra lookup (from index BTree to data BTree) is hurting, etc.
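
A rough sketch of that change, assuming nothing else depends on cars.id and you can afford the rebuild of this very large table:

ALTER TABLE cars
  DROP COLUMN id,
  DROP INDEX car_id,
  ADD PRIMARY KEY (car_id);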

I think it is very unlikely that KEY `sort_field` (`date_order`) will ever be used. Either drop it (saving a few GB) or combine it in some useful way. Let's see the query in which you think it might be useful. (Again, a suggestion that is not directly relevant to this Question.)

re Comment(s)

I made some substantive changes to my formulation.

My formulation has 4 GROUP BYs, 3 in the 'derived' table (ie, FROM ( ... UNION ... )), and one outside. Since the outer part is limited to 3*15 rows, I do not worry about performance there.

Further note that the derived table delivers only t.id, then re-probes tracking to get the other columns. This lets the derived table run much faster, but at a small expense of the extra JOIN outside.

Please elaborate on the intent of the COUNT(t.id); it won't work in my formulation, and I don't know what it is counting.

I had to get rid of the ORs; they are the secondary performance killer. (The first killer is LIKE '%...'.)

Trackless answered 21/8, 2018 at 6:23 Comment(2)
Thanks! I've added the SHOW CREATE TABLE ... to the question. I need the GROUP BY because, at the end, I need to display only records from the tracking table. I know that's the "expand-collapse" (d)effect, but I can't think of a better approach to find records in tracking matching a condition in the c, b or tr tables. I'll give it a try.Miner
@kav - I added to my Answer.Trackless

When both the ORDER BY and GROUP BY clauses are present, the query runs extremely slow, e.g. 10-15 seconds to complete for the above configuration. However, if I omit either of these clauses, the query runs pretty quickly (~0.2 seconds).

This is interesting... generally the best optimization technique I know is to make good use of temporary tables, and it sounds like it will work really well here. So you would first create the temporary table:

create temporary table tracking_ungrouped (
    key (id)
)
select sql_no_cache `t`.*
from `tracking` as `t` 
inner join `tracking_items` as `ti` on (`ti`.`tracking_id` = `t`.`id`)
    left join `cars` as `c` on (`c`.`car_id` = `ti`.`tracking_object_id` AND `ti`.`tracking_type` = 1)
    left join `bikes` as `b` on (`b`.`bike_id` = `ti`.`tracking_object_id` AND `ti`.`tracking_type` = 2)    
    left join `trucks` as `tr` on (`tr`.`truck_id` = `ti`.`tracking_object_id` AND `ti`.`tracking_type` = 3)
where 
    (`t`.`manufacture` in('1256703406078', '9600048390403', '1533405067830')) and 
    (`c`.`car_text` like '%europe%' or `b`.`bike_text` like '%europe%' or `tr`.`truck_text` like '%europe%');

and then query it for the results you need:

select t.*, count(`t`.`id`) as `cnt_filtered_items`
from tracking_ungrouped t
group by `t`.`id` 
order by `t`.`date_last_activity` asc, `t`.`id` asc 
limit 15;
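
Note that the TEMPORARY table only exists for the current connection, so when re-testing this in the same session you would want to drop it between runs:

DROP TEMPORARY TABLE IF EXISTS tracking_ungrouped;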
Crusted answered 21/8, 2018 at 7:30 Comment(0)
SELECT t.*
FROM (SELECT * FROM tracking WHERE manufacture 
                IN('1256703406078','9600048390403','1533405067830')) t
INNER JOIN (SELECT tracking_id, tracking_object_id, tracking_type FROM tracking_items
    WHERE tracking_type IN (1,2,3)) ti 
    ON (ti.tracking_id = t.id)
LEFT JOIN (SELECT car_id FROM cars WHERE car_text LIKE '%europe%') c 
ON (c.car_id = ti.tracking_object_id AND ti.tracking_type = 1)
    LEFT JOIN (SELECT bike_id FROM bikes WHERE bike_text LIKE '%europe%') b 
ON (b.bike_id = ti.tracking_object_id AND ti.tracking_type = 2)
    LEFT JOIN (SELECT truck_id FROM trucks WHERE truck_text LIKE '%europe%') tr 
ON (tr.truck_id = ti.tracking_object_id AND ti.tracking_type = 3)
    ORDER BY t.date_last_activity ASC, t.id ASC

The subqueries perform faster when it comes to the joins, provided they filter out a lot of records.

The subquery on the tracking table filters out a lot of unwanted manufacture values and results in a smaller table t to be joined.

Similarly, the condition on the tracking_items table (we are interested only in tracking_type values 1, 2 and 3) creates a smaller table ti. If there are a lot of tracking objects, you can even add the tracking-object filter to this subquery.

Similar filters on the cars, bikes and trucks tables, requiring their respective text columns to contain 'europe', give us smaller tables c, b and tr respectively.

I also removed the GROUP BY t.id, since t.id is unique and we are performing the inner join and left joins on it (or the resulting tables), so there is no need for it.

Lastly, I am selecting only the required columns from each of the tables, which also reduces memory usage and runtime.

Hope this helps. Please let me know your feedback and run statistics.

Dark answered 23/8, 2018 at 19:42 Comment(0)

I'm not sure it will work, but how about applying the filter on each table (cars, bikes, and trucks) in the ON clause, before joining? It should filter out rows earlier.
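
For illustration, a rough (untested) sketch of that idea - since the joins stay LEFT JOINs, the WHERE still needs an "at least one of the three matched" test:

SELECT t.*, COUNT(t.id) AS cnt_filtered_items
FROM tracking AS t
INNER JOIN tracking_items AS ti ON (ti.tracking_id = t.id)
LEFT JOIN cars AS c ON (c.car_id = ti.tracking_object_id AND ti.tracking_type = 1 AND c.car_text LIKE '%europe%')
LEFT JOIN bikes AS b ON (b.bike_id = ti.tracking_object_id AND ti.tracking_type = 2 AND b.bike_text LIKE '%europe%')
LEFT JOIN trucks AS tr ON (tr.truck_id = ti.tracking_object_id AND ti.tracking_type = 3 AND tr.truck_text LIKE '%europe%')
WHERE t.manufacture IN ('1256703406078', '9600048390403', '1533405067830')
  AND (c.car_id IS NOT NULL OR b.bike_id IS NOT NULL OR tr.truck_id IS NOT NULL)
GROUP BY t.id
ORDER BY t.date_last_activity ASC, t.id ASC
LIMIT 15;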

Sandlin answered 14/8, 2018 at 18:50 Comment(0)
