Suppose I want to query a big table with a few WHERE
filters. I am using Postgres 11 and a foreign table; foreign data wrapper (FDW) is clickhouse_fdw
. But I am also interested in a general solution.
I can do so as follows:
SELECT id,c1,c2,c3 from big_table where id=3 and c1=2
My FDW is able to do the filtering on the remote foreign data source, ensuring that the above query is quick and doesn't pull down too much data.
The above works the same if I write:
SELECT id,c1,c2,c3 from big_table where id IN (3,4,5) and c1=2
I.e all of the filtering is sent downstream.
However, if the filtering I'm trying to do is slightly more complex:
SELECT bt.id,bt.c1,bt.c2,bt.c3
from big_table bt
join lookup_table l on bt.id=l.id
where c1=2 and l.x=5
then the query planner decides to filter on c1=2
remotely but apply the other filter locally.
In my use case, calculating which id
s have l.x=5
first and then sending those off to be filtered remotely will be much quicker, so I tried to write it the following way:
SELECT id,c1,c2,c3
from big_table
where c1=2
and id IN (select id from lookup_table where x=5)
However, the query planner still decides to perform the second filter locally on ALL of the results from big_table
that satisfy c1=2
, which is very slow.
Is there some way I can "force" (select id from lookup_table where x=5)
to be pre-calculated and sent as part of a remote filter?