PostgreSQL query not using INDEX when RLS (Row Level Security) is enabled
Asked Answered
A

2

8

I am using PostgreSQL 10.1, going right to the point...

Lets say I have a TABLE:

CREATE TABLE public.document (
    id uuid PRIMARY KEY,

    title   text,
    content text NOT NULL
);

Together with a GIN INDEX on it:

CREATE INDEX document_idx ON public.document USING GIN(
    to_tsvector(
        'english',
        content || ' ' || COALESCE(title, '')
    )
);

And a basic fulltext search query:

SELECT * FROM public.document WHERE (
    to_tsvector(
        'english',
        content || ' ' || COALESCE(title, '')
    ) @@ plainto_tsquery('english', fulltext_search_documents.search_text)
)

Regardless of the public.document table size, the query is (you already know it) hella fast! The planner uses the INDEX and everything works out great.

Now I introduce some basic access control through RLS (Row Level Security), firstly I enable it:

ALTER TABLE public.document ENABLE ROW LEVEL SECURITY;

and then I add the policy:

CREATE POLICY document_policy ON public.document FOR SELECT
    USING (EXISTS (
        SELECT 1 FROM public.user WHERE (is_current_user) AND ('r' = ANY(privileges))
    ));

To keep things simple the is_current_user is another query which checks exactly that.

Now the fulltext search query is flattened with document_policy query and by doing so the planner executes a Seq Scan instead of Index Scan resulting in a 300x slower query!

I think the question is pretty obvious, how can I fix this so that the fulltext search query stays fast?

Thanks in advance!

Annitaanniversary answered 12/1, 2018 at 16:44 Comment(1)
I think this is a know limitation of RLS.Mainstream
A
11

I have solved this from the time of posting... Anyone facing this issue, this is how I did it:

My solution was to have a private SECURITY DEFINER "wrapper" function containing the propper query and another public function which calls the private one and INNER JOINS the table which requires access control.

So in the specific case above, it would be something like this:

CREATE FUNCTION private.filter_document() RETURNS SETOF public.document AS
$$
    SELECT * FROM public.document WHERE (
        to_tsvector(
            'english',
            content || ' ' || COALESCE(title, '')
        ) @@ plainto_tsquery('english', fulltext_search_documents.search_text)
    )
$$
LANGUAGE SQL STABLE SECURITY DEFINER;
----
CREATE FUNCTION public.filter_document() RETURNS SETOF public.document AS
$$
    SELECT filtered_d.* FROM private.filter_documents() AS filtered_d
        INNER JOIN public.document AS d ON (d.id = filtered_d.id)
$$
LANGUAGE SQL STABLE;

Since I was using Postgraphile (which is super awesome BTW!), I was able to omit introspection of the private schema, making the "dangerous" function inaccessible! With proper security implementations, the end-user will only see the final GraphQL schema, complately removing Postgres from the outside world.

This worked beautifly! Until recently when Postgres 10.3 was released and fixed it, dropping the need for this hack.

On the other hand, my RLS policies are very complex, nested and go really deep. The tables which they are run agains are also quite large (roughly 50,000+ entries to run RLS against in total). Even with super complex and nested policies, I managed to maintain the performance within reasonable boundries.

When working with RLS, keep in mind the following:

  1. Create proper INDEXES
  2. Prefer inline queries everywhere! (Even if that means rewriting the same query N times)
  3. Avoid functions in policies by all means! If you absolutely must have them inside, make sure that they are STABLE and have a high COST (like @mkurtz pointed out); or are IMMUTABLE
  4. Extract the query from the policy, run it directly with EXPLAIN ANALYZE and try optimizing it as much as possible

Hope you guys find the information helpful as much as I did!

Annitaanniversary answered 7/5, 2018 at 15:27 Comment(0)
A
1

Try following: Instead of writing the query into the USING(...) clause put the query into a STABLE function with a very high cost. By doing so the function should not be called very often now - ideally only once per query lifetime, because the cost of calling the function seems now very high to Postgres. Marking the function as STABLE tells Postgres that the result of the function doesn't change during a single query lifetime. I think that is correct for your query, isn't it? Read more about about this two parameters here.

Like this:

CREATE OR REPLACE FUNCTION check_permission () RETURNS BOOLEAN AS $$
    SELECT EXISTS (
        SELECT 1 FROM public.user WHERE (is_current_user) AND ('r' = ANY(privileges))
    )
$$ LANGUAGE SQL STABLE COST 100000;

And the policy now:

CREATE POLICY document_policy ON public.document FOR SELECT
    USING (check_permission());

Hopefully this will give you better performance. But be aware, this only works correctly if it is OK to mark the function as STABLE. If your function could return different results during a single query lifetime then this wouldn't work correctly and you would end up with weird results.

Airline answered 7/5, 2018 at 11:24 Comment(5)
Thanks for the reply @mkurz! I knew about the COST "hack", which is fine when you need small portions for the policies. However (as of Postgres 10.3+ at least) in the example above, it is again more effective to use the inline query (regarding this specific case). And yeah, I need my policies to be super dynamic, so a single high cost function wont work.Annitaanniversary
But on the other hand, I have solved this from the time of posting. My solution was to have a private SECURITY DEFINER "wrapper" function containing the query and another public function which calls the private one and INNER JOINS the table which require access control.Annitaanniversary
@Annitaanniversary Interesting! Could you update your question with your solution including sql snippets (or at another answer)? I am also interested in row level security for our next application and find your approach interesting. Thanks!Airline
Hey @mkurz, you can find an explanation and an example in my answer. Hope it helps you!Annitaanniversary
@Annitaanniversary Great! Thank you very much - I appreciate that you took the time to write this down :)Airline

© 2022 - 2024 — McMap. All rights reserved.