Row Level Security in Postgres on Normalized Tables
Asked Answered
D

1

8

The premise

In documentation, Row Level Security seems great. Based on what I've read I can now stop creating views like this:

SELECT data.*
FROM data
JOIN user_data
ON data.id = user_data.data_id
AND user_data.role = CURRENT_ROLE

The great part is, Postgres has a great analysis for that view starting with an index scan then a hash join on the user_data table, exactly what we want to happen because it's crazy fast. Compare that with a my RLS implementation:

CREATE POLICY data_owner
ON data
FOR ALL
TO user
USING (
  (
    SELECT TRUE AS BOOL FROM (
      SELECT data_id FROM user_data WHERE user_role = CURRENT_USER
    ) AS user_data WHERE user_data.data_id = data.id
  ) = true
)
WITH CHECK (TRUE);

This bummer of a policy executes the condition for each row in the data table, instead of optimizing by scoping the query to the rows which our CURRENT_USER has access to, like our view does. To be clear, that means select * from data hits every row in the data table.

The question

How do I write a policy with an inner select which doesn't test said select on every row in the target table. Said another way: how do I get RLS to run my policy on the target table before running the actual query on the result?

p.s. I've left this question someone vague and fiddle-less, mostly because sqlfiddle hasn't hit 9.5 yet. Let me know if I need to add more color or some gists to get my question across.

Dialectal answered 3/3, 2017 at 5:51 Comment(1)
I stopped using SQLFiddle because it essentially never works for me. For Postgres I use: rextester.com/l/postgresql_online_compilerCutinize
F
7

PostgreSQL may be able to generate a better plan if you phrase the policy like this:

...
USING (EXISTS
          (SELECT data_id
           FROM user_data
           WHERE user_data.data_id = data.id
             AND role = current_user
          )
      )

You should have a (PRIMARY KEY?) index ON user_data (role, data_id) to speed up nested loop joins.

But I think that it would be a better design to include the permission information in the data table itself, perhaps using the name[] type:

CREATE TABLE data(
   id integer PRIMARY KEY,
   val text,
   acl name[] NOT NULL
);

INSERT INTO data VALUES (1, 'one',   ARRAY[name 'laurenz', name 'advpg']);
INSERT INTO data VALUES (2, 'two',   ARRAY[name 'advpg']);
INSERT INTO data VALUES (3, 'three', ARRAY[name 'laurenz']);

Then you can use a policy like this:

CREATE POLICY data_owner ON data FOR ALL TO PUBLIC
   USING (acl @> ARRAY[current_user::name])
   WITH CHECK (TRUE);
ALTER TABLE data ENABLE ROW LEVEL SECURITY;
ALTER TABLE data FORCE ROW LEVEL SECURITY;

When I SELECT, I get only the rows for which I have permission:

SELECT id, val FROM data;
 id |  val
----+-------
  1 | one
  3 | three
(2 rows)

You can define a GIN index to support that condition:

CREATE INDEX ON data USING gin (acl _name_ops);
Fourposter answered 3/3, 2017 at 9:13 Comment(2)
Is it appropriate to use name[] here? The Postgres docs indicate that designers should not use name[], as it is only for system catalogues. From the docs: "There are two other fixed-length character types in PostgreSQL, shown in Table 8-5. The name type exists only for the storage of identifiers in the internal system catalogs and is not intended for use by the general user." Link here: postgresql.org/docs/9.0/static/datatype-character.htmlDecrease
name should be fine, as only PostgreSQL identifiers are stored. But it is just as well to use text.Fourposter

© 2022 - 2024 — McMap. All rights reserved.