Create a unique index on a non-unique column
Asked Answered
C

1

4

Not sure if this is possible in PostgreSQL 9.3+, but I'd like to create a unique index on a non-unique column. For a table like:

CREATE TABLE data (
  id SERIAL
  , day DATE
  , val NUMERIC
);
CREATE INDEX data_day_val_idx ON data (day, val); 

I'd like to be able to [quickly] query only the distinct days. I know I can use data_day_val_idx to help perform the distinct search, but it seems this adds extra overhead if the number of distinct values is substantially less than the number of rows in the index covers. In my case, about 1 in 30 days is distinct.

Is my only option to create a relational table to only track the unique entries? Thinking:

CREATE TABLE days (
  day DATE PRIMARY KEY
);

And update this with a trigger every time we insert into data.

Cowles answered 20/3, 2015 at 16:40 Comment(0)
S
5

An index can only index actual rows, not aggregated rows. So, yes, as far as the desired index goes, creating a table with unique values like you mentioned is your only option. Enforce referential integrity with a foreign key constraint from data.day to days.day. This might also be best for performance, depending on the complete situation.

However, since this is about performance, there is an alternative solution: you can use a recursive CTE to emulate a loose index scan:

WITH RECURSIVE cte AS (
   (  -- parentheses required
   SELECT day FROM data ORDER BY 1 LIMIT 1
   )
   UNION ALL
   SELECT (SELECT day FROM data WHERE day > c.day ORDER BY 1 LIMIT 1)
   FROM   cte  c
   WHERE  c.day IS NOT NULL  -- exit condition
   )
SELECT day FROM cte;

Parentheses around the first SELECT are required because of the attached ORDER BY and LIMIT clauses. See:

This only needs a plain index on day.

There are various variants, depending on your actual queries:

More in my answer to your follow-up querstion:

Stridor answered 20/3, 2015 at 17:59 Comment(8)
Going to need to benchmark this against the DISTINCT I'm already running and the referenced table mentioned but this is solid. Again, you da man!Cowles
@Justin: Would be interested in the results. Maybe you could drop a comment here later? Or even an answer with the details of your case ...Stridor
Having some trouble getting the recursive query to run properly.. not sure I fully understand how it's recursing. Does the cte recurse on the FROM call before or after joining with data? Posted the other two queries in my question.Cowles
@Justin: I suggest you put all of that in a new question (and revert the edit to the current question), to keep this clean. You can always link to this one for reference. The added WHERE condition is issue here ...Stridor
Fair enough. I think I need to read up on the recursive cte before posting another question. The spirit of this question was to make sure I wasn't missing something "easy" and you I think you've got pointed in the right direction for now. Thanks!Cowles
@Justin: Reading up is never wrong. I already have an idea or two for you, if you post the case as new question.Stridor
Posted #29178780Cowles
@Justin: Found a performance bug. Consider the update. More in the answer to your follow-up question.Stridor

© 2022 - 2024 — McMap. All rights reserved.