About clustered index in postgres
Asked Answered
V

6

79

I'm using psql to access a postgres database. When viewing the metadata of a table, is there any way to see whether an index of a table is a clustered index?

I heard that the PRIMARY KEY of a table is automatically associated with a clustered index, is it true?

Velocity answered 25/1, 2011 at 17:3 Comment(5)
PostgreSQL doesn't have a clustered index, but why do you think that would be intresting? Do you have a problem to solve?Seasickness
No particular problem to solve. Just learning about postgre :-)Velocity
i would've wondered why my tables didn't automatically defrag if it weren't for this questionCharinile
I think that is true for Microsoft SQL: That a primary key is automatically associated with a clustered index. But apparently not for PostgreSQL.Prissy
SQL Server Clustered Keys are not automatic, they have to be designated as that. Only one Clustered Primary Key per table.Erickaericksen
S
62

Note that PostgreSQL uses the term "clustered index" to use something vaguely similar and yet very different to SQL Server.

If a particular index has been nominated as the clustering index for a table, then psql's \d command will indicate the clustered index, e.g.,

Indexes:
    "timezone_description_pkey" PRIMARY KEY, btree (timezone) CLUSTER

PostgreSQL does not nominate indices as clustering indices by default. Nor does it automatically arrange table data to correlate with the clustered index even when so nominated: the CLUSTER command has to be used to reorganise the table data.

Sweeten answered 25/1, 2011 at 17:17 Comment(2)
more approriate to call 'reorganise table'Distilled
To be precise, Postgres does not use the term "clustered index" at all. There are "clustered tables" - right after CLUSTER has been run.Undercurrent
E
37

In PostgreSQL the clustered attribute is held in the metadata of the corresponding index, rather than the relation itself. It is the indisclustered attribute in pg_index catalogue. Note, however, that clustering relations within postgres is a one-time action: even if the attribute is true, updates to the table do not maintain the sorted nature of the data. To date, automatic maintenance of data clustering remains a popular TODO item.

There is often confusion between clustered and integrated indexes, particularly since the popular textbooks use conflicting names, and the terminology is different again in the manuals of postgres and SQL server (to name just two). When I talk about an integrated index (also called a main index or primary index) I mean one in which the relation data is contained in the leaves of the index, as opposed an external or secondary index in which the leaves contain index entries that point to the table records. The former type is necessarily always clustered. Unfortunately postgres only supports the latter type. Anyhow, the fact that an integrated (primary) index is always clustered may have given rise to the belief that "a PRIMARY KEY of a table is automatically associated with a clustered index". The two statements sound similar, but are different.

Eanes answered 17/10, 2011 at 5:0 Comment(2)
I believe SQL Server also makes primary keys use a clustered index by default unless you say otherwise, right? That may lead to confusion or lead to an assumption that the two necessarily mean the same thing.Prone
Yes, according to msdn.microsoft.com/en-us/library/ms174979.aspx "PRIMARY KEY constraints default to CLUSTERED, and UNIQUE constraints default to NONCLUSTERED."Eanes
K
35

PostgreSQL does not have direct implementation of CLUSTER index like Microsoft SQL Server.

Reference Taken from this Blog:

In PostgreSQL, we have one CLUSTER command which is similar to Cluster Index.

Once you create your table primary key or any other Index, you can execute the CLUSTER command by specifying that Index name to achieve the physical order of the Table Data.

When a table is clustered, it is physically reordered based on the index information. Clustering is a one-time operation: when the table is subsequently updated, the changes are not clustered. That is, no attempt is made to store new or updated rows according to their index order.

Syntax of Cluster:

First time you must execute CLUSTER using the Index Name.

CLUSTER table_name USING index_name;

Cluster the table:

Once you have executed CLUSTER with Index, next time you should execute only CLUSTER TABLE because It knows that which index already defined as CLUSTER.

CLUSTER table_name;
Kanarese answered 3/12, 2016 at 18:25 Comment(0)
S
27

is there any way to see whether an index of a table is a clustered index

PostgreSQL does not have a clustered index, so you won't be able to see them.

I heard that the PRIMARY KEY of a table is automatically associated with a clustered index, is it true?

No, that's not true (see above)

You can manually cluster a table along an index, but this is nothing that will be maintained automatically (as e.g. with SQL Server's clustered indexes).

For more details, see the description of the CLUSTER command in the manual.

Stentor answered 25/1, 2011 at 17:16 Comment(2)
Then can I use \d to tell whether a table is clustered along an index?Velocity
\dS+ should do that if I read the manual correctly. Just check out the help and play around with the optioins.Stentor
D
23

Cluster Indexing

A cluster index means telling the database to store the close values actually close to one another on the disk. They can uniquely identify the rows in the SQL table. Every table can have exactly one one clustered index. A cluster index can cover more than one column. By default, a column with a primary key already has a clustered index.

dictionary

A dictionary itself is a table with clustered index. Because all the data is physically stored in alphabetical order.


Non-Cluster Indexing

Non-clustered indexing is like simple indexing of a book. They are just used for fast retrieval of data. Not sure to have unique data. A non-clustered index contains the non-clustered index keys and their corresponding data location pointer. For example, a book's content index contains the key of a topic or chapter and the page location of that.

book content index

A book's content table holds the content name and its page location. It is not sure that the data is unique. Because same paragraph or text line or word can be placed many times.


PostgreSQL Indexing

PostgreSQL automatically creates indexes for PRIMARY KEY and every UNIQUE constraints of a table. Login to a database in PostgreSQL terminal and type \d table_name. All stored indexes will be visualized. If there is a clustered index then it will also be identified.

Creating a table

CREATE TABLE IF NOT EXISTS profile(
    uid serial NOT NULL UNIQUE PRIMARY KEY,
    username varchar(30) NOT NULL UNIQUE,
    phone varchar(11) NOT NULL UNIQUE,
    age smallint CHECK(age>12),
    address text NULL
);

3 index will be created automatically. All these indexes are non clustered

"profile_pkey" PRIMARY KEY, btree (uid)
"profile_phone_key" UNIQUE CONSTRAINT, btree (phone)
"profile_username_key" UNIQUE CONSTRAINT, btree (username)

Create our own index with uid and username

CREATE INDEX profile_index ON profile(uid, username);

This actually creates a non-clustered index. To make it clustered, run the next part.

Transform a non-clustered index into a clustered one

ALTER TABLE profile CLUSTER ON profile_index;

Check the table with \d profile. It will be like this:

                                     Table "public.profile"
  Column  |         Type          | Collation | Nullable |               Default
----------+-----------------------+-----------+----------+--------------------------------------
 uid      | integer               |           | not null | nextval('profile_uid_seq'::regclass)
 username | character varying(30) |           | not null |
 phone    | character varying(11) |           | not null |
 age      | smallint              |           |          |
 address  | text                  |           |          |
Indexes:
    "profile_pkey" PRIMARY KEY, btree (uid)
    "profile_phone_key" UNIQUE CONSTRAINT, btree (phone)
    "profile_username_key" UNIQUE CONSTRAINT, btree (username)
    "profile_index" btree (uid, username) CLUSTER
Check constraints:
    "profile_age_check" CHECK (age > 12)

Notice that the profile_index is now "CLUSTER"

Now, re-cluster the table so that the table can follow the cluster index role

CLUSTER profile;
Dipper answered 29/3, 2019 at 5:58 Comment(2)
This seems to conflate the different concepts of clustering in PostgreSQL and SQL Server.Sloane
Space wise , non clustered indexes take as much space as the table itself. the CLUSTER command is just a metadata, does not save space as seen in sqlserver.Arnaldo
E
6

If you want to know if a given table is CLUSTERed using SQL, you can use the following query to show the index being used (tested in Postgres versions 9.5 and 9.6):

SELECT
  i.relname AS index_for_cluster
FROM
  pg_index AS idx
JOIN
  pg_class AS i
ON
  i.oid = idx.indexrelid
WHERE
  idx.indisclustered
  AND idx.indrelid::regclass = 'your_table_name'::regclass;
Ezarras answered 24/2, 2017 at 15:32 Comment(2)
@DanielL.VanDenBosch, what was the error? I just rechecked it works for me (and clarified the postgres version in my answer).Ezarras
Great script which I can use to check if my primary key is clustered. This should be marked as correct anwser.Roshan

© 2022 - 2024 — McMap. All rights reserved.