Optimizing sequential scan in PostgreSQL
Asked Answered
M

0

2

PostgreSQL 9.4

In the article about hardware optimization for PostgreSQL server was shown that too much moving disk head may cause performance bottleneck. As far as I got, for the sequential scan, it'd be much faster if the cylinders with blocks of the same table are located as close as possible. So, if we're inserting data into a table, the data will be placed in "sequential order" on a disk-storage, therefore reducing the disk head moving overhead.

But I presume that such location may be break in time. Is there a way for postgreSQL to maintain such "clustering" (if I can say so)?

Mesosphere answered 5/10, 2015 at 7:22 Comment(7)
You can use the cluster command. But that won' t "maintain" the ordering. But any real world DB server will have an several hard disks (RAID) (or even SSDs) and in that case you can't control the ordering of the rows anyway.Lenoir
And I guess that if you're using SSD then the seek time is largely irrelevant. Maybe that's the answer.Advertent
@a_horse_with_no_name I still don't see why such operation doesn't make sense. If a DB has more than 1 hard disks, why isn't it useful to "maintain" physical ordering of the table's data on each drive?Mesosphere
@a_horse_with_no_name Is there a real-world case where we should take the physical ordering into account. Or we can permanently forget about it at all, as I was told thereMesosphere
Because you have no control over where the RAID controller will put the blocks, so they might not be sequential after all. And what exactly does "sequential" mean if you are reading data from say 10 hard disks at the same time? Unless you need to tune some low-end single hard disk server for a very special use case I would simply forget about this.Lenoir
@a_horse_with_no_name And what exactly does "sequential" mean But each of the hard disks has its own head to read the data. So I'd say, sequentially means requireing the least possible time for the disk's head to move from one cylinders to another containing the table's data on each disk.Mesosphere
@a_horse_with_no_name Unless you need to tune some low-end single hard disk server for a very special use case I would simply forget about this. Got it, thank you.Mesosphere

© 2022 - 2024 — McMap. All rights reserved.