Are there any negative performance or functionality downsides to using pg_upgrade with --link option afterwards?

I'm about upgrade a quite large PostgreSQL cluster from 9.3 to 11.

The upgrade

The cluster is approximately 1,2Tb in size. The database has a disk system consisting of a fast HW RAID 10 array of 8 DC-edition SSDs with 192GB ram and 64 cores. I am performing the upgrade by replicating the data to a new server with streaming replication first, then upgrading that one to 11.

I tested the upgrade using pg_upgrade with the --link option, this takes less than a minute. I also tested the upgrade regularly (without --link) with many jobs, that takes several hours (+4).

Questions

Now the obvious choice is of cause for me to use the --link option, however all this makes me wonder - is there any downsides (performance or functionality wise) to using that over the regular slower method? I do not know the internal workings of postgresql data structures, but I have a feeling there could be a performance difference after the upgrade between rewriting the data entirely and to just using hard links - whatever that means?

Considerations

The only thing I can find in the documentation about the drawbacks of --link is the downside of not being able to access the old data directory after the upgrade is performed https://www.postgresql.org/docs/11/pgupgrade.htm However that is only a safety concern and not a performance drawback and doesn't really apply in my case of replicating the data first. The only other thing I can think of is reclaiming space, with whatever performance upsides that might have. However as I understand it, that can also be achieved by running a VACUUM FULL DATABASE (or CLUSTER?) command after the --link-upgraded database has been upgraded? Also the reclaiming of space is not very impactful performance wise on an SSD as I understand.

I appreciate if anyone can help cast some light into this.

Recommended topics

Hot tags