Error in performance test with PostgreSQL and GlusterFS

Asked 20/10, 2017 at 15:11 Answered 2/8, 2020 at 16:0

I'm running a performance test with pgbench to evaluate the impact of using GlusterFS with PostgreSQL. I've created a replicated Gluster volume with 3 bricks/servers:

Volume Name: gv0
Type: Replicate
Volume ID: a7e617ec-c564-4a01-aec9-807e87fcccb3
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.112.76.37:/export/sdb1/brick
Brick2: 10.112.76.38:/export/sdb1/brick
Brick3: 10.112.76.39:/export/sdb1/brick
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

After that, I configured PostgreSQL to use the volume gv0. Everything works fine under low load. However, when the load is increased, the following error occurs:

client 14 aborted in state 9: ERROR:  unexpected data beyond EOF in block 0 of relation base/16384/16503
HINT:  This has been seen to occur with buggy kernels; consider updating your system.
client 7 aborted in state 9: ERROR:  unexpected data beyond EOF in block 0 of relation base/16384/16503
HINT:  This has been seen to occur with buggy kernels; consider updating your system.
client 5 aborted in state 9: ERROR:  unexpected data beyond EOF in block 0 of relation base/16384/16503
HINT:  This has been seen to occur with buggy kernels; consider updating your system.
client 6 aborted in state 9: ERROR:  unexpected data beyond EOF in block 0 of relation base/16384/16503
HINT:  This has been seen to occur with buggy kernels; consider updating your system.
client 8 aborted in state 9: ERROR:  unexpected data beyond EOF in block 0 of relation base/16384/16503
HINT:  This has been seen to occur with buggy kernels; consider updating your system.
client 0 aborted in state 9: ERROR:  unexpected data beyond EOF in block 0 of relation base/16384/16503
HINT:  This has been seen to occur with buggy kernels; consider updating your system.
client 11 aborted in state 9: ERROR:  unexpected data beyond EOF in block 0 of relation base/16384/16503
HINT:  This has been seen to occur with buggy kernels; consider updating your system.

Any idea of what's causing this?

Conquer answered 20/10, 2017 at 15:11 Comment(1)

Did you disable full_page_writes? – Babby 10/1, 2018 at 7:3

Gluster does not support "structured data", as stated in the GlusterFS Install Guide:

Gluster does not support so called “structured data”, meaning live, SQL databases. Of course, using Gluster to backup and restore the database would be fine - Gluster is traditionally better when using file sizes at of least 16KB (with a sweet spot around 128KB or so).

My guess would be that Gluster can just about keep up with the replication when the load is small, but struggles when the load is increased above a certain point, possibly leading to split-brain errors.

You can view files in split brain with the command gluster volume heal <volume_name> info split-brain, or gluster volume heal <volume_name> info for all the files that need healing.
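As a rough sketch, the two heal commands above could be wrapped in a small Python helper so they can be run repeatedly while pgbench is under load. The function names are hypothetical, and the script assumes the gluster CLI is installed on the node and that the caller has sufficient privileges. It returns the raw command output without trying to parse it.

```python
import subprocess


def heal_info_command(volume, split_brain_only=False):
    """Build the argv for `gluster volume heal <volume> info [split-brain]`."""
    cmd = ["gluster", "volume", "heal", volume, "info"]
    if split_brain_only:
        # Restrict the listing to files that are in split-brain.
        cmd.append("split-brain")
    return cmd


def run_heal_info(volume, split_brain_only=False):
    """Run the heal-info command and return its raw text output.

    Assumption: the gluster CLI is on PATH; raises CalledProcessError
    if the command exits with a non-zero status.
    """
    result = subprocess.run(
        heal_info_command(volume, split_brain_only),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For the volume in the question, `print(run_heal_info("gv0", split_brain_only=True))` would show the split-brain entries, if any.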

Placative answered 3/3, 2018 at 17:23 Comment(0)

It may be a bug in GlusterFS.

Also, Linux kernels before version 4.13 have an issue with detecting errors during fsync(), and PostgreSQL applied corresponding changes in recent versions: https://www.percona.com/blog/2019/02/22/postgresql-fsync-failure-fixed-minor-versions-released-feb-14-2019/

So it makes sense to recheck this issue on recent versions of the kernel, GlusterFS and PostgreSQL.
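A minimal sketch for checking whether a host's kernel is at least the 4.13 version mentioned above. The helper names are hypothetical, and the check only compares the major and minor numbers of the release string. It says nothing about distribution backports or about the GlusterFS and PostgreSQL versions.

```python
import platform
import re


def parse_kernel_version(release):
    """Extract (major, minor) from a release string such as '4.4.0-97-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release, minimum=(4, 13)):
    """Return True if the release is at or above the minimum (major, minor)."""
    return parse_kernel_version(release) >= minimum


def running_kernel_ok():
    """Check the kernel of the machine this script runs on."""
    return kernel_at_least(platform.release())
```

Running `running_kernel_ok()` on each of the three brick servers and on the PostgreSQL host would show which machines still run an older kernel.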

Holocaust answered 2/8, 2020 at 16:0 Comment(0)
