php-fpm7.1 mmap/munmap (very) slow performance on virtualized systems (hugepage)
Asked Answered
C

3

8

My php-fpm process is facing performance issues on Ubuntu 14.04 LTS (Nginx server, MariaDB database).

strace -f $(pidof php-fpm7.1 | sed 's/\([0-9]*\)/\-p \1/g')

Gave me

<... epoll_wait resumed> {}, 1, 1000) = 0
[pid 32533] epoll_wait(8, {}, 1, 103)   = 0
[pid 32533] epoll_wait(8,  <unfinished ...>
[pid 32535] mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd933fdd000
[pid 32535] munmap(0x7fd933fdd000, 2097152) = 0
[pid 32535] mmap(NULL, 4190208, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd933dde000
[pid 32535] munmap(0x7fd933dde000, 139264) = 0
[pid 32535] munmap(0x7fd934000000, 1953792) = 0
[pid 32535] madvise(0x7fd933e00000, 2097152, MADV_HUGEPAGE) = 0
[pid 32533] <... epoll_wait resumed> {}, 1, 897) = 0
[pid 32533] epoll_wait(8, {}, 1, 1000)  = 0
[pid 32533] epoll_wait(8, {}, 1, 1000)  = 0
[pid 32533] epoll_wait(8,  <unfinished ...>
[pid 32535] mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd933c00000
[pid 32535] madvise(0x7fd933c00000, 2097152, MADV_HUGEPAGE) = 0
[pid 32533] <... epoll_wait resumed> {}, 1, 1000) = 0
[pid 32533] epoll_wait(8, {}, 1, 1000)  = 0
[pid 32533] epoll_wait(8, {}, 1, 1000)  = 0
[pid 32533] epoll_wait(8, {}, 1, 1000)  = 0
[pid 32533] epoll_wait(8,  <unfinished ...>
[pid 32535] mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd933a00000
[pid 32535] madvise(0x7fd933a00000, 2097152, MADV_HUGEPAGE) = 0
[pid 32533] <... epoll_wait resumed> {}, 1, 1000) = 0
[pid 32533] epoll_wait(8,  <unfinished ...>
[pid 32535] open("/usr/share/zoneinfo/UTC", O_RDONLY) = 7
[pid 32535] fstat(7, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
[pid 32535] read(7, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 20) = 20
[pid 32535] lseek(7, 0, SEEK_SET)       = 0
[pid 32535] mmap(NULL, 118, PROT_READ, MAP_SHARED, 7, 0) = 0x7fd946835000
[pid 32535] close(7)                    = 0
[pid 32535] munmap(0x7fd946835000, 118) = 0
[pid 32535] pwrite(5, "_sf2_attributes|a:2:{s:14:\"_secu"..., 979, 0) = 979
[pid 32535] close(5)                    = 0
[pid 32535] mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd933200000
[pid 32535] madvise(0x7fd933200000, 2097152, MADV_HUGEPAGE) = 0

I tried with php-fpm7.0, PHPMod7.1 but same issues.

CPU is up to 100% on requests with a small amount of data.

Configurations are the default ones.

On a duplicated instance php5.6-fpm works well.

Edit: Possibly related PHP script keeps doing mmap/munmap

Edit: I tried to enable hugepages https://wiki.debian.org/Hugepages A cat /proc/meminfo | grep Huge gave me

AnonHugePages:    108544 kB
HugePages_Total:     512
HugePages_Free:      497
HugePages_Rsvd:       50
HugePages_Surp:        0
Hugepagesize:       2048 kB

but still the same issue.

Edit: I tried to enable/disable OPCache, also set opcache.huge_code_pages=0, no results. There is no documentation about hugepages on http://php.net/

Calvaria answered 10/7, 2017 at 13:42 Comment(8)
Now happens on Ubuntu 16.04 LTS.Calvaria
What is the hypervisor used?Eveleen
Do you know if PHP is calling mmap()/munmap() directly, or is it through malloc()/free()? If it's through malloc()/free(), you can try setting the M_MMAP_THRESHOLD environment variable to the equivalent of something like 4MB or 8MB so the process uses "normal" heap memory for the 2MB allocations you're seeing. See the mallopt() man page for details.Asbestosis
@WassimDhif ESXICalvaria
@AndrewHenle Thanks, I'll try it but I'm pretty sure it uses mmap/munmap() directly.Calvaria
Cf developers-club.com/posts/270685Calvaria
@EtienneGautier Be aware that developers-club.com/posts/270685 is pretty poor code. Among other problems, it casts a void * to an unsigned long to try aligning the resulting pointer. A void * doesn't necessarily fit into a long. The linked code will fail on 64-bit Windows, for example, hardly an uncommon platform.. See en.wikipedia.org/wiki/64-bit_computing#64-bit_data_modelsAsbestosis
You right, thanks for the information.Calvaria
B
3

I'm not 100% sure if we are facing the same issue here, but it's the closest thing I could find in my search of Stack Overflow for my issue.

My findings were on a virtual machine with the following specs.

CPUs: 16

RAM: 16GB

OS: Ubuntu 16.04.4 LTS

Containers: LXD with ZFS filesystem using block devices

PHP Version: 7.1

I'm running a few LXD containers running MariaDB, Nginx, and PHP-FPM 7.1. These were dev environments, so I was only traffic hitting the server. The application running was built with the Symfony framework.

I was noticing extremely slow page load times. I would consistently wait over a minute for the page to load in dev mode. I didn't spend as much time confirming this as well, but CLI scripts also felt quite slow. I tried tweaking all kinds of PHP settings (realpath caches, opcache on/off/various settings, etc) but nothing worked.

I could consistently strace one of the PHP-FPM processes and see one system call that appeared to be slow. All the other calls would breeze by, but it was getting stuck on the following call many times throughout the process's lifespan.

madvise(0x7f4dcca00000, 2097152, MADV_HUGEPAGE) = 0

Long story short, I was able to drastically change the performance of this application by disabling THP. I ran the following commands on the LXD host and the page load times changed like night and day.

sysctl -w vm.nr_hugepages=0
echo never > /sys/kernel/mm/transparent_hugepage/enabled

I know that Redis recommends disabling THP because of performance issues related to copy-on-write. I also know ZFS filesystems also does copy-on-write, so maybe this issue is related?

Basile answered 4/4, 2018 at 14:10 Comment(2)
I knew it! I couldn't be alone facing this problem, this is too big. Unfortunately I don't work on this environment anymore, I'm afraid I do not have time to reproduce this behavior. By the way the application was built with Symfony as well.Calvaria
Same here, ZFS, LXC, and Symfony. Also having a large database and memory-hungry java application working. Confirmed on Linux 4.9 and 4.19. it took like 1 week to show up after reboot.Chan
L
1

What is the value selected in brackets when you do

cat /sys/kernel/mm/transparent_hugepage/enabled 

Make sure that it is either madvise (as mmaps request Transparent Huge Pages explicitly by MADV_HUGEPAGE) or always.

The next step would be to see if your kernel is not too busy doing compaction of pages (THP has to find contiguous chunk of memory of 2MB for a huge page, which can be difficult if physical memory is fragmented). Have a look at numbers at:

cat /proc/vmstat |grep compact

Before and after your run. Did they grow?

Next step would be to catch the kernel part of the stack of your process:

cat /proc/YOUR_PROCESS_PID/stack

Other useful commands are :

cat /proc/buddyinfo

to see the physical memory fragmentation. Look for the last two columns. They are free chunks of memory in sizes of 2MB and 4MB. Another one :

cat /proc/pagetypeinfo

and look for unmovable page blocks.

Lorrinelorry answered 29/10, 2017 at 12:43 Comment(1)
Thanks for the reply, I cannot make real life tests yet but I will let you know when I can.Calvaria
S
0

Disabling transparent huge page support on the host worked for me:

echo never > /sys/kernel/mm/transparent_hugepage/enabled

In my case the problem happened when running composer in an LXD container on ZFS backend.

Swaddle answered 31/12, 2018 at 16:2 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.