How would I check how busy the HDD is with PHP?
I've noticed that some cloud hosting solutions have really poor Disk IO. This causes a few problems that could be solved by having the script wait until the disk was less busy.

With PHP is it possible to monitor the busy (or not so busy) state of the filesystem without making things worse?

Polybasite answered 1/8, 2016 at 16:19 Comment(4)
Well, you can certainly run all sorts of system utilities and evaluate their output to get any information a human could read off them. However, I doubt that this really helps in the scenario you describe. The "hard disk" you see in a virtualized system is only simulated, so the utilities might show some information, but the question is how much truth is in it. The poor performance in such scenarios does not come from the system's hardware (it is virtual anyway) but from the whole networked cluster offering all the services, which is something you cannot control or predict. - Hallowmas
I'd say get a better provider or a better offer if you experience issues with your current solution. There are huge differences between providers. Often the less well-known providers offer much better performance than well-known companies. - Hallowmas
I should add that I am no longer on the project (and thankful not to be). The system had an ungodly lag on HDD reads. Caching to the HDD instead of the DB actually caused the connections to time out. It was the worst platform I ever worked with. I ended up storing config vars in the DB because it was faster to get them that way. - Polybasite
Well, of course it is faster to access values in a database than from a disk-based file system. At least from a server-based database, not something like SQLite. - Hallowmas
A
17

If this is a Linux system, you can calculate disk utilisation (how busy the disk is, not how full it is) yourself. Whichever language you implement it in, the concepts are the same.

Your kernel is most likely using sysfs, which makes a lot of information about your system available under /sys. We can read the statistics for the disk we care about at a regular interval and calculate utilisation from the differences between successive readings.

On my system I will be looking at the disk sda; yours may differ.

$ cat /sys/class/block/sda/stat
   42632       25  2045318   247192  6956543  7362278 123236256 23878974        0  3703033 24119492

Now if we look at the kernel documentation for /sys/class/block/<dev>/stat, we can see the following description of each column of the output.

Name            units         description
----            -----         -----------
read I/Os       requests      number of read I/Os processed
read merges     requests      number of read I/Os merged with in-queue I/O
read sectors    sectors       number of sectors read
read ticks      milliseconds  total wait time for read requests
write I/Os      requests      number of write I/Os processed
write merges    requests      number of write I/Os merged with in-queue I/O
write sectors   sectors       number of sectors written
write ticks     milliseconds  total wait time for write requests
in_flight       requests      number of I/Os currently in flight
io_ticks        milliseconds  total time this block device has been active
time_in_queue   milliseconds  total wait time for all requests
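
The column order above can be captured in a small parser. This is a minimal sketch in Python rather than the answer's own code; the names STAT_FIELDS and parse_block_stat are made up for illustration. It assumes the first 11 whitespace-separated values follow the table, and it ignores any extra columns, since newer kernels may append more counters after these.

```python
# Field names follow the table above, in column order.
STAT_FIELDS = (
    "read_ios", "read_merges", "read_sectors", "read_ticks",
    "write_ios", "write_merges", "write_sectors", "write_ticks",
    "in_flight", "io_ticks", "time_in_queue",
)


def parse_block_stat(line):
    """Map the first 11 counters of a stat line to the names above."""
    values = [int(v) for v in line.split()]
    if len(values) < len(STAT_FIELDS):
        raise ValueError("expected at least 11 fields, got %d" % len(values))
    # zip() stops at the shorter sequence, so trailing columns are dropped.
    return dict(zip(STAT_FIELDS, values))


# The sample output shown earlier for sda.
sample = ("   42632       25  2045318   247192  6956543  7362278 "
          "123236256 23878974        0  3703033 24119492")
stats = parse_block_stat(sample)
print(stats["io_ticks"], stats["in_flight"])
```

In PHP the same idea would be a preg_split on whitespace followed by array_combine; only the parsing changes, not the file being read.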

If we run this on a cron schedule and diff the counters between runs, we can see just how long we are waiting on each operation: the change in read ticks divided by the change in read I/Os is the average wait per read, and likewise for writes. The same differences also give total IOPS and read/write bandwidth. The documentation goes into more depth on each field.
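
As a rough illustration of the diffing step, here is a sketch in Python that turns two snapshots into rates. The function names (read_stat, summarize, sample) are hypothetical, not part of any library. It assumes the column order from the table above and that sectors are counted in 512-byte units, which is how the kernel documentation describes this file. Utilisation is taken as the growth of io_ticks divided by the wall-clock time between the snapshots.

```python
import time


def read_stat(dev):
    """Read /sys/class/block/<dev>/stat as a list of integers."""
    with open(f"/sys/class/block/{dev}/stat") as fh:
        return [int(v) for v in fh.read().split()]


def summarize(before, after, elapsed_s):
    """Turn two counter snapshots into rates over the sampling interval."""
    d = [b - a for a, b in zip(before, after)]  # after minus before
    reads, writes = d[0], d[4]
    return {
        # io_ticks is in milliseconds; clamp because timing jitter can
        # push the ratio slightly above 100.
        "util_pct": min(100.0, 100.0 * d[9] / (elapsed_s * 1000.0)),
        "read_iops": reads / elapsed_s,
        "write_iops": writes / elapsed_s,
        "read_kib_s": d[2] * 512 / 1024 / elapsed_s,
        "write_kib_s": d[6] * 512 / 1024 / elapsed_s,
        "avg_read_wait_ms": d[3] / reads if reads else 0.0,
        "avg_write_wait_ms": d[7] / writes if writes else 0.0,
    }


def sample(dev, interval_s=1.0):
    """Take two snapshots interval_s apart and summarize the difference."""
    before, start = read_stat(dev), time.monotonic()
    time.sleep(interval_s)
    after, end = read_stat(dev), time.monotonic()
    return summarize(before, after, end - start)
```

Calling sample("sda") on a Linux host would return a dictionary such as util_pct, read_iops and so on for that one-second window. From cron you would instead store each snapshot with its timestamp and pass the previous and current ones to summarize.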

Whatever language is chosen, the file to read to get information about the disk will be

/sys/class/block/<dev>/stat
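
Tying this back to the question, a script could poll that file and hold off on heavy work until the disk quiets down. The sketch below is only one possible approach, written in Python; wait_until_quiet, its 50% threshold and its timeouts are arbitrary assumptions, not recommendations. Reading from /sys does not itself touch the disk, because sysfs is an in-memory filesystem. On a virtualized host the numbers describe the virtual device only, so treat them as a hint.

```python
import time


def read_io_ticks(dev):
    """Return io_ticks (10th column): milliseconds the device was active."""
    with open(f"/sys/class/block/{dev}/stat") as fh:
        return int(fh.read().split()[9])


def wait_until_quiet(dev, threshold_pct=50.0, interval_s=0.5, max_wait_s=10.0,
                     read_ticks=read_io_ticks, clock=time.monotonic,
                     sleep=time.sleep):
    """Block until utilisation over one interval drops below threshold_pct.

    Returns True if the disk went quiet, False if max_wait_s ran out.
    The read_ticks, clock and sleep parameters exist so the loop can be
    exercised without a real disk.
    """
    deadline = clock() + max_wait_s
    prev_ticks, prev_time = read_ticks(dev), clock()
    while clock() < deadline:
        sleep(interval_s)
        ticks, now = read_ticks(dev), clock()
        elapsed_ms = (now - prev_time) * 1000.0
        if elapsed_ms > 0:
            util_pct = 100.0 * (ticks - prev_ticks) / elapsed_ms
            if util_pct < threshold_pct:
                return True
        prev_ticks, prev_time = ticks, now
    return False
```

A caller might write `if wait_until_quiet("sda"): rebuild_cache()`, where rebuild_cache stands in for whatever disk-heavy step the script needs to run. The timeout matters: without it a permanently busy disk would stall the script forever.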

If we do this on a schedule, we can draw fancy graphs ;)

[Image: the author's disk statistics graphs]

Accept answered 5/6, 2017 at 23:41 Comment(7)
This answer is so great I was almost sorry to not get a chance to try it out. Then I remembered how glad I was to no longer be on that project. - Polybasite
Sorry to revive the post; saw it on meta and had to share my graphs :D But no matter the project, graphs are always a good thing to have to be able to identify issues! Good luck! (I use Grafana.) Sorry I did not see the post sooner! - Accept
Hey, no problem. You solved my worry about what to do with the question. It is useful now, so that's a win. - Polybasite
Using df would be too easy, agreed. - Courtyard
That is incorrect. Unless there is a switch I do not know about, df will only show space consumption, nothing about current activity. - Accept
Another Linux command-line tool that could be helpful is iostat. - Tittup
@StephenC good call, but this only works if you have shell access, which some shared hosting providers do not give you. Further, if we look at the source of iostat, we can see that it is itself reading from the same file mentioned in this post :) - Accept
