Linux shell script: how to detect that an NFS mount point (or the server) is dead?

On an NFS client, how can a Bash shell script detect that a mount point is no longer available, or that the server behind it is dead?

Normally I do:

if ls '/var/data' 2>&1 | grep -q 'Stale file handle'; then
   echo "failing"
else
   echo "ok"
fi

The problem is that when the NFS server is completely dead or stopped, even the ls command on that directory hangs on the client side, so the script above is no longer usable.

Is there any way to detect this case as well?

Nutlet answered 12/7, 2013 at 9:44 Comment(0)

The stat command is a somewhat cleaner way:

statresult=$(stat /my/mountpoint 2>&1 | grep -i "stale")
if [ -n "${statresult}" ]; then
  # result not empty: mountpoint is stale; remove it
  umount -f /my/mountpoint
fi

Additionally, you can use rpcinfo to check whether the NFS service on the remote server is responding:

if rpcinfo -t remote.system.net nfs > /dev/null 2>&1; then
  echo "Remote NFS share available."
fi
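
Because rpcinfo can itself block for a long time when the server is unreachable, it may help to bound the probe. The following is only a sketch: nfs_server_up is a hypothetical helper name, the 5-second default is an arbitrary assumption, and it relies on the timeout utility from GNU coreutils being installed.

```bash
#!/usr/bin/env bash
# Sketch: probe the NFS service on a server without waiting indefinitely.
# nfs_server_up is a hypothetical helper; the 5-second default is an assumption.
nfs_server_up() {
  local host="$1" limit="${2:-5}"
  # timeout(1) exits with 124 if it had to stop rpcinfo. Any non-zero status
  # (unreachable host, service not registered, time limit hit) counts as "down".
  timeout "$limit" rpcinfo -t "$host" nfs >/dev/null 2>&1
}

# Usage (remote.system.net is the placeholder host used in this answer):
# if nfs_server_up remote.system.net; then echo "NFS service reachable"; fi
```

The function only reports reachable or not; it does not say whether a particular mount point on the client is stale.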

Added 2013-07-15T14:31:18-05:00:

I looked into this further as I am also working on a script that needs to recognize stale mountpoints. Inspired by one of the replies to "Is there a good way to detect a stale NFS mount", I think the following may be the most reliable way to check for staleness of a specific mountpoint in bash:

read -t1 < <(stat -t "/my/mountpoint")
if [ $? -eq 1 ]; then
   echo NFS mount stale. Removing... 
   umount -f -l /my/mountpoint
fi

The read -t1 construct gives up after one second if the stat command hangs for some reason. Note that only read times out; the stat process itself is left running.

Added 2013-07-17T12:03:23-05:00:

Although read -t1 < <(stat -t "/my/mountpoint") works, there doesn't seem to be a way to mute its error output when the mountpoint is stale. Adding > /dev/null 2>&1, either within the subshell or at the end of the command line, breaks it. A simple test, if [ -d /path/to/mountpoint ] ; then ... fi, also works and may be preferable in scripts. After much testing, it is what I ended up using.
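
The [ -d ... ] test still has to query the filesystem, so it may block when the server does not answer at all. One hedged way to bound it is to run the test in a child shell under timeout. In this sketch, dir_reachable is a hypothetical helper name and the 2-second default is an assumption, not something taken from the answer.

```bash
#!/usr/bin/env bash
# Sketch: the [ -d ] test wrapped in a time limit.
# dir_reachable is a hypothetical helper; the 2-second default is an assumption.
dir_reachable() {
  local dir="$1" limit="${2:-2}"
  # Run the test in a child shell so that timeout(1) can stop it if it blocks.
  # Returns 0 only if the path is a directory and the check finished in time.
  timeout "$limit" sh -c '[ -d "$1" ]' sh "$dir"
}

# Usage:
# if ! dir_reachable /path/to/mountpoint; then echo "mount point not usable"; fi
```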

Added 2013-07-19T13:51:27-05:00:

A reply to my question "How can I use read timeouts with stat?" provided additional detail about muting the output of stat (or rpcinfo) when the target is not available and the command hangs for a few minutes before it would time out on its own. While [ -d /some/mountpoint ] can be used to detect a stale mountpoint, there is no similar alternative for rpcinfo, and hence use of read -t1 redirection is the best option. The output from the subshell can be muted with 2>&-. Here is an example from CodeMonkey's response:

mountpoint="/my/mountpoint"
read -t1 < <(stat -t "$mountpoint" 2>&-)
if [[ -n "$REPLY" ]]; then
  echo "NFS mount stale. Removing..."
  umount -f -l "$mountpoint"
fi

Perhaps now this question is fully answered :).

Iceland answered 14/7, 2013 at 22:43 Comment(3)
I utilized the stale NFS mount point detection in my script nfs_automount, now available on GitHub. - Iceland
Nice answer. I have seen read -t1 < <(stat -t "$MOUNT_DIR" 2>&-) return a status of 142, so testing [ ! $? -eq 0 ] is probably better. - Intyre
Another point: read -t1 < <(stat -t "$mountpoint" 2>&-) leaves an open file handle (or similar) on the mounted folder, so the unmount will fail unless you use the -l flag. You can use timeout 1 stat -t "$mountpoint" > /dev/null instead. This kills the stat command and, with it, its open file handle. - Intyre
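
The timeout-based variant from the last comment can be sketched as a small function that also separates "stat failed quickly" from "stat had to be stopped". The name mount_state and the labels it prints are hypothetical; the sketch assumes GNU coreutils, where timeout exits with status 124 when the time limit is reached.

```bash
#!/usr/bin/env bash
# Sketch: classify a mount point using timeout + stat.
# mount_state and its ok/hung/error labels are hypothetical names.
mount_state() {
  local mp="$1" limit="${2:-1}"
  timeout "$limit" stat -t "$mp" >/dev/null 2>&1
  case $? in
    0)   echo ok ;;     # stat returned normally
    124) echo hung ;;   # timeout(1) stopped stat: server probably not answering
    *)   echo error ;;  # stat failed quickly, e.g. stale handle or missing path
  esac
}

# Usage:
# [ "$(mount_state /my/mountpoint)" = ok ] || umount -f -l /my/mountpoint
```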

The final answers given by Ville and CodeMonkey are almost correct. I'm not sure how no one noticed this, but a $REPLY string with content means success, not failure: an empty $REPLY string means the mount is stale. The conditional should therefore use -z, not -n:

mountpoint="/my/mountpoint"
read -t1 < <(stat -t "$mountpoint" 2>&-)
if [ -z "$REPLY" ] ; then
  echo "NFS mount stale. Removing..."
  umount -f -l "$mountpoint"
fi

I have run this multiple times with a valid and an invalid mount point, and it works. The -n check gave me reversed results, reporting the mount as stale when it was absolutely valid.

Also, the double bracket isn't necessary for a simple string check.
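
To see why both the exit status and $REPLY matter, here is a small experiment that needs no NFS mount. It uses stand-ins for stat (echo for a healthy mount, true for a command that fails without output, sleep for a hang); read_status is a hypothetical helper name. The behaviour described in the comments is what bash 4 or newer documents, and a comment on the previous answer reports 142 for the timeout case.

```bash
#!/usr/bin/env bash
# Sketch: show the exit status of "read -t" for three kinds of producer.
# read_status is a hypothetical helper; "$@" is the stand-in command.
read_status() {
  local REPLY
  read -r -t 1 < <("$@" 2>/dev/null)
  echo "$?"
}

read_status echo healthy   # prints 0: a line was read
read_status true           # prints 1: end of input before any data
read_status sleep 3        # prints a value above 128: read timed out
```

In the second and third cases $REPLY is empty, which is why the -z test catches both a fast failure and a hang, while a test for status 1 alone misses the hang.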

Frankel answered 17/5, 2016 at 13:12 Comment(0)

Building off the answers here, I found some issues in testing that would output bad data due to how the $REPLY var would get updated (or not, if the result was empty), and the inconsistency of the stat command as provided in the answers.

This uses the stat command to check the filesystem type, which reflects changes almost immediately, and then checks the contents of $REPLY to make sure the filesystem is NFS (ref: https://unix.stackexchange.com/questions/20523/how-to-determine-what-filesystem-a-directory-exists-on).

read -t1 < <(timeout 1 stat -f -c %T "/mnt/nfsshare/" 2>&-)
if [[ ! "${REPLY}" =~ "nfs" ]]; then
  echo "NFS mount NOT WORKING..."
fi
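
Since timeout already bounds the stat call, the same check can be written with plain command substitution, which sidesteps the question of whether $REPLY was updated. This is a sketch under assumptions: fs_type_is_nfs and check_nfs_mount are hypothetical helper names, and it assumes GNU stat, where -f -c %T prints the filesystem type name.

```bash
#!/usr/bin/env bash
# Sketch: filesystem-type check without read/$REPLY.
# fs_type_is_nfs and check_nfs_mount are hypothetical helpers.

# Succeeds only if the given type string starts with "nfs".
fs_type_is_nfs() {
  case "$1" in
    nfs*) return 0 ;;
    *)    return 1 ;;
  esac
}

# Succeeds only if stat answers within one second and reports an NFS type.
check_nfs_mount() {
  local mp="$1" fstype
  # An empty result (stat failed or was stopped by timeout) does not match.
  fstype=$(timeout 1 stat -f -c %T "$mp" 2>/dev/null)
  fs_type_is_nfs "$fstype"
}

# Usage:
# check_nfs_mount /mnt/nfsshare/ || echo "NFS mount NOT WORKING..."
```

Unlike the substring match in the one-liner, the prefix match here would not accept a type name that merely contains "nfs" somewhere in the middle.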
Finnougrian answered 8/9, 2022 at 19:56 Comment(0)
