Monitoring if a file stopped writing in python
W

4

0

I have a program that writes to a file every second. The writing happens in a thread that runs in parallel with the UI. Due to some hardware issue, it sometimes stops writing at some point during the day. I want to detect when the file has stopped being written and restart the program if it is not getting updated. My idea was to check the file's timestamp to see whether it is still being updated (I did not want to bring in watchdog etc., because all I need to know is whether the file stopped being written).

try:
    if time.time()>(os.stat(filename).st_mtime+2):
        raise ValueError("Yikes! Spike")
except ValueError:
    with open('errors.log','a') as log:
        log.write('Spike occurred at '+ time.strftime(
        "%H:%M:%S")+' on '+datetime.date.today().strftime('%d/%m/%Y')+'\n')
    restart_program()

This block runs every second. But it backfired: when the app closes to restart, it keeps closing every second and never starts again, and the exception message is logged every second. I tried increasing the time difference, but that didn't help.

Next I tried

ftimestamp = os.stat(filename).st_mtime
try:
    if os.stat(filename).st_mtime>=ftimestamp:
        ftimestamp = time.time()
        print "ftimestamp updated and all is well"
    else:
        ftimestamp = os.stat(filename).st_mtime
        raise ValueError("Yikes! Spike!")
        print "file time is behind"
except ValueError:
    with open('errors.log','a') as log:
        log.write('Spike occurred at '+ time.strftime(
        "%H:%M:%S")+' on '+datetime.date.today().strftime('%d/%m/%Y')+'\n')
    restart_program()

I set the variable "ftimestamp" to the current time "time.time()" because the next comparison happens only one second later, and I want the file time to be later than the time of the previous comparison. (The block runs every second via the wx.CallLater function.)

My program fails still... and I am not able to understand where I am going wrong... Someone please help! Or is there a way of simply checking if the file stopped writing?

Willing answered 11/7, 2015 at 15:28 Comment(9)
Have you tried EOFError? - Hinayana
Is the writing process flushing its output every second? - Euphorbia
It might be simpler to just check whether the process still exists. You can call kill(pid, 0), i.e. send signal 0, for this. - Euphorbia
@Hinayana The file is technically always open, since it is being written. I am not sure EOFError can help here... - Willing
@Euphorbia Yes, it flushes the output on every write (every second). Which process are you talking about? Is there a separate file-writing process? I am also checking whether the file-writing thread is alive, BTW. - Willing
Is there a simple way to check whether a file is static? Is there a better way than timestamps and maybe file sizes? - Willing
It's a little unclear: your script still runs but doesn't write? So after it fails initially it restarts but won't write to the file? - Summertree
@Summertree The script runs, but due to some hardware issue it occasionally stops writing the file. I want to catch the moment it stops writing. Sometimes the script restarts using the timestamp comparison above, but often it fails to restart. - Willing
If you catch an exception, surely that is the time to restart the script. There are also many much better ways to keep watch on a file. - Thruster
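
For reference, the kill(pid, 0) suggestion from the comments might look roughly like the sketch below. It is POSIX-only, and pid_alive is a hypothetical helper name, not something from the question. It only reports that a process exists, not that it is still writing, so it would help only if the writer ran as a separate process rather than as a thread inside the UI process. Do not use it on Windows, where os.kill with an arbitrary signal number terminates the target process.

```python
import errno
import os


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: only error checking, nothing is delivered
    except OSError as exc:
        if exc.errno == errno.ESRCH:  # no such process
            return False
        if exc.errno == errno.EPERM:  # process exists but belongs to another user
            return True
        raise
    return True
```

A caller would pass the PID of the writer process, for example pid_alive(writer_pid), where writer_pid is whatever PID was recorded when the writer was started.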
S
2

We can try checking for a change in file size as a possible solution by doing the following:

import os
from time import sleep
# other imports

while True:
    file1 = os.stat('file.txt') # initial file size
    file1_size = file1.st_size
 
    # your script here that collects and writes data (increase file size)
    sleep(1)
    file2 = os.stat('file.txt') # updated file size
    file2_size = file2.st_size
    comp = file2_size - file1_size # compares sizes
    if comp == 0:
        restart_program()
    else:
        sleep(5)

You may need to adjust the sleep() calls; these values are only estimates, since I can't test your actual code. In the end this is an infinite loop that keeps running for as long as you want the script to keep writing.

Another solution is to update your code to:

import os
import sys
from time import sleep
# other imports

while True:
    file1 = os.stat('file.txt') # initial file size
    file1_size = file1.st_size
 
    # your script here that collects and writes data (increase file size)
    sleep(1)
    file2 = os.stat('file.txt') # updated file size
    file2_size = file2.st_size
    comp = file2_size - file1_size # compares sizes
    if comp == 0:
        sys.exit()  # must be called; a bare sys.exit does nothing
    else:
        sleep(5)

Then use a secondary program to run your script as such:

import os
from time import sleep, strftime

while True:
    print(strftime("%H:%M:%S"), "Starting")
    os.system('python main.py') # blocks until main.py exits; assumes python is on PATH. This is another infinite loop that will keep your script running
    print(strftime("%H:%M:%S"), "Crashed")
    sleep(5)
Summertree answered 11/7, 2015 at 16:29 Comment(3)
I don't have the luxury of using sleep in my script because my UI would lag terribly (I use the wx.CallLater function to keep a text field updated). The last solution seems interesting. How safe is it to run an infinite-loop script within an infinite-loop script? How intensive is it on the system? Let me test this and get back to you. Thanks for your effort. - Willing
Well, the way you're putting it, an infinite loop within an infinite loop is very dangerous, but they're controllable in this case. The "secondary program" stays in its infinite loop as long as main.py is running fine (which is what we want), and the second file stays in its infinite loop until the file size stops changing, so both are controlled. The secondary program is the main loop, so terminating it also terminates all the others. - Summertree
The first solution doesn't work for me because the UI hangs. The second one doesn't work because the file is updated in a thread, not sequentially as you assumed. I tried other ways of using the file size too, in vain. I tried the last one, but the issue is that it restarts the application fine and then continues to restart indefinitely. It looks like the timestamp exception keeps happening and the app keeps crashing. But thanks once again for your efforts. - Willing
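
One possible way to keep the "secondary program" idea from restarting forever is to cap the number of runs and stop on a clean exit. The sketch below is only an illustration under those assumptions: supervise, max_runs and delay are hypothetical names, and it uses subprocess.call instead of os.system so the child's exit code is easy to inspect.

```python
import subprocess
import sys
import time


def supervise(cmd, max_runs=5, delay=5.0):
    """Run cmd up to max_runs times; return the list of exit codes seen."""
    exit_codes = []
    for _ in range(max_runs):
        code = subprocess.call(cmd)  # blocks until the child process exits
        exit_codes.append(code)
        if code == 0:  # clean exit: assume the app was closed on purpose
            break
        time.sleep(delay)  # pause so a crash loop cannot spin the CPU
    return exit_codes
```

It could be started with supervise([sys.executable, "main.py"]), with main.py calling sys.exit(1) when it detects that the file has stopped changing, so that only a non-zero exit triggers a restart.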
D
1

To determine whether the file changes on time in a GUI program, you could use the standard tools of your event loop to run a function every interval seconds. For example, here's how to do it in tkinter:

#!/usr/bin/env python3
import logging
import os
import sys
import tkinter
from datetime import datetime
from time import monotonic as timer

path = sys.argv[1]
interval = 120 # check every 2 minutes

def check(last=[None]):
    mtime = os.path.getmtime(path) # or os.path.getsize(path)
    logging.debug("mtime %s", datetime.fromtimestamp(mtime))
    if last[0] == mtime: #NOTE: it is always False on the first run
        logging.error("file metadata hasn't been updated, exiting..")
        root.destroy() # exit GUI
    else: # schedule the next run
        last[0] = mtime
        root.after(round(1000 * (interval - timer() % interval)), check)


logging.basicConfig(level=logging.DEBUG,
                    filename=os.path.splitext(__file__)[0] + ".log",
                    format="%(asctime)-15s %(message)s", datefmt="%Y-%m-%d %H:%M:%S")
root = tkinter.Tk()
root.withdraw() # hide GUI
root.after(round(1000 * (interval - timer() % interval)), check) # start on boundary
root.mainloop()

You could use supervisord, or systemd, or upstart, etc to respawn your script automatically.

See How to run a function periodically in python.

Dartboard answered 11/7, 2015 at 18:31 Comment(5)
Thank you for your code. I am using a function similar to root.after() in wxPython, called wx.CallLater(1,some_function()). Since my application is written entirely in wx, I can't go back now! Thanks for your efforts... - Willing
@RohinKumar: tkinter is just an example. "Use standard tools" suggests that you should use whatever is most appropriate in your GUI framework (wx). If that is wx.CallLater(), then use wx.CallLater(); the program structure (algorithm) won't change. If you don't want to (or can't) use a watchdog or its analogs with a push interface, then polling on a timer is a reasonable option. - Dartboard
Yes, your algorithm can be used with CallLater too. I ran into a few more problems with my program. With a little modification I think I got it to work. - Willing
@RohinKumar: note: you don't need to call touch(filename) with the algorithm from my answer: last[0] == mtime is always false the first time check() runs, i.e., the file has more than interval seconds to change after the restart. (You can detect the first run with the last[0] is None condition if you want to add special first-time actions, e.g., you could double the interval for the first run.) - Dartboard
I agree! I didn't see that... Thank you! - Willing
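
As the comments point out, the polling algorithm is not tied to tkinter. A framework-neutral sketch of the same idea is shown below; StallDetector is a hypothetical name, and comparing both mtime and size is an assumption on top of the answer, which compares only one of them.

```python
import os


class StallDetector(object):
    """Report a stall when a file's mtime and size are unchanged between polls."""

    def __init__(self, path):
        self.path = path
        self._last = None  # (mtime, size) seen on the previous poll

    def stalled(self):
        st = os.stat(self.path)
        current = (st.st_mtime, st.st_size)
        is_stalled = (current == self._last)  # always False on the first poll
        self._last = current
        return is_stalled
```

In a wx application, stalled() would be called from the function scheduled with wx.CallLater, using an interval comfortably longer than the one-second write period. Note that wx.CallLater takes a delay in milliseconds and the callable itself, e.g. wx.CallLater(5000, some_function), not the result of calling it.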
W
1

Finally, after tinkering around with timestamp based options, the following seemed to work for me.

try:
    if time.time()-os.stat(filename).st_mtime>6:
        touch(filename)
        raise ValueError("Yikes! Spike")
except ValueError:
    with open('errors.log','a') as log:
        log.write('Spike/App restarting occurred at '+ time.strftime(
                "%H:%M:%S")+' on '+datetime.date.today().strftime('%d/%m/%Y')+'\n')
    restart_program()

Earlier, the problem was that once it detected that the file had stopped being written within the given time interval, the condition stayed true:

time.time()-os.stat(filename).st_mtime>6

Once this condition is satisfied, it remains satisfied until the file timestamp is updated, so the program would keep restarting. In my solution, I 'touch' the file once the condition is satisfied (touch used from here), and now it works as expected.
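
The touch helper itself is not shown in this answer. A minimal sketch of what such a helper might look like, assuming it only needs to create the file if it is missing and reset its modification time to now:

```python
import os


def touch(path):
    """Create the file if it is missing and set its atime/mtime to now."""
    with open(path, "a"):
        pass  # opening in append mode creates the file without truncating it
    os.utime(path, None)  # None means "use the current time"
```

After touch(filename), time.time()-os.stat(filename).st_mtime is close to zero again, so the restarted program gets a fresh window before the check can fire.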

Thank you all for your inputs.

Willing answered 13/7, 2015 at 10:12 Comment(2)
When trying to restart the script, it is still stuck in the same loop even after doing the 'touch'. Not able to restart! - Willing
It seems there is a problem with the way the script is restarted. See #31447942 - Willing
S
1

A better version that gets the file size and can optionally wait until the file has finished being saved:

import os
import time

def get_file_size(file_path, wait_until_file_saved=False):

    file = os.stat(file_path)
    file_size = file.st_size

    if wait_until_file_saved:
        sleep_time = 0.5
        time.sleep(sleep_time)

        while True:
            file = os.stat(file_path)
            curr_file_size = file.st_size

            diff = curr_file_size - file_size  # compares sizes
            if diff == 0:
                break
            else:
                time.sleep(sleep_time)
            file_size = curr_file_size

    # print("file_size : ", file_size)
    return file_size
Sandon answered 17/5, 2024 at 19:23 Comment(0)
