Guaranteed file deletion upon program termination (C/C++)

8

21

Win32's CreateFile has FILE_FLAG_DELETE_ON_CLOSE, but I'm on Linux.

I want to open a temporary file which will always be deleted upon program termination. I could understand that in the case of a program crash it may not be practical to guarantee this, but in any other case I'd like it to work.

I know about RAII. I know about signals. I know about atexit(3). I know I can open the file and delete it immediately and the file will remain accessible until the file descriptor is closed (which even handles a crash). None of these seem like a complete and straightforward solution:

  1. RAII: been there, done that: I have an object whose destructor deletes the file, but the destructor is not called if the program is terminated by a signal.
  2. signals: I'm writing a low-level library which makes registering a signal handler a tricky proposition. For example, what if the application uses signals itself? I don't want to step on any toes. I might consider some clever use of sigaction(2) to cope...but haven't put enough thought into this possibility yet.
  3. atexit(3): apparently useless, since it isn't called during abnormal termination (e.g. via a signal).
  4. preemptive unlink(2): this is pretty good except that I need the file to remain visible in the filesystem (otherwise the system is harder to monitor/troubleshoot).

What would you do here?

Further Explanation

I elided one detail in my original post which I now realize I should have included. The "file" in this case is not strictly a normal file, but rather is a POSIX Message Queue. I create it via mq_open(). It can be closed via mq_close() or close() (the former is an alias for the latter on my system). It can be removed from the system via mq_unlink(). All of this makes it analogous to a regular file, except that I cannot choose the directory in which the file resides. This makes the current most popular answer (placing the file in /tmp) unworkable, because the "file" is created by the system in a virtual filesystem with very limited capacity. (I've mounted the virtual filesystem in /dev/mqueue, following the example in man mq_overview.)

This also explains why I need the name to remain visible (making the immediate-unlink approach unworkable): the "file" must be shared between two or more processes.

Laticialaticiferous answered 22/1, 2009 at 23:53 Comment(3)
It is item 4 (keeping the name accessible) that makes it tough.Outpour
Every time an important detail is left out, the answers go awry. You'll know for next time.Outpour
Could you bypass the need for the file to be accessible within the filesystem by reinventing open(...), i.e., providing some other way an arbitrary process could obtain a file descriptor that refers to your file? You could implement something like this, or even this.Dusk
O
7

The requirement that the name remains visible while the process is running makes this hard to achieve. Can you revisit that requirement?

If not, then there probably isn't a perfect solution. I would consider combining a signal handling strategy with what Kamil Kisiel suggests. You could keep track of the signal handlers installed before you install your signal handlers. If the default handler is SIG_IGN, you wouldn't normally install your own handler; if it is SIG_DFL, you would remember that; if it is something else - a user-defined signal handler - you would remember that pointer, and install your own. When your handler was called, you'd do whatever you need to do, and then call the remembered handler, thus chaining the handlers. You would also install an atexit() handler. You would also document that you do this, and the signals for which you do it.

Note that signal handling is an imperfect strategy; SIGKILL cannot be caught, and the atexit() handler won't be called, and the file will be left around.
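A minimal sketch of the chaining idea described above, under stated assumptions: install_cleanup, remove_on_signal, g_path and SIGNAL_TABLE_SIZE are hypothetical names, and the sketch removes a regular path with unlink(2), which POSIX lists as async-signal-safe. For a message queue the call would be mq_unlink(), which is not on that list, so treat that substitution as something to verify on your platform. Signals whose disposition is already SIG_IGN are left alone, and a SIG_DFL disposition is honoured by restoring it and re-raising.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SIGNAL_TABLE_SIZE 128   /* assumed upper bound on signal numbers */

static char g_path[256];        /* hypothetical: the path to remove */
static struct sigaction g_previous[SIGNAL_TABLE_SIZE];

static void remove_at_exit(void)
{
    if (g_path[0] != '\0')
        unlink(g_path);
}

static void remove_on_signal(int signo, siginfo_t *info, void *context)
{
    const struct sigaction *prev = &g_previous[signo];
    int saved_errno = errno;

    if (g_path[0] != '\0')
        unlink(g_path);
    errno = saved_errno;

    if (prev->sa_flags & SA_SIGINFO) {
        prev->sa_sigaction(signo, info, context);   /* chain to user handler */
    } else if (prev->sa_handler == SIG_DFL) {
        sigaction(signo, prev, NULL);               /* restore the default... */
        raise(signo);                               /* ...and let it run */
    } else {
        prev->sa_handler(signo);                    /* chain to user handler */
    }
}

/* Remember the existing handlers for `signals`, then install ours. */
int install_cleanup(const char *path, const int *signals, size_t count)
{
    if (strlen(path) >= sizeof g_path)
        return -1;
    strcpy(g_path, path);
    if (atexit(remove_at_exit) != 0)
        return -1;

    for (size_t i = 0; i < count; i++) {
        int signo = signals[i];
        struct sigaction current, ours;

        if (signo <= 0 || signo >= SIGNAL_TABLE_SIZE)
            return -1;
        if (sigaction(signo, NULL, &current) != 0)
            return -1;
        /* An ignored signal stays ignored: do not install a handler. */
        if (!(current.sa_flags & SA_SIGINFO) && current.sa_handler == SIG_IGN)
            continue;

        g_previous[signo] = current;
        memset(&ours, 0, sizeof ours);
        ours.sa_sigaction = remove_on_signal;
        ours.sa_flags = SA_SIGINFO;
        sigemptyset(&ours.sa_mask);
        if (sigaction(signo, &ours, NULL) != 0)
            return -1;
    }
    return 0;
}
```

A library would typically call something like install_cleanup(path, (int[]){ SIGINT, SIGTERM, SIGHUP }, 3) once during initialization. Handlers that the application installs afterwards replace this one, which is one reason to document the behaviour.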

David Segond's suggestion - a temporary file name daemon - is interesting. For simple processes, it is sufficient; if the process requesting the temporary file forks and expects the child to own the file thereafter (and exits) then the daemon has a problem detecting when the last process using it dies - because it doesn't automatically know the processes that have it open.

Outpour answered 23/1, 2009 at 2:52 Comment(3)
I do not believe I can remove the requirement for the filename visibility. I added "Further Explanation" to the original question which explains that this is not a regular file, and it lives in a fixed location with very limited capacity. I think I need to have the files visible for monitoring.Laticialaticiferous
Of course, if there is some other way to achieve my goals without the name visibility requirement, I could do that instead. Basically I need a decent way to keep the files from piling up, because they consume a very limited resource (imagine a filesystem with 16MB capacity and ~200KB files).Laticialaticiferous
Oh, and let me not forget, the names need to remain visible during normal operation because they are shared between programs. I can't just unlink them immediately after their creation--that would render them useless. I should have been explicit about that originally.Laticialaticiferous
B
6

If you're just making a temporary file, just create it in /tmp or a subdirectory thereof. Then make a best effort to remove it when done through atexit(3) or similar. As long as you use unique names picked through mkstemp(3) or similar even if it fails to be deleted because of a program crash, you don't risk reading it again on subsequent runs or other such conditions.

At that point it's just a system-level problem of keeping /tmp clean. Most distros wipe it on boot or shutdown, or run a regular cronjob to delete old files.

Busily answered 23/1, 2009 at 0:0 Comment(1)
This is no fault of yours, but I can't use this solution. Please see my "Further Explanation" in the question. I cannot choose the path for the file, and the filesystem in which it resides has very limited size. If I am left with no other choice I might well sweep it with cron-an ugly workaround.Laticialaticiferous
B
4

Maybe someone suggested this already and I'm unable to spot it. Given all your requirements, the best I can think of is to have the filename communicated to a parent process, such as a start script, which cleans up after the process dies if the process failed to do so itself. This is essentially a watchdog, although a watchdog is more commonly used to kill and/or restart a process when it fails.

If your parent process dies as well, you're pretty much out of luck, but most script environments are fairly robust and rarely die unless the script is broken, which is often easier to keep correct than a program.

Beutner answered 23/1, 2009 at 14:21 Comment(2)
Actually, this is a pretty good idea - except it is a low-level library that's being written. The library initialization routine would create the file and then fork. The child goes on to do all the real work. The parent simply sits there, waiting for the child to terminate, then removes the file.Outpour
This is tricky to justify in a general purpose set of routines; the program may have its own requirements on process structure. If it is permissible - you have enough control of the clients - then it would be pretty effective.Outpour
M
3

In the past, I have built a "temporary file manager" that kept track of temporary files.

One would request a temporary file name from the manager and this name was registered.

Once you don't need the temporary file name any more, you inform the manager and the filename is unregistered.

Upon receipt of a termination signal, all the registered temporary files were destroyed.

Temporary filenames were UUID based to avoid collisions.

Mascot answered 23/1, 2009 at 0:2 Comment(4)
Complex, but clearly it works - doubly so if the temporary file manager has a way to detect whether the process that requested the temporary file still exists. Gets a bit tricky if that process forks; doubly so if the parent exits.Outpour
You could add a requirement that when a process forks, the parent must inform the manager of the pid of the child process. This also allows the flexibility of other types of inter-process comm. to be used to pass the filename of the tmp file around.Obligee
David, your answer sounds as if this "manager" lives as a module in the same process. Jonathan's comment makes it sound like it would be a separate process. I can see how it could work as a separate process, sure, but were you suggesting it could be in-process? If so I don't see as much value....Laticialaticiferous
The temporary file manager was in-process. This works well in most circumstances but I agree that it is not foolproof.Mascot
A
2

You could have the process fork after creating the file, and then wait on the child to close, and then the parent can unlink the file and exit.
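A rough sketch of that fork-and-wait arrangement, not a definitive implementation. The names supervise_and_unlink and worker_killed_by_sigkill are hypothetical, and unlink(2) stands in for whatever removal call applies (mq_unlink() for a message queue). The parent survives even a SIGKILL delivered to the child, although nothing helps if the parent itself is killed.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork, run `work(arg)` in the child, and remove `path` in the parent once
 * the child has terminated for any reason. Returns the child's wait status,
 * or -1 if fork() or waitpid() failed.
 */
int supervise_and_unlink(const char *path, int (*work)(void *), void *arg)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(work(arg));            /* child: do the real work */

    int status = -1;
    pid_t waited;
    do {
        waited = waitpid(child, &status, 0);
    } while (waited < 0 && errno == EINTR);

    unlink(path);                    /* parent: clean up unconditionally */
    return waited < 0 ? -1 : status;
}

/* Hypothetical worker that simulates an uncatchable death. */
int worker_killed_by_sigkill(void *arg)
{
    (void)arg;
    kill(getpid(), SIGKILL);
    return 0;
}
```

After supervise_and_unlink() returns, the parent would normally exit, possibly propagating the child's exit code decoded with WEXITSTATUS().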

Authorized answered 23/1, 2009 at 14:19 Comment(0)
C
2

Do you really need the name to remain visible?

Suppose you take the option of immediately unlinking the file. Then:

  • preemptive unlink(2): this is pretty good except that I need the file to remain visible in the filesystem (otherwise the system is harder to monitor/troubleshoot).

    You can still debug on a deleted file, since it will still be visible under /proc/$pid/fd/. As long as you know the pids of your processes, enumerating their open files should be easy.

  • the names need to remain visible during normal operation because they are shared between programs.

    You can still share the deleted open file between processes by passing around the file descriptor over Unix domain sockets. See Portable way to pass file descriptor between different processes for more information.
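For illustration, descriptor passing over a Unix domain socket is usually done with an SCM_RIGHTS control message. The helpers send_fd and recv_fd below are hypothetical names in a minimal sketch. It assumes a connected AF_UNIX socket, and whether a message queue descriptor can be passed this way depends on it being an ordinary file descriptor on your system (the question suggests it is on the asker's).

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Properly aligned buffer for one control message carrying one int. */
union fd_control {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send descriptor `fd` over the connected Unix domain socket `sock`. */
int send_fd(int sock, int fd)
{
    char payload = 'F';              /* at least one data byte must be sent */
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union fd_control control;
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    memset(&control, 0, sizeof control);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof control.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from `sock`; returns the new fd or -1 on failure. */
int recv_fd(int sock)
{
    char payload;
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union fd_control control;
    struct msghdr msg;
    int fd = -1;

    memset(&msg, 0, sizeof msg);
    memset(&control, 0, sizeof control);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof control.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same open file description, so the underlying object stays alive until every process has closed its copy.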

Cloudcapped answered 14/8, 2012 at 0:35 Comment(0)
M
1

I just joined stackoverflow and found you here :)

If your problem is to manage mq files and keep them from piling up, you don't really need to guarantee file deletion upon termination. If you just want to keep useless files from piling up, then keeping a journal may be all you need. Add an entry to the journal file after an mq is opened, another entry when it is closed, and when your library is initialized, check the journal for inconsistencies and take whatever action is needed to correct them. If you worry about crashing while mq_open/mq_close is being called, you can also add a journal entry just before those functions are called.

Myogenic answered 28/4, 2009 at 16:45 Comment(0)
H
1
  • Have a book-keeping directory for temporary files under your dot-directory.
  • When creating a temp file, first create a book-keeping file in the book-keeping directory that contains the path or UUID of your to-be temp file.
  • Create that temp file.
  • When the temp file is deleted, delete the book-keeping file as well.
  • When the program starts, scan the book-keeping directory for any files containing paths to temporary files, try to delete those temporary files if found, then delete the book-keeping files.
  • (Log noisily if any step fails.)

I don't see any simpler way to do it. This is the boilerplate any production-quality program must go through; easily 500+ lines.
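The core of those steps can be sketched in far fewer lines if error handling is kept crude. The names register_temp, unregister_temp and sweep_stale are hypothetical, each book-keeping file simply holds one path, and unlink(2) would become mq_unlink() for message queues. This sketch does not try to tell a crashed instance's entries from those of an instance that is still running.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Write `temp_path` into the marker file `book_dir`/`id`; call this
 * before creating the temp file itself. */
int register_temp(const char *book_dir, const char *id, const char *temp_path)
{
    char marker[PATH_MAX];
    if (snprintf(marker, sizeof marker, "%s/%s", book_dir, id)
            >= (int)sizeof marker)
        return -1;
    FILE *f = fopen(marker, "w");
    if (f == NULL)
        return -1;
    int ok = fputs(temp_path, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Remove the marker after the temp file has been deleted normally. */
int unregister_temp(const char *book_dir, const char *id)
{
    char marker[PATH_MAX];
    if (snprintf(marker, sizeof marker, "%s/%s", book_dir, id)
            >= (int)sizeof marker)
        return -1;
    return unlink(marker);
}

/* At startup: delete every temp file named by a leftover marker, then the
 * marker itself. Returns the number of temp files removed, or -1. */
int sweep_stale(const char *book_dir)
{
    DIR *dir = opendir(book_dir);
    if (dir == NULL)
        return -1;

    int removed = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        char marker[PATH_MAX], target[PATH_MAX];
        if (entry->d_name[0] == '.')
            continue;                        /* skip ".", ".." and dotfiles */
        if (snprintf(marker, sizeof marker, "%s/%s", book_dir, entry->d_name)
                >= (int)sizeof marker)
            continue;
        FILE *f = fopen(marker, "r");
        if (f == NULL) {
            fprintf(stderr, "cannot read %s: %s\n", marker, strerror(errno));
            continue;
        }
        if (fgets(target, sizeof target, f) != NULL) {
            target[strcspn(target, "\n")] = '\0';
            if (unlink(target) == 0)
                removed++;
            else if (errno != ENOENT)
                fprintf(stderr, "cannot remove %s: %s\n", target,
                        strerror(errno));
        }
        fclose(f);
        unlink(marker);
    }
    closedir(dir);
    return removed;
}
```

A program would call sweep_stale() once at startup, register_temp() just before creating each temp file, and unregister_temp() after deleting it.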

Horthy answered 5/8, 2015 at 6:48 Comment(1)
I like this. How can you deal with the case where multiple instances of the program exist at the same time? On startup, if I see a bookkeeping file, I won't know if that is from a previously crashed program (and therefore should be deleted), or from a currently running program that's doing just fine (and therefore should definitely not be deleted).Geber
