How do you kill all Linux processes that are older than a certain age?
S

14

69

I have a problem with some zombie-like processes on a certain server that need to be killed every now and then. How can I best identify the ones that have run for longer than an hour or so?

Seventieth answered 8/8, 2008 at 16:50 Comment(2)
In Linux use killall -i --older-than 1h someprocessnameCampbellite
Or see my answer which uses pgrep and is thus more flexible than killall.Isidor
U
36

If they just need to be killed:

if [[ "$(uname)" = "Linux" ]];then killall --older-than 1h someprocessname;fi

If you want to see what it's matching

if [[ "$(uname)" = "Linux" ]];then killall -i --older-than 1h someprocessname;fi

The -i flag will prompt you with yes/no for each process match.

Uranian answered 9/5, 2012 at 23:50 Comment(4)
Good tip. I stumbled upon the --older-than switch later on, but the -i makes it useful for checking before killing who knows what.Seventieth
What's the point of if [[ "$(uname)" = "Linux" ]];? Isn't the relevant portion just the killall command? (It seems that the surrounding if clause could be removed to make this answer a little more direct)Moussaka
@ringo Because on some systems (e.g. Solaris), killall is an entirely different command. On Solaris, it terminates all commands.Kiely
Note that using killall's '--regex' option causes '--older-than' to be ignored. Joy!Hereditary
S
36

Found an answer that works for me:

warning: this will find and kill long running processes

ps -eo uid,pid,etime | egrep '^ *user-id' | egrep ' ([0-9]+-)?([0-9]{2}:?){3}' | awk '{print $2}' | xargs -I{} kill {}

(Where user-id is a specific user's ID with long-running processes.)

The second regular expression matches a time that has an optional days figure, followed by hour, minute, and second components, and so is at least one hour in length.
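For thresholds other than "one hour or more", the etime value can be converted to seconds and compared numerically instead of being matched by shape. A minimal sketch, assuming the [[dd-]hh:]mm:ss layout that ps uses for etime; the function name pids_older_than is made up for illustration, and the function only prints PIDs, it does not kill anything:

```shell
# Read "pid etime" lines on stdin and print the PIDs whose elapsed
# time is at least $1 seconds. (pids_older_than is a hypothetical name.)
pids_older_than() {
  awk -v min="$1" '{
    # Split etime on "-" and ":"; the pieces fill from the right:
    # seconds, minutes, then optional hours and optional days.
    n = split($2, t, /[-:]/)
    s = t[n]; m = t[n-1]
    h = (n >= 3) ? t[n-2] : 0
    d = (n >= 4) ? t[n-3] : 0
    if (d * 86400 + h * 3600 + m * 60 + s >= min) print $1
  }'
}
```

It could be fed from ps with something like ps -eo pid=,etime= | pids_older_than 3600, and the resulting list reviewed before passing it to kill.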

Seventieth answered 8/8, 2008 at 17:3 Comment(6)
Umm, are you killing the process? I hope people realize this code doesn't just find, but also kills, or they might get upset.Appomattox
@ButtleButkus Good point. Yes, the whole reason for the question was to find the old processes and kill them, but it's not explicitly mentioned all that clearly. Note to others: ignore the last little bit of the line unless you enjoy angry calls from users.Seventieth
wtf! please change the title of the question. luckily i din't own the process!Tract
You can use option etimes instead of etime to always display the elapsed time in seconds and not days/hours...Twopiece
@Twopiece that sounds like just the ticket. Must be in a more recent version of ps that I'm running though, as it's not listed as an option.Seventieth
@Seventieth you are right. I have just checked it and ps v3.2.8 from debian squeeze does not support the etimes parameter, however v3.3.3 from debian wheezy does.Twopiece
D
22

For anything older than one day,

ps aux

will give you the answer, but it drops down to day-precision which might not be as useful.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   7200   308 ?        Ss   Jun22   0:02 init [5]
root         2  0.0  0.0      0     0 ?        S    Jun22   0:02 [migration/0]
root         3  0.0  0.0      0     0 ?        SN   Jun22   0:18 [ksoftirqd/0]
root         4  0.0  0.0      0     0 ?        S    Jun22   0:00 [watchdog/0]

In this example, you can only see that process 1 has been running since June 22, with no indication of the time it was started. If you're on Linux or another system with the /proc filesystem,

stat /proc/<pid>

will give you a more precise answer. For example, here's an exact timestamp for process 1, which ps shows only as Jun22:

ohm ~$ stat /proc/1
  File: `/proc/1'
  Size: 0               Blocks: 0          IO Block: 4096   directory
Device: 3h/3d   Inode: 65538       Links: 5
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2008-06-22 15:37:44.347627750 -0700
Modify: 2008-06-22 15:37:44.347627750 -0700
Change: 2008-06-22 15:37:44.347627750 -0700
Dominic answered 8/8, 2008 at 16:56 Comment(4)
It seems that ps and stat is showing different results for me. ps shows that the process started 1 day ago and stat shows that started today. Why?Catawba
Note that the TIME column in ps output does not show the actual run time of a process. It shows the accumulated CPU time of the process - the time the CPUs did work with the process.Muna
thanks, stat /proc/<pid> gave me exact result for process lifetime (startup time)Valenevalenka
I'll also note in passing that if the process was started in the previous year, the START column lists just the year. Not very much precision. at all. In addition to this, when I stat'd the proc/id of a long-running tmux session from last year, it reported back a date from this year. Solution I went with: ps -eo pid,etime | grep $PIDJhelum
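Since several commenters saw the timestamps on the /proc/<pid> directory disagree with the real start time, another option on Linux is to read the start time the kernel records in /proc/<pid>/stat. A rough sketch, assuming the Linux layout where field 22 of that file is the start time in clock ticks since boot and /proc/uptime gives seconds since boot; the function name proc_age_seconds is made up for illustration:

```shell
# Print the age of process $1 in whole seconds (Linux /proc only).
# proc_age_seconds is a hypothetical helper name.
proc_age_seconds() {
  pid=$1
  ticks_per_sec=$(getconf CLK_TCK)
  # Drop everything up to the last ") " so a command name containing
  # spaces cannot shift the fields; starttime (field 22 of the full
  # line) is then the 20th remaining field.
  start_ticks=$(sed 's/.*) //' "/proc/$pid/stat" | awk '{print $20}')
  uptime_secs=$(awk '{print int($1)}' /proc/uptime)
  echo $(( uptime_secs - start_ticks / ticks_per_sec ))
}
```

For example, proc_age_seconds 1 would print how long process 1 has been running. The result is rounded down to whole seconds.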
V
9

In this way you can obtain the list of the ten oldest processes:

ps -elf | sort -r -k12 | head -n 10
Valaree answered 8/8, 2008 at 17:20 Comment(1)
Actually that gives you the 10 latest processes because by default it shows STIME which is the time the program started. If it was showing ETIME which is the elapsed time since the program started then this would be correct.Tonitonia
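Following that comment, sorting on elapsed time rather than STIME should give the oldest processes first. A small sketch, assuming a ps that supports the etimes field (elapsed seconds); the function name oldest_first is made up for illustration:

```shell
# Read "pid etimes command..." lines on stdin and print the N oldest
# (largest elapsed seconds first). N defaults to 10.
# oldest_first is a hypothetical helper name.
oldest_first() {
  sort -k2,2nr | head -n "${1:-10}"
}
```

It could be used as ps -eo pid=,etimes=,args= | oldest_first 10, where the trailing = signs suppress the header line so it is not sorted along with the data.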
I
8

Jodie C and others have pointed out that killall -i can be used, which is fine if you want to use the process name to kill. But if you want to kill by the same parameters as pgrep -f, you need to use something like the following, using pure bash and the /proc filesystem.

#!/bin/bash

max_age=120 # (seconds)
# pgrep -f can match more than one process, so handle each PID in turn
for naughty in $(pgrep -f offlineimap); do # naughty is running
  age_in_seconds=$(echo "$(date +%s) - $(stat -c %X /proc/$naughty)" | bc)
  if [[ "$age_in_seconds" -ge "$max_age" ]]; then # naughty is too old!
    kill -s 9 "$naughty"
  fi
done

This lets you find and kill processes older than max_age seconds using the full process name; i.e., the process named /usr/bin/python2 offlineimap can be killed by reference to "offlineimap", whereas the killall solutions presented here will only work on the string "python2".

Isidor answered 14/5, 2013 at 15:54 Comment(2)
This is what I needed. Six years later, bash 5.0.3 is complaining about the double brackets. The first double brackets work as singles, and so do the second ones with unquoting of the variables. Then it worked for me. Thanks.Scheming
I guess the shebang should be #!/bin/bash, to be able to use bash-only features like [[Myosotis
M
7

Perl's Proc::ProcessTable will do the trick: http://search.cpan.org/dist/Proc-ProcessTable/

You can install it in debian or ubuntu with sudo apt-get install libproc-processtable-perl

Here is a one-liner:

perl -MProc::ProcessTable -Mstrict -w -e 'my $anHourAgo = time-60*60; my $t = new Proc::ProcessTable;foreach my $p ( @{$t->table} ) { if ($p->start() < $anHourAgo) { print $p->pid, "\n" } }'

Or, more formatted, put this in a file called process.pl:

#!/usr/bin/perl -w
use strict;
use Proc::ProcessTable;
my $anHourAgo = time-60*60;
my $t = new Proc::ProcessTable;
foreach my $p ( @{$t->table} ) {
    if ($p->start() < $anHourAgo) {
        print $p->pid, "\n";
    }
}

then run perl process.pl

This gives you more versatility and 1-second-resolution on start time.

Melanism answered 13/8, 2010 at 7:19 Comment(0)
L
3

You can use bc to join the two commands in mob's answer and get how many seconds have elapsed since the process started:

echo `date +%s` - `stat -t /proc/<pid> | awk '{print $14}'` | bc

edit:

Out of boredom while waiting for long processes to run, this is what came out after a few minutes fiddling:

#file: sincetime
#!/bin/bash
init=`stat -t /proc/$1 | awk '{print $14}'`
curr=`date +%s`
seconds=`echo $curr - $init| bc`
name=`cat /proc/$1/cmdline`
echo $name $seconds

If you put this on your path and call it like this: sincetime <pid>

it will print the process cmdline and seconds since started. You can also put this in your path:

#file: greptime
#!/bin/bash
pidlist=`ps ax | grep -i -E $1 | grep -v grep | awk '{print $1}' | grep -v PID | xargs echo`
for pid in $pidlist; do
    sincetime $pid
done

And then if you run:

greptime <pattern>

where pattern is a string or extended regular expression, it will print out all processes matching this pattern and the seconds since they started. :)

Lexicology answered 14/6, 2012 at 23:52 Comment(0)
C
2

Do a ps -aef. This will show you the time at which the process started. Then, using the date command, find the current time. Calculate the difference between the two to find the age of the process.

Castiglione answered 24/10, 2009 at 2:49 Comment(1)
Unfortunately the time output here is difficult to parse. It can be "HH:MM" for short-running processes, or "MonDD" (possibly localized!) or even just the year for very long-running processes.Volvox
P
1

I did something similar to the accepted answer but slightly differently since I want to match based on process name and based on the bad process running for more than 100 seconds

kill $(ps -o pid,etimes -p $(pgrep bad_process) | awk '{ if (NR > 1 && $2 > 100) { print $1; }}')
Philanthropist answered 11/8, 2011 at 3:15 Comment(1)
Shouldn't $RN be $NR?Misspell
A
1

stat -t /proc/<pid> | awk '{print $14}'

to get the start time of the process in seconds since the epoch. Compare with current time (date +%s) to get the current age of the process.

Ambulant answered 16/2, 2012 at 17:39 Comment(3)
We can join the two commands to get seconds since the process started: "echo stat -t /proc/<pid> | awk '{print $14}' - date +%s | bc"Lexicology
This won't always be right -- at least not for Linux 2.6 systems. I have a process that started at 9:49 but stat -t (and stat) show that it started at 13:14.Doggett
@dpk: sometimes you have a main process and some forks running. The main process should be 9:49, but the child process can have any more recent time. The same applies to the threads of a process.Hooey
P
0

Using ps is the right way. I've already done something similar before but don't have the source handy. Generally - ps has an option to tell it which fields to show and by which to sort. You can sort the output by running time, grep the process you want and then kill it.

HTH

Pavier answered 8/8, 2008 at 16:54 Comment(0)
L
0

In case anyone needs this in C, you can use readproc.h and libproc:

#include <sys/types.h>
#include <proc/readproc.h>
#include <proc/sysinfo.h>

/* Returns the age of the process in days, or 0.0 if it cannot be read. */
float
pid_age(pid_t pid)
{
        proc_t proc_info;
        int seconds_since_boot = uptime(0,0);
        if (!get_proc_stats(pid, &proc_info)) {
                return 0.0;
        }

        // readproc.h comment lies about what proc_t.start_time is. It's
        // actually expressed in Hertz ticks since boot

        // Age in seconds = uptime minus the start time converted from ticks
        long delta = seconds_since_boot - (long)(proc_info.start_time / Hertz);

        // Convert seconds to days
        return (float) delta / (float)(60*60*24);
}
Leishaleishmania answered 2/4, 2012 at 9:7 Comment(0)
H
0

I came across this somewhere and thought it was simple and useful.

You can use the command in crontab directly:

* * * * * ps -lf | grep "user" |  perl -ane '($h,$m,$s) = split /:/,$F[13]; kill 9, $F[3] if ($h > 1);'

Or we can write it as a shell script:

#!/bin/sh
# longprockill.sh
ps -lf | grep "user" |  perl -ane '($h,$m,$s) = split /:/,$F[13]; kill 9, $F[3] if ($h > 1);'

And call it from crontab like so:

* * * * * longprockill.sh
Hest answered 6/7, 2014 at 9:16 Comment(0)
C
0

My version of sincetime above by @Rafael S. Calsaverini :

#!/bin/bash
ps --no-headers -o etimes,args "$1"

This reverses the output fields: elapsed time first, full command including arguments second. This is preferred because the full command may contain spaces.

Ceremonial answered 26/8, 2018 at 4:29 Comment(0)
