extract last 10 minutes from logfile [duplicate]
I'm trying to find a simple way to watch for recent events (less than 10 minutes old). I've tried this:

awk "/^$(date --date="-10 min" "+%b %_d %H:%M")/{p++} p" /root/test.txt

but it doesn't work as expected...

The log files have this form:

Dec 18 09:48:54 Blah
Dec 18 09:54:47 blah bla
Dec 18 09:55:33 sds
Dec 18 09:55:38 sds
Dec 18 09:57:58 sa
Dec 18 09:58:10 And so on...
Callaghan answered 18/12, 2013 at 3:59 Comment(5)
Do you want log messages from the last 10 minutes of actual time or the last 10 minutes relative to the end of the log?Belenbelesprit
This type of stuff gets messy in plain shell -- especially since the date command you have to use for the solution varies from system to system. Can I give you an answer in Perl? If not, I'll give you one that works for BSD (I have a Mac), and you'll have to figure it out for Linux (if that's what you have).Aubyn
@DavidW. bash now offers a lot of powerful features for this kind of trick (see my answer). Anyway, perl stays my first choice for this kind of job.Purpleness
@tripleee This is not a real duplicate, as the goal is to read a span of time up to the end of the file. The other question is about an interval of time inside one log file.Purpleness
@FHauri Not having an end condition is just a simpler case of the same problem, no?Trentontrepan
E
7

You can match the date range using simple string comparison, for example:

d1=$(date --date="-10 min" "+%b %_d %H:%M")
d2=$(date "+%b %_d %H:%M")
while IFS= read -r line; do
    [[ $line > $d1 && $line < $d2 || $line =~ $d2 ]] && echo "$line"
done

For example if d1='Dec 18 10:19' and d2='Dec 18 10:27' then the output will be:

Dec 18 10:19:16
Dec 18 10:19:23
Dec 18 10:21:03
Dec 18 10:22:54
Dec 18 10:27:32

Or using awk if you wish:

awk -v d1="$d1" -v d2="$d2" '$0 > d1 && $0 < d2 || $0 ~ d2'
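One caveat with plain string comparison is that month abbreviations do not sort chronologically ("Jan" sorts after "Feb"), so a window that crosses a month boundary can give wrong results. A minimal sketch of one way around this, in Python, maps the month name to a number before comparing. The names MONTHS and sort_key are illustrative only, and the year is ignored because these timestamps do not carry one.

```python
# Hypothetical helper: build a chronologically sortable key from a
# syslog-style line such as "Dec 18 10:19:16 message".
MONTHS = {name: num for num, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

def sort_key(line):
    """Turn 'Dec 18 10:19:16 ...' into (12, 18, '10:19:16')."""
    month = MONTHS[line[0:3]]
    day = int(line[4:6])          # accepts both ' 2' and '02'
    return (month, day, line[7:15])

# Plain string comparison gets the months wrong ...
print("Jan 18 10:27:32" < "Feb 18 10:27:32")                      # False
# ... while the numeric key orders them chronologically.
print(sort_key("Jan 18 10:27:32") < sort_key("Feb 18 10:27:32"))  # True
```

The same idea (month name to number, then compare) is what the associative array in the bash answer does.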
Enterogastrone answered 18/12, 2013 at 5:45 Comment(3)
What if there are some different months in the timestamps? I think the string comparisons can break down. e.g. [[ "Jan 18 10:27:32" < "Feb 18 10:27:32" ]] && echo "older" and [[ "Feb 18 10:27:32" < "Mar 18 10:27:32" ]] && echo "older" give inconsistent results.Belenbelesprit
@DigitalTrauma True, if the time period covers a change of month, my solution won't work. Your solution is more accurate, and you have my upvote.Enterogastrone
@Enterogastrone Newer bash versions offer a builtin date function via printf "%(...)T" -1. This is a lot quicker than forking to date. To use the big-endian property of the date, you could correct the month entry by using an associative array. See my answer!Purpleness
S
13

Here is a nice tool; the range can be anything you wish, from 10 minutes ago until now:

sed -n "/^$(date --date='10 minutes ago' '+%b %_d %H:%M')/,\$p" /var/log/blaaaa
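Note that the sed range only starts printing once some line begins with exactly that minute; if nothing was logged during that minute, nothing is printed. A sketch of an alternative that compares parsed timestamps instead, written in Python for the Apache-style format `[25/Oct/2014:22:42:45 +0800]`, could look like this. The names STAMP and recent_lines are made up for the example, and it assumes English month abbreviations.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the bracketed timestamp of a common-log-format line.
STAMP = re.compile(
    r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")

def recent_lines(lines, minutes, now=None):
    """Yield lines whose timestamp is newer than `minutes` before `now`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(minutes=minutes)
    for line in lines:
        m = STAMP.search(line)
        # %z parses the "+0800" offset, so the comparison is timezone-aware.
        if m and datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S %z") > cutoff:
            yield line
```

It can be used as a filter, for example `for line in recent_lines(open("access.log"), 10): print(line, end="")`, where access.log is a placeholder file name.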
Swot answered 23/10, 2014 at 11:4 Comment(3)
its cool. Superfast even when the logs is huge. what if i want xxx.xxx.xxx.xxx - - [25/Oct/2014:22:42:45 +0800] "GET, how to edit the sed command? I tried date --date='1 minutes ago' '+[%d/%b/%Y:%H:%M' but it didn't workCallaghan
My log lines start with 2017-06-28 14:00 so I modified the sed command as follows: sed -n "/^$(date --date="60 minutes ago" "+%Y-%m-%d %H:%M")/,\$p" /var/log/blah.log to get the last 60 minutes of the logTryptophan
sed is super fast, but string comparisons can break down if the time span crosses a month boundary! This approach is fragile.Purpleness
P
9

Introduction

This answer is somewhat long, because there are 3 different approaches: 1) quick or exact, 2) pure bash and 3) script in a function.

That's a (common) job for perl!

Simple and efficient:

perl -MDate::Parse -ne 'print if/^(.{15})\s/&&str2time($1)>time-600' /path/log

This version prints the events of the last 10 minutes, up to now, by using the time function.

You could test this with:

sudo cat /var/log/syslog |
  perl -MDate::Parse -ne '
    print if /^(\S+\s+\d+\s+\d+:\d+:\d+)\s/ && str2time($1) > time-600'

Note that the first form uses only the first 15 characters of each line, while the second uses a more detailed regexp.

As a perl script: last10m.pl

#!/usr/bin/perl -wn

use strict;
use Date::Parse;
print if /^(\S+\s+\d+\s+\d+:\d+:\d+)\s/ && str2time($1) > time-600

Strictly: extract last 10 minutes from logfile

Meaning not relative to the current time, but to the last entry in the logfile:

There are two ways to retrieve the end of the period:

date -r logfile +%s
tail -n1 logfile | perl -MDate::Parse -nE 'say str2time($1) if /^(.{15})/'

Logically, the last modification time of the logfile should be the time of the last entry.

So the command could become:

perl -MDate::Parse -ne 'print if/^(.{15})\s/&&str2time($1)>'$((
    $(date -r logfile +%s) - 600 )) logfile

or you could take the last entry as reference:

perl -MDate::Parse -E 'open IN,"<".$ARGV[0];seek IN,-200,2;while (<IN>) {
    $ref=str2time($1) if /^(\S+\s+\d+\s+\d+:\d+:\d+)/;};seek IN,0,0;
    while (<IN>) {print if /^(.{15})\s/&&str2time($1)>$ref-600}' logfile

The second version looks more robust, yet it opens the file only once.

As a perl script, this could look like:

#!/usr/bin/perl -w

use strict;
use Date::Parse;
my $ref;                 # The only variable I will use in this.

open IN,"<".$ARGV[0];    # Open (READ) file submitted as 1st argument
seek IN,-200,2;          # Jump to 200 characters before end of logfile. (This
                         # may not suffice if the log file holds very long lines!)
while (<IN>) {           # Until end of logfile...
    $ref=str2time($1) if /^(\S+\s+\d+\s+\d+:\d+:\d+)/;
};                       # store time into $ref variable.
seek IN,0,0;             # Jump back to the beginning of the file
while (<IN>) {
    print if /^(.{15})\s/&&str2time($1)>$ref-600;
}
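The "seek near the end, then read the last entry" step can be sketched in Python as follows. This is only an illustration of the idea, not a translation of the perl script: last_timestamp and tail_bytes are hypothetical names, and because the timestamps carry no year, strptime fills in 1900.

```python
import os
from datetime import datetime

def last_timestamp(path, tail_bytes=200):
    """Parse the timestamp of the last line in `path`.

    Only the final `tail_bytes` bytes are read. As in the perl script,
    this is not enough if the last line is longer than that, in which
    case strptime raises ValueError.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - tail_bytes))   # never seek before the start
        tail = f.read().decode("utf-8", errors="replace")
    last = tail.splitlines()[-1]            # IndexError on an empty file
    return datetime.strptime(last[:15], "%b %d %H:%M:%S")
```

Subtracting `timedelta(minutes=10)` from the result gives the reference time that the second loop compares against.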

But if you really want to use bash

There is a very quick pure bash script:

Warning: This uses recent bashisms and requires $BASH_VERSION 4.2 or higher.

#!/bin/bash

declare -A month

for i in {1..12};do
    LANG=C printf -v var "%(%b)T" $(((i-1)*31*86400+86400)) # +1 day: stay in Jan west of UTC
    month[$var]=$i
  done

printf -v now "%(%s)T" -1
printf -v ref "%(%m%d%H%M%S)T" $((now-600))

while read line;do
    printf -v crt "%02d%02d%02d%02d%02d" ${month[${line:0:3}]} \
        $((10#${line:4:2})) $((10#${line:7:2})) $((10#${line:10:2})) \
        $((10#${line:13:2}))
    # echo " $crt < $ref ??"   # Uncomment this line to print each test
    [ $crt -gt $ref ] && break
done
cat

Store this script and run:

cat >last10min.sh
chmod +x last10min.sh
sudo cat /var/log/syslog | ./last10min.sh

Strictly: extract last 10 minutes from logfile

Simply replace line 10 (the one that sets now). The script then takes the filename as its first argument instead of working as a filter:

#!/bin/bash

declare -A month

for i in {1..12};do
    LANG=C printf -v var "%(%b)T" $(((i-1)*31*86400+86400)) # +1 day: stay in Jan west of UTC
    month[$var]=$i
  done

read now < <(date -d "$(tail -n1 "$1"|head -c 15)" +%s)
printf -v ref "%(%m%d%H%M%S)T" $((now-600))

{
    while read line;do
        printf -v crt "%02d%02d%02d%02d%02d" ${month[${line:0:3}]} \
            $((10#${line:4:2})) $((10#${line:7:2})) $((10#${line:10:2})) \
            $((10#${line:13:2}))
        [ $crt -gt $ref ] && break
    done
    cat
} <"$1"

A script into a function

As commented by ajcg, it can be nice to wrap the efficient perl script in a bash function:

recentLog(){ 
    perl -MDate::Parse -ne '
        print if/^(.{'${3:-15}'})\s/ &&
            str2time($1)>time-'$((
                60*${2:-10}
            )) ${1:-/var/log/daemon.log}
}

Usage:

recentLog [filename] [minutes] [time string length]

  • filename of the log file (default /var/log/daemon.log)
  • minutes: maximum age, counted back from now, of the lines to show (default 10)
  • time string length, counted from the beginning of each line (default 15).
Purpleness answered 18/12, 2013 at 17:59 Comment(12)
This uses the big-endian property of the date, but corrects the textual month by using a bash associative array.Purpleness
./last10min.sh: line 6: printf: (': invalid format character ./last10min.sh: line 7: month[$var]: bad array subscript ./last10min.sh: line 9: printf: (': invalid format character ./last10min.sh: line 10: printf: (': invalid format character ./last10min.sh: line 17: [: 1809485400: unary operator expected ./last10min.sh: line 17: [: 1809544700: unary operator expected ./last10min.sh: line 17: [: 1809553300: unary operator expected`Callaghan
it doesnt work.. any idea?Callaghan
as for perl method: syntax error at ./test.py line 3, near "-ne" Execution of ./test.py aborted due to compilation errors.Callaghan
I have installed perl-TimeDate to clear the error. runing cat /root/scripts/logs_scripts/test.txt | perl -MDate::Parse -ne 'print if /^(\S+\s+\d+\s+\d+:\d+:\d+)\s/ && str2time($1) > time-600 show nothingCallaghan
it seems to work if its 10 minutes of actual time, what if i want the last 10 minutes relative to the end of the logCallaghan
Sorry, I forgot to mention that the bash script requires version 4.2 or higher.Purpleness
You could retrieve log unixtime by tail -n1 logfile | perl -MDate::Parse -nE 'say str2time($1) if /^(.{15})/' or date -r logfile +%s.Purpleness
@Callaghan I've modified my answer for adding perl script ready to use.Purpleness
See https://mcmap.net/q/22334/-how-can-i-read-lines-from-the-end-of-file-in-perl for how to read the logfile from the end: can help to optimize a bit, especially if you have to parse uncommon date formats that are not supported by Date::ParseCopperas
great. a question. how can I put 600 inside a variable to run it in a bash script (I already did it and it doesn't work. e.g: perl -MDate::Parse -ne 'print if/^(.{15})\s/&&str2time($1)>time-$bantime' and bantime=600)Paramorph
I had already asked the question and it was answered at this link #62897744Paramorph
B
1

In bash, you can use the date command to parse the timestamps. The "%s" format specifier converts the given date to the number of seconds since 1970-01-01 00:00:00 UTC. This simple integer is easy and accurate to do basic arithmetic on.

If you want the log messages from the last 10 minutes of actual time:

now10=$(($(date +%s) - (10 * 60)))

while read line; do
    [ "$(date -d "${line:0:15}" +%s)" -gt "$now10" ] && printf '%s\n' "$line"
done < logfile

Note the ${line:0:15} expression is a bash parameter expansion which gives the first 15 characters of the line, i.e. the timestamp itself.
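For comparison, the same "first 15 characters to epoch seconds" conversion can be sketched in Python. The timestamps carry no year, so this sketch assumes the current year and falls back to the previous one when the result would lie in the future; to_epoch is an illustrative name, not an existing API.

```python
from datetime import datetime

def to_epoch(stamp, now=None):
    """Convert 'Dec 18 10:19:16 ...' to seconds since the epoch (local time)."""
    now = now or datetime.now()
    # Prepend the year so that strptime does not fall back to 1900.
    parsed = datetime.strptime(f"{now.year} {stamp[:15]}", "%Y %b %d %H:%M:%S")
    if parsed > now:              # e.g. a December entry read in January
        parsed = parsed.replace(year=now.year - 1)
    return parsed.timestamp()
```

With this helper, the test in the loop becomes `to_epoch(line) > datetime.now().timestamp() - 600`.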

If you want the log messages from the last 10 minutes relative to the end of the log:

$ lastline=$(tail -n1 logfile)
$ last10=$(($(date -d "${lastline:0:15}" +%s) - (10 * 60)))
$ while read -r line; do
> [ "$(date -d "${line:0:15}" +%s)" -gt "$last10" ] && printf '%s\n' "$line"
> done < logfile
Dec 18 10:19:16
Dec 18 10:19:23
Dec 18 10:21:03
Dec 18 10:22:54
Dec 18 10:27:32
$ 

Here's a mild performance enhancement over the above:

$ { while read line; do
> [ "$(date -d "${line:0:15}" +%s)" -gt "$last10" ] && printf '%s\n' "$line" && break
> done ; cat ; }  < logfile
Dec 18 10:19:16
Dec 18 10:19:23
Dec 18 10:21:03
Dec 18 10:22:54
Dec 18 10:27:32
$ 

This assumes the log entries are in strict chronological order. Once we match the timestamp in question, we exit the while loop, and then just use cat to dump the remaining entries.
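The "skip until the first match, then copy the rest untested" idea maps directly onto itertools.dropwhile. The following Python sketch is relative to the last entry of the log and, like the loop above, assumes chronological order; last_minutes and FMT are names chosen for the example.

```python
from datetime import datetime, timedelta
from itertools import dropwhile

FMT = "%b %d %H:%M:%S"   # syslog-style timestamp without a year

def last_minutes(lines, minutes=10):
    """Return the lines starting at the first one newer than the cutoff."""
    lines = list(lines)
    if not lines:
        return []

    def parse(line):
        return datetime.strptime(line[:15], FMT)

    cutoff = parse(lines[-1]) - timedelta(minutes=minutes)
    # Timestamps are parsed only until the first recent line is found;
    # everything after it is passed through unparsed (the role of
    # "break" followed by "cat" in the shell version).
    return list(dropwhile(lambda line: parse(line) <= cutoff, lines))
```

Because parsing stops at the first recent line, the cost depends on how many old entries precede the window, not on the size of the window itself.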

Belenbelesprit answered 18/12, 2013 at 4:45 Comment(17)
I wonder why it return the following: ./test.sh: line 8: [: -gt: unary operator expected date: invalid date [pid 26481 on' ./test.sh: line 8: [: -gt: unary operator expected date: invalid date [pid 17563 on' ./test.sh: line 8: [: -gt: unary operator expected Dec 18 10:19:16 Dec 18 10:19:23 Dec 18 10:21:03 Dec 18 10:22:54 Dec 18 10:27:32Callaghan
-gt: unary operator expected im on centosCallaghan
@Callaghan - presumably your version of date can't parse the passed date string correctly. What output do you get from date --date="Dec 18 10:19:16"? What output do you get from date --version? Which version of centos are you using? I just installed 6.5 in a VM and it is working for me.Belenbelesprit
[root@s]# date --version date (GNU coreutils) 8.4 [root@s]# cat /etc/redhat-release CentOS release 6.4 (Final) [root@s]# date --date="Dec 18 10:19:16" Wed Dec 18 10:19:16 SGT 2013Callaghan
seem to work now. there was a problem with the log file. Thanks buddyCallaghan
${line:0:15} +%s is awesomeCallaghan
there is a problem tho.. if the data is Dec 02 15:33:39 Dec 02 15:40:03 Dec 02 15:44:23 vv Dec 02 15:44:57 vvCallaghan
[root@s ~]# date -d "tail -n1 /root/test.txt" +%s date: invalid date `Dec 02 15:44:57 vv'Callaghan
i got it. just add line:0:15.Callaghan
This is an example of bash parameter expansion. ${line:0:15} gives the first 15 characters of $line. gnu.org/software/bash/manual/html_node/…Belenbelesprit
What if the log file has a 10000 -100000 lines? It will take long to complete right?Callaghan
Yes, if you have many lines, this is almost certainly not the most efficient way of doing this, as date gets invoked for every line. The python solution is likely more performance-optimal, so long as it doesn't have to spawn external processes for each iteration.Belenbelesprit
Also see the latest edit which includes a mild performance enhancement https://mcmap.net/q/22331/-extract-last-10-minutes-from-logfile-duplicateBelenbelesprit
meaning it will still go through from the first line of the file until we match the timestamp in question? coz the log file is live and updated all the time in less than a secondCallaghan
Your solution implies one fork per line, which is very expensive! Think about solutions using something like: paste <(sed 's/^\(.\{15\}\).*$/\1/' logfile | date -f - +%s ) logfile , and use the command time myscript.sh to evaluate the cost of your script...Purpleness
Break out of the loop, instead of testing and printing each line after the first matching one! See my modified answer!Purpleness
Yes - thats pretty good - I never noticed date can take a -f parameter.Belenbelesprit
I
0

In python, you could do as follows:

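from datetime import datetime

FMT = '%b %d %H:%M:%S'   # %H: 24-hour clock, e.g. "Dec 18 10:27:32"

astack = []
with open("x.txt") as f:
    for aline in f:
        astack.append(aline.rstrip("\n"))
# Parse only the first 15 characters, so trailing message text is ignored.
lasttime = datetime.strptime(astack[-1][:15], FMT)
for i in astack:
    if (lasttime - datetime.strptime(i[:15], FMT)).total_seconds() <= 600:
        print(i)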

Read the lines from the file into a list, take the timestamp of the last entry as the reference, and print every line whose timestamp is at most 600 seconds older than that reference.

Running on your input, I get the following:

Dec 18 10:19:16
Dec 18 10:19:23
Dec 18 10:21:03
Dec 18 10:22:54
Dec 18 10:27:32
Inn answered 18/12, 2013 at 5:1 Comment(3)
Traceback (most recent call last): File "./test.py", line 9, in <module> lasttime=datetime.strptime(astack[-1], '%b %d %H:%M:%S') File "/usr/lib64/python2.6/_strptime.py", line 328, in _strptime data_string[found.end():]) ValueError: unconverted data remains: nhhCallaghan
what id the log file contain Dec 18 10:19:16 sdsd Dec 18 10:19:23 dsds Dec 18 10:21:03 dssd Dec 18 10:22:54 sdsds Dec 18 10:27:32 sdsCallaghan
is there a way to take just the first 15 characters of the line?Callaghan
W
0

A Ruby solution (tested on ruby 1.9.3)

You can pass days, hours, minutes or seconds as a parameter, and it will search for the expression in the file specified (or in a directory, in which case it appends '/*' to the name):

In your case just call the script like so: $0 -m 10 "expression" log_file

Note: If you know the location of 'ruby', change the shebang (first line of the script) to that path, for security reasons.

#! /usr/bin/env ruby

require 'date'
require 'pathname'

if ARGV.length != 4
        $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
        exit 1
end
begin
        total_amount = Integer ARGV[1]
rescue ArgumentError
        $stderr.print "error: parameter 'time' must be an Integer\n"
        $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
        exit 1
end

if ARGV[0] == "-m"
        gap = Rational(60, 86400)
        time_str = "%b %d %H:%M"
elsif ARGV[0] == "-s"
        gap = Rational(1, 86400)
        time_str = "%b %d %H:%M:%S"
elsif ARGV[0] == "-h"
        gap = Rational(3600, 86400)
        time_str = "%b %d %H"
elsif ARGV[0] == "-d"
        time_str = "%b %d"
        gap = 1
else
        $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
        exit 1
end

pn = Pathname.new(ARGV[3])
if pn.exist?
        log = (pn.directory?) ? ARGV[3] + "/*" : ARGV[3]
else
        $stderr.print "error: file '" << ARGV[3] << "' does not exist\n"
        $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
        exit 1
end

search_str = ARGV[2]
now = DateTime.now

total_amount.times do
        now -= gap
        system "cat " << log << " | grep '" << now.strftime(time_str) << ".*" << search_str << "'"
end
Whitethroat answered 23/4, 2014 at 17:40 Comment(0)
