Introduction
The accepted answer from matchew is wrong, as Antoine's comment points out, because awk will do alphanumeric comparisons. So if your logfile lists events spanning the end of one month and the beginning of the next:
[27/Feb/2023:00:00:00
[28/Feb/2023:00:00:00
[01/Mar/2023:00:00:00
awk will consider:
[01/Mar/2023:00:00:00
< [27/Feb/2023:00:00:00
< [28/Feb/2023:00:00:00
Which is wrong! You have to parse the date strings and compare them as dates!!
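To make the difference concrete, here is a minimal sketch that compares the same two timestamps once as strings and once as dates. The helper name to_epoch is my own illustration and not part of the scripts in this answer, and it assumes GNU date.

```bash
#!/bin/bash
# Hypothetical helper (illustration only): turn an Apache-style stamp such as
# 27/Feb/2023:00:00:00 into epoch seconds. Assumes GNU date.
to_epoch() {
    # "27/Feb/2023:00:00:00" -> "27-Feb-2023 00:00:00", a form GNU date parses
    date -ud "$(printf '%s\n' "$1" | sed 's/:/ /;y|/|-|')" +%s
}

a='01/Mar/2023:00:00:00'
b='27/Feb/2023:00:00:00'

# String comparison, as awk does it: "0" sorts before "2", so this prints "yes"
awk -v a="$a" -v b="$b" 'BEGIN { print ((a < b) ? "yes" : "no") }'

# Numeric comparison of epoch seconds: 1 March is later than 27 February
if [ "$(to_epoch "$a")" -gt "$(to_epoch "$b")" ]; then
    echo "$a is later than $b"
fi
```

The string test answers "yes" to "is 1 March before 27 February?", while the epoch test puts the two dates in the right order.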
For this, you could use a date library suited to the language you use.
I will present two different ways here: one using perl with the Date::Parse library, and another (quicker) one using bash with GNU date.
As this is a common perl task,
and because this is not exactly the same as "extract last 10 minutes from logfile", which is about a span of time up to the end of the logfile,
and because I needed it myself, I (quickly) wrote this:
#!/usr/bin/perl -ws
# This script parses logfiles for a specific period of time

sub usage {
    printf "Usage: %s -s=<start time> [-e=<end time>] <logfile>\n", $0;
    die $_[0] if $_[0];
    exit 0;
}

use Date::Parse;

usage "No start time submitted" unless $s;
my $startim=str2time($s) or die "Can't parse start time '$s'\n";
# End time defaults to now when -e is not given
my $endtim=time();
$endtim=str2time($e) or die "Can't parse end time '$e'\n" if $e;

usage "Logfile not submitted" unless $ARGV[0];
open my $in, "<", $ARGV[0] or usage "Can't open '$ARGV[0]' for reading";
$_=<$in>;
exit unless $_; # empty file

# Determining regular expression, depending on log format
my $logre=qr{^(\S{3}\s+\d{1,2}\s+(\d{2}:){2}\d+)};
$logre=qr{^[^\[]*\[(\d+/\S+/(\d+:){3}\d+\s\+\d+)\]} unless /$logre/;

# The first line was already read for format detection: process it too
do {
    /$logre/ && do {
        my $ltim=str2time($1);
        print if $endtim >= $ltim && $ltim >= $startim;
    };
} while (defined($_=<$in>));
This could be used like:
./timelapsinlog.pl -s=09:18 -e=09:24 /path/to/logfile
for printing logs between 09h18 and 09h24.
./timelapsinlog.pl -s='2017/01/23 09:18:12' /path/to/logfile
for printing from January 23rd, 9h18'12" up to now.
In order to reduce perl code, I've used the -s switch to permit auto-assignment of variables from the command line: -s=09:18 will populate a variable $s which will contain 09:18. Take care not to omit the equal sign = and not to add spaces!
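The behaviour of the -s switch can be checked from the shell with a one-liner. This is only a sketch: the wrapper name show_s is hypothetical, and it assumes perl is installed.

```bash
#!/bin/bash
# Hypothetical wrapper (illustration only). The "--" ends perl's own options,
# so what follows is handed to the one-liner, where the -s switch turns
# -s=VALUE into the variable $s.
show_s() {
    perl -se 'print defined $s ? "s=$s\n" : "s is not set\n"' -- "$@"
}

show_s -s=09:18     # switch form with the equal sign: $s holds 09:18
show_s -s 09:18     # with a space: -s alone is a boolean switch, 09:18 stays in @ARGV
show_s              # no switch at all: $s is not set
```

Comparing the first two calls shows why the equal sign matters: without it, the time never reaches $s.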
Note: This holds two different kinds of regex for two different log standards. If you require different date/time format parsing, either post your own regex or post a sample of a formatted date from your logfile.
^(\S{3}\s+\d{1,2}\s+(\d{2}:){2}\d+) # ^Jan 1 01:23:45
^[^\[]*\[(\d+/\S+/(\d+:){3}\d+\s\+\d+)\] # ^... [01/Jan/2017:01:23:45 +0000]
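To check which of the two patterns a given line matches, something like the following sketch can help. The function name log_format is hypothetical, and the patterns are my ERE approximations of the two perl regexes (a literal space stands in for \s, and [^ ] for \S).

```bash
#!/bin/bash
# Hypothetical helper (illustration only): classify one log line as "syslog",
# "apache" or "unknown", using ERE approximations of the two regexes above.
log_format() {
    if printf '%s\n' "$1" | grep -Eq '^[^ ]{3} +[0-9]{1,2} +([0-9]{2}:){2}[0-9]+'; then
        echo syslog
    elif printf '%s\n' "$1" | grep -Eq '^[^[]*\[[0-9]+/[^ ]+/([0-9]+:){3}[0-9]+ \+[0-9]+\]'; then
        echo apache
    else
        echo unknown
    fi
}

log_format 'Jan  1 01:23:45 myhost sshd[42]: session opened'
log_format '127.0.0.1 - - [01/Jan/2017:01:23:45 +0000] "GET / HTTP/1.1" 200 5'
log_format 'no timestamp here'
```

A line reported as unknown is one for which a new regex would have to be added.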
Quicker** bash version:
In answer to Gilles Quénot's comment, I've tried to create a bash version.
As this version seems quicker than the perl version, you may find a full version of grepByDates.sh with comments on my website (not on gith...); I post a shorter version here:
#!/bin/bash

prog=${0##*/}
usage() {
    cat <<EOUsage
Usage: $prog <start date> <end date> <logfile>
  All arguments are required. End date could be 'now'.
EOUsage
}
die() {
    echo >&2 "ERROR $prog: $*"
    exit 1
}

(($#==3)) || { usage; die 'Wrong number of arguments.';}
[[ -f $3 ]] || die "File not found."

# Conversion of both date arguments to EPOCHSECONDS with a single `date` call
{
    read -r start
    read -r end
} < <(
    date -f - +%s <<<"$1"$'\n'"$2"
)
[[ $start && $end ]] || die 'Could not parse start or end date.'

# Determining which kind of log format, between "apache logs" and "system logs":
read -r oline <"$3" # read one log line
if [[ $oline =~ ^[^\ ]{3}\ +[0-9]{1,2}\ +([0-9]{2}:){2}[0-9]+ ]]; then
    # Looks like syslog format
    sedcmd='s/^\([^ ]\{3\} \+[0-9]\{1,2\} \+\([0-9]\{2\}:\)\{2\}[0-9]\+\).*/\1/'
elif [[ $oline =~ ^[^\[]+\[[0-9]+/[^\ ]+/([0-9]+:){3}[0-9]+\ \+[0-9]+\] ]]; then
    # Looks like apache logs
    sedcmd='s/^[0-9.]\+ \+[^ ]\+ \+[^ ]\+ \[\([^]]\+\)\].*$/\1/;s/:/ /;y|/|-|'
else
    die 'Log format not recognized'
fi

# Print lines beginning with `1<tabulation>`
sed -ne s/^1\\o11//p <(
    # paste `bc` tests with log file
    paste <(
        # bc compares the EPOCHSECONDS returned by date against $start and $end
        bc < <(
            # Create a bc function for testing against $start and $end.
            cat <<EOInitBc
define void f(x) {
    if ((x>$start) && (x<$end)) { 1;return ;};
    0;}
EOInitBc
            # Run sed to extract date strings from logfile, then
            # run date to convert string to EPOCHSECONDS
            sed "$sedcmd" <"$3" |
                date -f - +'f(%s)'
        )
    ) "$3"
)
Explanation
- The script runs sed to extract date strings from the logfile.
- It passes the date strings to date -f - +%s to convert all strings to EPOCH (Unix timestamp) in one run.
- It runs bc for the tests: print 1 if start < date < end, or else print 0.
- It runs paste to merge the bc output with the logfile.
- Finally it runs sed to find lines that begin with 1<tab>, replace the match with nothing, then print.
So this script will fork 5 subprocesses, each a specialised tool doing a dedicated job, but won't run a shell loop over each line of the logfile!
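Two of these steps are easy to try in isolation. The sketch below uses hypothetical helper names (to_epochs, keep_flagged), feeds the 1/0 flags by hand instead of running bc, and assumes GNU date.

```bash
#!/bin/bash
# Hypothetical helpers (illustration only) for single steps of the pipeline.

# Conversion step: one `date` process converts every line of stdin
# to epoch seconds. Assumes GNU date.
to_epochs() {
    date -f - +%s
}

# Merge and filter steps: stdin holds one flag (1 or 0) per log line; paste
# glues each flag to its line, then sed keeps only the lines flagged 1 and
# strips the flag and the tab.
keep_flagged() {
    tab=$(printf '\t')
    paste - "$1" | sed -n "s/^1$tab//p"
}

log=$(mktemp)
printf '%s\n' 'first line' 'second line' 'third line' >"$log"

# Two date strings in, two epoch values out, from a single `date` process
printf '%s\n' '1970-01-01 00:00:10 +0000' '1970-01-02 00:00:00 +0000' | to_epochs

# Flags 0 1 0: prints only "second line"
printf '%s\n' 0 1 0 | keep_flagged "$log"
rm -f "$log"
```

Whatever the size of the logfile, the number of processes stays the same; only the amount of data flowing through the pipes grows.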
** Note:
Of course, this is quicker on my host because I run on a multicore processor, so each task runs in parallel!!
Conclusion:
This is not a program! This is an aggregation script!
If you consider bash not as a programming language, but as a super language or a tools aggregator, you can harness the full power of all your tools!!