Use tee (or equivalent) but limit max file size or rotate to new file
I would like to capture output from a UNIX process but limit max file size and/or rotate to a new file.

I have seen logrotate, but it does not work in real time. As I understand it, it is a "clean-up" job that runs in parallel.

What is the right solution? I guess I will write a tiny script to do it, but I was hoping there was a simple way with existing text tools.

Imagine:

my_program | tee --max-bytes 100000 log/my_program_log

Would give... Always writing latest log file as: log/my_program_log

Then, as it fills... renamed to log/my_program_log000001 and start a new log/my_program_log.
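
For illustration, the behavior imagined above could be sketched as a small bash function. This is only a sketch under assumptions: rotate_log is a hypothetical helper name (not an existing tool), it counts characters per line rather than exact bytes, and it keeps lines whole, so a file can exceed the limit by part of one line.

```shell
#!/bin/bash
# Sketch only: "rotate_log" is a hypothetical helper, not an existing tool.
# Usage: my_program | rotate_log log/my_program_log 100000
rotate_log() {
  local log=$1 max_bytes=$2 size=0 index=0 line
  : > "$log"                                  # start with an empty current log
  while IFS= read -r line || [ -n "$line" ]; do
    printf '%s\n' "$line"                     # pass the line through, like tee
    printf '%s\n' "$line" >> "$log"
    size=$((size + ${#line} + 1))             # characters plus the newline
    if [ "$size" -ge "$max_bytes" ]; then
      index=$((index + 1))
      # Rename the full file, e.g. my_program_log000001, and start a new one
      mv -- "$log" "$(printf '%s%06d' "$log" "$index")"
      : > "$log"
      size=0
    fi
  done
}

# Small demonstration in a temporary directory with a 40-byte limit
demo_dir=$(mktemp -d)
seq 1 30 | rotate_log "$demo_dir/my_program_log" 40 > /dev/null
ls "$demo_dir"
```

The demonstration should leave the newest lines in my_program_log and the older ones in my_program_log000001. Appending and renaming line by line in the shell is slow, so this is more a statement of the desired behavior than a production tool.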

Elevate answered 15/7, 2011 at 14:28 Comment(0)
O
34

Use split:

my_program | tee >(split -d -b 100000 -)

Or if you don't want to see the output, you can directly pipe to split:

my_program | split -d -b 100000 -

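As for the log rotation, there's no tool in coreutils that does it automatically. You could create a symlink and periodically update it using a bash command. Note that ln takes the file to point at first and the link name second, and that split's default output names start with x, so the glob below picks the newest chunk without matching the symlink itself:

while true; do ln -fns "$(ls -t x[0-9]* | head -1)" target_log_name; sleep 1; done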
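
As a rough illustration of how split names its output, here is a minimal sketch that rotates after a fixed number of whole lines and uses a readable prefix and suffix. The function name chunk_stream and the debug. prefix are made up for the example, and --additional-suffix assumes GNU split.

```shell
#!/bin/bash
# Sketch: split a stream into numbered chunk files under a chosen prefix.
# "chunk_stream" is an illustrative name, not an existing tool.
chunk_stream() {
  local prefix=$1 lines=$2
  # -d: numeric suffixes (00, 01, ...); -l: start a new file after N whole lines
  # --additional-suffix is specific to GNU split
  split -d -l "$lines" --additional-suffix=.log - "$prefix"
}

# Demonstration: 25 lines in chunks of 10 lines
demo_dir=$(mktemp -d)
seq 1 25 | chunk_stream "$demo_dir/debug." 10
ls "$demo_dir"    # expect debug.00.log, debug.01.log and debug.02.log
```

To keep seeing the output on the terminal as well, the same function can be used inside process substitution, for example my_program | tee >(chunk_stream debug. 1000).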
Orozco answered 3/9, 2011 at 7:31 Comment(6)
Bah... I forgot about the >() operator in Bash (and some other shells). I use it too infrequently. Yours is the most concise answer. - Elevate
For the first solution using tee, is there a reason why I shouldn't use my_program | tee | split -d -b 100000 -? - Tellurian
For decent file naming, use for example tee >(split --additional-suffix=.log -d -b 1000000 - debug.0) - Immoralist
Brilliant way of using tee and split together. - Precontract
@asafc: tee reads stdin and writes two copies, one to a file and the other to tee's stdout. It is the file output that needs to be split/rotated. Your suggestion will consume the stdout copy that's meant to display on a terminal, thus defeating the whole reason for using tee. - Uther
Also, if you're working on line-based text content and don't want to break up lines in output files, just use -l <lines> instead of -b <bytes>. - Insouciant
C
9

The apache2-utils package includes a utility called rotatelogs, which fully meets your requirements.

Synopsis:

rotatelogs [ -l ] [ -L linkname ] [ -p program ] [ -f ] [ -t ] [ -v ] [ -e ] [ -c ] [ -n number-of-files ] logfile rotationtime|filesize(B|K|M|G) [ offset ]

Example:

your_program | rotatelogs -n 5 /var/log/logfile 1M

You can read the full manual at this link.

Campobello answered 1/7, 2017 at 13:56 Comment(1)
With -n 5, you would start writing to the first file each time the program starts. If you set -n 1 instead, it would write to the same file each time, giving us continuous logs. - Twofaced
H
5

Or use awk:

program | awk 'BEGIN{max=100} {n+=length($0); print $0 > ("log." int(n/max))}'

It keeps lines together, so the max is not exact, but this could be nice especially for logging purposes. You can use awk's sprintf to format the file name.

Here's a pipable script using awk:

#!/bin/bash
maxb=$((1024*1024))    # default 1MiB
out="log"              # output file name
width=3                # width: log.000, log.001
while getopts "b:o:w:" opt; do
  case $opt in
    b ) maxb=$OPTARG;;
    o ) out="$OPTARG";;
    w ) width=$OPTARG;;
    * ) echo "Unimplemented option." >&2; exit 1;;
  esac
done
shift $((OPTIND-1))

# cat reads the named file, or stdin when no file is given,
# and passes lines through unchanged (leading whitespace included)
cat -- "$@" | awk -v b="$maxb" -v o="$out" -v w="$width" '{
    n += length($0)
    f = sprintf("%s.%0" w "d", o, int(n/b))   # zero-padded name, e.g. log.000
    if (f != prev) { if (prev != "") close(prev); prev = f }
    print $0 > f }'

Save this to a file called 'bee' somewhere on your PATH, run 'chmod +x bee', and you can use it as

program | bee

or to split an existing file as

bee -b1000 -o proglog -w8 file
Hagiolatry answered 26/1, 2012 at 17:44 Comment(1)
I agree with your comment: "It keeps lines together, so the max is not exact, but this could be nice especially for logging purposes." - Elevate
A
5

To limit the size to 100 bytes, you can simply use dd:

my_program | dd bs=1 count=100 > log

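Once 100 bytes have been written, dd exits and closes its end of the pipe. On its next write, my_program receives SIGPIPE (or, if it ignores that signal, the write fails with EPIPE).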
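
A minimal sketch of how to observe this, assuming bash (PIPESTATUS is a bash feature). Here yes stands in for my_program and the log is a temporary file; both are placeholders for the example.

```shell
#!/bin/bash
# Sketch: cap a stream at 100 bytes and inspect how the producer ended.
# "yes" is only a stand-in for my_program.
log=$(mktemp)
yes | dd bs=1 count=100 2>/dev/null > "$log"
producer_status=${PIPESTATUS[0]}   # must be read right after the pipeline

kept_bytes=$(( $(wc -c < "$log") ))
echo "bytes kept: $kept_bytes"
# In bash a status above 128 means "killed by signal (status - 128)";
# 141 would correspond to SIGPIPE (signal 13).
echo "producer exit status: $producer_status"
```

If the producer ignores SIGPIPE, it sees a failed write instead and its exit status depends on how it handles that error.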

Acanthoid answered 7/5, 2012 at 17:8 Comment(0)
D
2

The most straightforward way to solve this is probably to use Python and its logging module, which was designed for this purpose. Create a script that reads from stdin, writes to stdout, and implements the log rotation described below.

The "logging" module provides the

class logging.handlers.RotatingFileHandler(filename, mode='a', maxBytes=0,
              backupCount=0, encoding=None, delay=0)

which does exactly what you are asking about.

You can use the maxBytes and backupCount values to allow the file to rollover at a predetermined size.

From docs.python.org

Sometimes you want to let a log file grow to a certain size, then open a new file and log to that. You may want to keep a certain number of these files, and when that many files have been created, rotate the files so that the number of files and the size of the files both remain bounded. For this usage pattern, the logging package provides a RotatingFileHandler:

import glob
import logging
import logging.handlers

LOG_FILENAME = 'logging_rotatingfile_example.out'

# Set up a specific logger with our desired output level
my_logger = logging.getLogger('MyLogger')
my_logger.setLevel(logging.DEBUG)

# Add the log message handler to the logger
handler = logging.handlers.RotatingFileHandler(
              LOG_FILENAME, maxBytes=20, backupCount=5)

my_logger.addHandler(handler)

# Log some messages
for i in range(20):
    my_logger.debug('i = %d' % i)

# See what files are created
logfiles = glob.glob('%s*' % LOG_FILENAME)

for filename in logfiles:
    print(filename)

The result should be 6 separate files, each with part of the log history for the application:

logging_rotatingfile_example.out
logging_rotatingfile_example.out.1
logging_rotatingfile_example.out.2
logging_rotatingfile_example.out.3
logging_rotatingfile_example.out.4
logging_rotatingfile_example.out.5

The most current file is always logging_rotatingfile_example.out, and each time it reaches the size limit it is renamed with the suffix .1. Each of the existing backup files is renamed to increment the suffix (.1 becomes .2, etc.) and the .6 file is erased.

Obviously this example sets the log length much much too small as an extreme example. You would want to set maxBytes to an appropriate value.
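
For readers who want the same rename scheme without Python, the cascade described above (.1 becomes .2, the oldest backup is dropped) can be approximated in bash. This is a sketch, not how RotatingFileHandler is implemented; rotate_backups and the file names are hypothetical.

```shell
#!/bin/bash
# Sketch only: "rotate_backups" is a hypothetical helper name.
# Usage: rotate_backups FILE KEEP   (keeps FILE.1 .. FILE.KEEP)
rotate_backups() {
  local file=$1 keep=$2 i
  rm -f -- "$file.$keep"                      # drop the oldest backup
  for ((i = keep - 1; i >= 1; i--)); do       # shift .1 -> .2, .2 -> .3, ...
    [ -e "$file.$i" ] && mv -- "$file.$i" "$file.$((i + 1))"
  done
  [ -e "$file" ] && mv -- "$file" "$file.1"   # current log becomes .1
  : > "$file"                                 # start a fresh current log
}

# Demonstration: four rotations while keeping two backups
demo_dir=$(mktemp -d)
log="$demo_dir/app.out"
for run in 1 2 3 4; do
  echo "run $run" >> "$log"
  rotate_backups "$log" 2
done
ls "$demo_dir"     # expect app.out, app.out.1 and app.out.2
cat "$log.1"       # run 4
```

A writer that holds the log open would keep writing to the renamed file, so this only works when the file is reopened after each rotation (for example by appending with >> per write).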

Duello answered 15/7, 2011 at 18:41 Comment(1)
I am confused. My program is not Python. How does this help me? I want to use standard GNU coreutils: awk/tee/split/etc. - Elevate
H
1

Another solution is to use the Apache rotatelogs utility.

Or the following script:

#!/bin/ksh
#rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]
numberOfFiles=10
while getopts "n:fltvecp:L:" opt; do
    case $opt in
  n) numberOfFiles="$OPTARG"
    if ! printf '%s\n' "$numberOfFiles" | grep '^[0-9][0-9]*$' >/dev/null;     then
      printf 'Numeric numberOfFiles required %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$numberOfFiles" 1>&2
      exit 1
    elif [ $numberOfFiles -lt 3 ]; then
      printf 'numberOfFiles < 3 %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$numberOfFiles" 1>&2
    fi
  ;;
  *) printf '-%s ignored. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$opt" 1>&2
  ;;
  esac
done
shift $(( $OPTIND - 1 ))
pathToLog="$1"
fileSize="$2"
if ! printf '%s\n' "$fileSize" | grep '^[0-9][0-9]*[BKMG]$' >/dev/null; then
  printf 'Numeric fileSize followed by B|K|M|G required %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$fileSize" 1>&2
  exit 1
fi
sizeQualifier=`printf "%s\n" "$fileSize" | sed "s%^[0-9][0-9]*\([BKMG]\)$%\1%"`
multip=1
case $sizeQualifier in
B) multip=1 ;;
K) multip=1024 ;;
M) multip=1048576 ;;
G) multip=1073741824 ;;
esac
fileSize=`printf "%s\n" "$fileSize" | sed "s%^\([0-9][0-9]*\)[BKMG]$%\1%"`
fileSize=$(( $fileSize * $multip ))
fileSize=$(( $fileSize / 1024 ))
if [ $fileSize -le 10 ]; then
  printf 'fileSize %sKB < 10KB. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$fileSize" 1>&2
  exit 1
fi
if ! touch "$pathToLog"; then
  printf 'Could not write to log file %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$pathToLog" 1>&2
  exit 1
fi
lineCnt=0
while IFS= read -r line
do
  printf "%s\n" "$line" >>"$pathToLog"
  lineCnt=$(( $lineCnt + 1 ))
  if [ $lineCnt -gt 200 ]; then
    lineCnt=0
    curFileSize=`du -k "$pathToLog" | cut -f1`
    if [ $curFileSize -gt $fileSize ]; then
      DATE=`date +%Y%m%d_%H%M%S`
      cat "$pathToLog" | gzip -c >"${pathToLog}.${DATE}".gz && cat /dev/null >"$pathToLog"
      curNumberOfFiles=`ls "$pathToLog".[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9].gz | wc -l | sed -e 's/^[   ][  ]*//' -e 's%[   ][  ]*$%%' -e 's/[  ][  ]*/[    ]/g'`
      while [ $curNumberOfFiles -ge $numberOfFiles ]; do
        fileToRemove=`ls "$pathToLog".[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9].gz | head -1`
        if [ -f "$fileToRemove" ]; then
          rm -f "$fileToRemove"
          curNumberOfFiles=`ls "$pathToLog".[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9].gz | wc -l | sed -e 's/^[   ][  ]*//' -e 's%[   ][  ]*$%%' -e 's/[  ][  ]*/[    ]/g'`
        else
          break
        fi
      done
    fi
  fi
done
Hardiness answered 1/6, 2016 at 18:36 Comment(2)
The script was not love at first sight, but having to install Apache just to get rotatelogs made me look at it again, and I have to say it's pretty neat! - Selfimportant
Performance: when I let a test app spam stdout and count the records, rotatelogs shows no difference from testapp > mylogfile, while the script manages to consume and write only about 40% of the records in the same time span. - Selfimportant
W
1

Limiting the max size can also be done with head:

my_program | head -c 100  # Limit to the first 100 bytes

See this for benefits over dd: https://unix.stackexchange.com/a/121888/
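
One commonly cited difference is that dd with a block size larger than 1 can stop early on a pipe, because each block is a single read that may return fewer bytes than requested, whereas head -c keeps reading until it has the requested amount. A small sketch of that effect follows; slow_producer is a made-up stand-in for a program that writes in bursts.

```shell
#!/bin/bash
# Sketch: a producer that writes 4 bytes, pauses, then writes 4 more.
# "slow_producer" is a hypothetical stand-in for a bursty program.
slow_producer() { printf 'aaaa'; sleep 1; printf 'bbbb'; }

# dd performs one read of up to 8 bytes and may get only the first burst.
dd_bytes=$(( $(slow_producer | dd bs=8 count=1 2>/dev/null | wc -c) ))

# head -c keeps reading until it has 8 bytes or reaches end of input.
head_bytes=$(( $(slow_producer | head -c 8 | wc -c) ))

echo "dd: $dd_bytes bytes, head: $head_bytes bytes"
```

GNU dd accepts iflag=fullblock to make it keep reading until each block is full, but that flag is not available in every dd implementation.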

Worlock answered 12/7, 2020 at 0:27 Comment(0)
