How to clean up Graphite's Whisper data?

3

93

I want to delete Graphite's stored Whisper data, but there isn't anything about it in the Graphite docs.

One way I found is deleting the files at /opt/graphite...../whispers/stats... manually.

But this is tedious. Is there a better way to do it?

Interference answered 6/3, 2012 at 15:53 Comment(1)
In case they appear again after you delete them, check this other question: #15502177 - Fritzsche
77

Currently, deleting files from /opt/graphite/storage/whisper/ is the correct way to clean up whisper data.

As for the tedious side of the process, you could use the find command if the files you're trying to remove match a certain pattern.

find /opt/graphite/storage/whisper -name loadavg.wsp -delete
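Because -delete is irreversible, it may be worth listing the matches before removing them. The following is a minimal sketch of that idea, not part of Graphite itself: the function name delete_metric, its "dry"/"force" argument, and the WHISPER_ROOT variable are all hypothetical names chosen for illustration. It assumes a find that supports -delete (GNU and BSD find do).

```shell
#!/bin/bash
# Hypothetical helper: list, and optionally delete, every Whisper file
# with a given name. WHISPER_ROOT and delete_metric are illustrative names.
WHISPER_ROOT="${WHISPER_ROOT:-/opt/graphite/storage/whisper}"

delete_metric() {
  local name="$1"          # file name pattern, e.g. loadavg.wsp
  local mode="${2:-dry}"   # "dry" only lists matches, "force" deletes them
  if [ "$mode" = "force" ]; then
    # -print before -delete shows each file as it is removed.
    find "$WHISPER_ROOT" -type f -name "$name" -print -delete
  else
    find "$WHISPER_ROOT" -type f -name "$name" -print
  fi
}
```

Calling delete_metric loadavg.wsp prints what would be removed; delete_metric loadavg.wsp force actually removes it.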

Similar Question on answers.launchpad.net/graphite

Persian answered 30/4, 2012 at 12:58 Comment(5)
I'm using graphite + statsd. I tried this way and it works, but after a while the bucket is recreated. Any idea why and how to stop it? - Octosyllabic
How do you restart statsd? I can't find statsd in the list of processes, but I am having this problem. - Malmsey
It should be noted that after deleting the unused paths, Graphite itself does not have to be restarted. Statsd is a separate issue; go ahead and restart it, but Graphite will deal with deleted paths just fine. I thought I should clarify this because it was a stumbling block for me at some point. - Capillary
What about search_index? Should it also be deleted or truncated? - Claqueur
Is there no way to see whether all the data has expired (e.g. maxRetention has passed since the last update), so that old expired .wsp files can be removed? - Laktasic
51

I suppose that this is going into Server Fault territory, but I added the following cron job to delete old metrics of ours that haven't been written to for over 30 days (e.g. those of cloud instances that have been disposed of):

find /mnt/graphite/storage -mtime +30 | grep -E \
"/mnt/graphite/storage/whisper/collectd/app_name/[^/]*" -o \
| uniq | xargs rm -rf

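Warning: the command above will delete directories that still contain valid data, because a single stale path under a directory is enough for the whole directory to match.

A safer approach is to delete the stale files first: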

find whisperDir -mtime +30 -type f -print0 | xargs -0 rm -f

Then delete the empty directories:

find whisperDir -type d -empty | xargs rmdir

Repeat this last step until no empty directories remain, because removing a directory can leave its parent empty.
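The two steps above could also be wrapped in one function. This is only a sketch under stated assumptions: prune_whisper, its arguments, and the 30-day default are hypothetical, and it relies on the -delete and -mindepth extensions found in GNU and BSD find. Because -delete implies depth-first traversal, nested empty directories are cleared in a single pass.

```shell
#!/bin/bash
# Sketch: remove stale Whisper files, then prune directories left empty.
# prune_whisper, its arguments, and the 30-day default are assumptions.
prune_whisper() {
  local root="$1"
  local days="${2:-30}"
  [ -d "$root" ] || { echo "Not a directory: $root" >&2; return 1; }

  # Step 1: delete .wsp files not modified for more than $days days.
  find "$root" -type f -name '*.wsp' -mtime +"$days" -delete

  # Step 2: -delete implies -depth, so children are removed before their
  # parents are tested for emptiness. -mindepth 1 keeps the root itself.
  find "$root" -mindepth 1 -type d -empty -delete
}
```

For example, prune_whisper /opt/graphite/storage/whisper 30 would apply the same 30-day threshold as the cron job, but only to individual files.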

Boudicca answered 20/9, 2012 at 14:41 Comment(3)
On almost all modern Unix systems it should be possible to condense this using find builtins, e.g. find /opt/graphite/storage/whisper -type f -mtime +120 -name \*.wsp -delete; find /opt/graphite/storage/whisper -depth -type d -empty -delete - Dormouse
FYI, on Ubuntu the path is /var/lib/graphite/whisper - Fritzsche
Is there a reason we can't use tmpreaper to do this? - Katrinakatrine
9

As people have pointed out, removing the files is the way to go. Expanding on previous answers, I made this script that removes any file that has exceeded its max retention age. Run it as a cronjob fairly regularly.

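#!/bin/bash
# Remove Whisper files that have not been updated within their max retention.
d=$1
now=$(date +%s)

# Files younger than this many seconds are skipped without calling whisper-info.py.
MINRET=86400

if [ -z "$d" ]; then
  echo "Must specify a directory to clean" >&2
  exit 1
fi

find "$d" -name '*.wsp' | while IFS= read -r w; do
  # stat -c '%Y' (GNU coreutils) prints the last modification time in epoch seconds.
  age=$((now - $(stat -c '%Y' "$w")))
  if [ "$age" -gt "$MINRET" ]; then
    retention=$(whisper-info.py "$w" maxRetention)
    if [ "$age" -gt "$retention" ]; then
      echo "Removing $w ($age > $retention)"
      rm "$w"
    fi
  fi
done

# Remove directories left empty by the deletions above.
find "$d" -empty -type d -delete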

A couple of bits to be aware of - the whisper-info call is quite heavyweight. To reduce the number of calls to it I've put the MINRET constant in, so that no file will be considered for deletion until it is 1 day old (24*60*60 seconds) - adjust to fit your needs. There are probably other things that can be done to shard the job or generally improve its efficiency, but I haven't had need to as yet.
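The expiry decision in the script can be isolated into a small function, which makes it easier to check on its own. The sketch below is an assumption-laden illustration, not part of the original script: is_expired is a hypothetical name, and the guard that treats a missing or non-numeric retention as "not expired" (for instance when the whisper-info.py call fails) is an added safety choice.

```shell
#!/bin/bash
# Hypothetical helper isolating the expiry test used in the script above.
# Returns 0 (expired) only when age exceeds both the floor and the retention.
is_expired() {
  local age="$1" retention="$2" minret="${3:-86400}"
  # Treat a missing or non-numeric retention as "not expired" so that
  # nothing is deleted by mistake.
  case "$retention" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$age" -gt "$minret" ] && [ "$age" -gt "$retention" ]
}
```

Inside the loop it could be used as: if is_expired "$age" "$retention" "$MINRET"; then rm "$w"; fi.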

Pillow answered 15/3, 2016 at 15:1 Comment(2)
nit: "Must specify a directory to clean" is an error message. As such, it should be written to the correct place: echo "Must ..." >&2. - Aeciospore
This is great, tyvm! - Noreen
