Understanding sincedb files from Logstash file input

Asked 16/1, 2015 at 13:28 Answered 24/10, 2016 at 17:57

When using the file input with Logstash, a sincedb file is written in order to keep track of the current position of monitored log files. How should its contents be read?

Example of a sincedb file:

 286105 0 19 20678374

Sylas answered 16/1, 2015 at 13:28 Comment(0)

There are 4 fields (source):

inode
major device number
minor device number
byte offset
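To make the field order concrete, here is a minimal sketch that splits one sincedb line into the four fields listed above. The function name parse_sincedb_line is made up for illustration, and the sketch assumes the plain four-column layout from the question's example.

```shell
# Split one sincedb line into its four whitespace-separated fields.
# parse_sincedb_line is a hypothetical helper name, not part of Logstash.
parse_sincedb_line() {
    # Unquoted on purpose: word splitting separates the fields and
    # drops the leading space seen in the example line.
    set -- $1
    printf 'inode=%s major=%s minor=%s offset=%s\n' "$1" "$2" "$3" "$4"
}

parse_sincedb_line " 286105 0 19 20678374"
# prints: inode=286105 major=0 minor=19 offset=20678374
```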

Imagine a hard disk segmented into thousands of very tiny parts, each with its own number: the inode would be more or less the number of the tiny part where the file begins. A given inode number is therefore only unique within one filesystem. To cover servers with several disks, the major and minor device numbers are also required in order to guarantee the uniqueness of the triplet {inode, major device number, minor device number}. More accurate info about inodes on Wikipedia.
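As a rough illustration of where the two device numbers come from, the sketch below splits a combined device number (st_dev) into its major and minor parts. It assumes the bit layout used by glibc's major() and minor() macros on Linux, and other platforms encode device numbers differently. The function name dev_major_minor is made up for this example.

```shell
# Split a combined device number (st_dev) into major and minor parts.
# Assumption: Linux/glibc bit layout; not valid on other platforms.
dev_major_minor() {
    dev=$1
    major=$(( ((dev >> 8) & 0xfff) | ((dev >> 32) & ~0xfff) ))
    minor=$(( (dev & 0xff) | ((dev >> 12) & ~0xff) ))
    echo "$major $minor"
}

dev_major_minor 2049   # prints "8 1"
dev_major_minor 19     # prints "0 19"
```

With GNU coreutils, `stat -c %d somefile` prints the combined device number in decimal and `ls -i somefile` prints the inode, so the triplet for a given file can be compared by hand against a sincedb line. A combined value of 19 would correspond to the "0 19" pair in the example above.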

That said, I am not so sure that (for example) files mounted through NFS could not collide with local files, since the inode of a file mounted through NFS seems to be the remote one. I don't think the plugin writer bothered about such cases, but despite using NFS myself, I have never run into any trouble so far. I also suspect the collision probability to be very tiny.

Now, with the triplet formed by the inode and the major and minor device numbers, we have a way of targeting, without error, the single log file that is being read by the plugin (or at least that was the original intent). The last number, the byte offset, keeps track of how far the input log file has already been read and passed on to Logstash.
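One practical use of the byte offset is to check how far behind the reader is. The sketch below subtracts a recorded offset from the current file size. The function name unread_bytes is made up for illustration, and the log path and offset are passed in by hand rather than looked up automatically.

```shell
# Print how many bytes of a log file lie beyond a recorded sincedb offset.
# unread_bytes is a hypothetical helper name, not part of Logstash.
unread_bytes() {
    file=$1
    offset=$2
    # wc -c reports the current size of the file in bytes.
    size=$(wc -c < "$file")
    echo $((size - offset))
}
```

For the example line above, one would call it as `unread_bytes /path/to/the.log 20678374`. A result of 0 means everything has been read. A negative result means the file is now smaller than the recorded offset, which can happen after truncation.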

On some platforms, such as Solaris or Windows, there have been bugs where Ruby wrongly reported the inode number as 0. This could, for example, lead to Logstash not detecting a file rotation.
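A quick way to see whether a sincedb file might be affected is to look for entries whose inode field is 0. This is only a sketch: zero_inode_entries is a made-up name, and it assumes the inode is the first column, as in the example above.

```shell
# Print sincedb entries whose first field (the inode) is 0.
# zero_inode_entries is a hypothetical helper name.
zero_inode_entries() {
    awk '$1 == 0' "$1"
}
```

Any line it prints has an inode of 0, which would match the symptom described above.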

Sylas answered 16/1, 2015 at 13:28 Comment(2)

Why not describe all 4 fields in your self-answer, and maybe with more authority? – Singlefoot 16/1, 2015 at 15:3

I am merely giving rare info that was already hard to gather. My purpose was not to begin a wikipedia article. Anyway you made me realize that "byte offset" was not obvious for everybody, so I added some more info. – Sylas 16/1, 2015 at 15:53

This was super helpful. I wanted to map all my SinceDB files to the logstash inputs, so I put together a little bash two-liner to print this mapping.

filesystems=$(grep path /etc/logstash/conf.d/*.conf | awk -F'=>' '{ print $2 }' | xargs -I {} df -P {} 2>/dev/null | grep -v Filesystem | sort | uniq | cut -d' ' -f 1)
for fs in $filesystems; do for f in $(ls -a .sincedb_*); do echo $f; inodes=$(cut -d' ' -f 1 $f); for inode in $inodes; do sudo debugfs -R "ncheck $inode" $fs 2>/dev/null | grep -v Inode | cut -f 2; done; echo; done; done

I just documented the details about mapping SinceDB files to logstash input.

Zygospore answered 24/10, 2016 at 17:57 Comment(0)
