fuse: Setting offsets for the filler function in readdir

I am implementing a virtual filesystem using FUSE, and need some help understanding the offset parameter in readdir.

Until now we ignored the offset and passed 0 to the filler function, in which case the kernel takes care of it.

Our filesystem database stores: directory name, file length, inode number and parent inode number.

How do I calculate the offset?

Is the offset of each component equal to its size, with the components sorted in increasing order of inode number? What happens if there is a directory inside a directory: is the offset in that case equal to the sum of the files inside?

Example: in case the dir listing is - a.txt b.txt c.txt
And inode number of a.txt=3, b.txt=5, c.txt=7

Offset of a.txt= directory offset
Offset of b.txt=dir offset + size of a.txt
Offset of c.txt=dir offset + size of b.txt

Is the above assumption correct?

P.S: Here are the callbacks of fuse

Sextet answered 9/5, 2014 at 12:52 Comment(2)
I've only ever used the offset=0 option. In the fuse source code (sourceforge.net/projects/fuse), they compute the next offset using fuse_add_direntry(dh->req, NULL, 0, name, NULL, 0);. My quick glance makes it look like that is generally 24+(strlen(name of directory)) rounded up to the nearest multiple of 8 bytes. My guess for your sample would be that you would pass 32 as the offset of a.txt (since that is the offset of the next item), 64 for b.txt, and 96 for c.txt. - Hoes
The offset should only have meaning to the readdir() callback - not the Fuse system. Fuse only cares whether it's 0 or not-0. It internally handles the population of its opaque buffer. The math on 24+etc., isn't necessary. Please see my answer below. - Aboriginal
H
3

The offset passed to the filler function is the offset of the next item in the directory. You can return the entries in any order you want; the order does not have to follow the names, the inode numbers, or anything else. If you don't want to return an entire directory at once, you need to use the offset to determine which entries are being asked for.

Specifically, the readdir call is passed an offset. You should start calling the filler function with the entries located at this offset or later. In the simplest case, the length of each entry is 24 bytes + strlen(name of entry), rounded up to the nearest multiple of 8 bytes. However, see the fuse source code at http://sourceforge.net/projects/fuse/ for when this might not be the case.
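
As a rough illustration of the arithmetic above, the following sketch computes the entry length for a given name. The function name entry_len is made up for this example and is not part of FUSE. The constant 24 is taken from this answer and should be treated as an implementation detail that may differ between versions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a FUSE function: size of one directory entry
 * as described above, i.e. 24 header bytes plus the name length,
 * rounded up to the next multiple of 8. */
size_t entry_len(const char *name)
{
    return (24 + strlen(name) + 7) & ~(size_t)7;
}
```

For a five-character name such as a.txt this gives 32, so three such entries in a row would end at offsets 32, 64 and 96.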

I have a simple example, where I have a loop (pseudo c-code) in my readdir function:

int my_readdir(const char *path, void *buf, fuse_fill_dir_t filler, off_t offset, struct fuse_file_info *fi)
{
   /* (a bunch of prep work has been omitted) */
   struct stat st;
   off_t off, nextoff = 0;   /* off_t, so they compare cleanly with offset */
   size_t lenentry;
   int i;
   char namebuf[NAME_LEN];   /* NAME_LEN: placeholder, long enough for any one name */

   for (i = 0; i < NumDirectoryEntries; i++)
   {
      /* (fill st with the stat information, including inode, etc.) */
      /* (fill namebuf with the name of the directory entry) */
      lenentry = (24 + strlen(namebuf) + 7) & ~(size_t)7;
      off = nextoff; /* offset of this entry */
      nextoff += lenentry;
      /* Skip this entry if we weren't asked for it */
      if (off < offset)
         continue;
      /* Add this to our response until we are asked to stop */
      if (filler(buf, namebuf, &st, nextoff))
         break;
   }
   /* All done because we were asked to stop or because we finished */
   return 0;
}

I tested this within my own code (I had never used the offset before), and it works fine.
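
The skip logic in that loop can be checked without a mounted filesystem. The sketch below is a simplified model, not FUSE code: entry_len and resume_index are hypothetical names, and the directory is just a plain array of names. It returns the index of the first entry that the loop above would hand to filler() for a given offset.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 24 header bytes + name, rounded up to 8. */
size_t entry_len(const char *name)
{
    return (24 + strlen(name) + 7) & ~(size_t)7;
}

/* Hypothetical model of the skip step: walk the entries, tracking the
 * byte offset at which each one starts, and return the index of the
 * first entry whose start is at or beyond `offset`. Returns n when
 * the offset lies past the last entry (nothing left to list). */
size_t resume_index(const char *const names[], size_t n, size_t offset)
{
    size_t start = 0;
    for (size_t i = 0; i < n; i++) {
        if (start >= offset)
            return i;
        start += entry_len(names[i]);
    }
    return n;
}
```

With the names a.txt, longer_name.txt and c, the entries start at byte offsets 0, 32 and 72, so a call with offset 32 resumes at the second entry and a call with offset 104 finds nothing left to list.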

Hoes answered 9/5, 2014 at 14:41 Comment(3)
Thanks a ton, it works great. Where did you get this example code from? Also, can you please explain why ((24+strlen(namebuf)+7)&~7)? - Sextet
From the fuse_dirent_size function in fuse-2.9.3/lib/fuse_lowlevel.c and the header file fuse-2.9.3/include/fuse_kernel.h. It is the offset of the name parameter from the start of the fuse_dirent structure, plus the length of the filename, all rounded to a multiple of sizeof(__u64). - Hoes
This might work in the case where you populate filler() all in one call to my_readdir(). But it probably won't work if you're giving filler() the file names in parts - multiple calls to my_readdir(). Knowledge of the structure of buf isn't necessary, and the formula to calculate the offset also is unnecessary. filler() only cares whether the offset is 0 or non-0. Please see my answer on this. - Aboriginal
A
10

The selected answer is not correct

Despite the lack of upvotes on this answer, this is the correct answer. Cracking into the format of the void buffer should be discouraged; that is the intent behind declaring such things void in C code. You shouldn't write code that assumes knowledge of the format of the data behind void pointers. Use the API that is provided instead.

The code below is very simple and straightforward, as it should be. No knowledge of the format of the Fuse buffer is required.

Fictitious API

This is a contrived example of what some device's API could look like. This is not part of Fuse.

// get_some_file_names() - 
// returns a struct with buffers holding the names of files.
// PARAMETERS
// * path   - A path of some sort that the fictitious device groks.
// * offset - Where in the list of file names to start.
// RETURNS
// * A name_list, it has some char buffers holding the file names
//   and a couple other auxiliary vars.
//   
name_list *get_some_file_names(const char *path, size_t offset);

Listing the files in parts

Here's a Fuse callback that can be registered with the Fuse system to list the filenames provided by get_some_file_names(). It's arbitrarily named readdir_callback() so its purpose is obvious.

int readdir_callback(const char  *path,
                           void  *buf,      // This is meant to be "opaque".
                fuse_fill_dir_t   filler,   // filler takes care of buf (already a function pointer type).
                          off_t   off,      // Last value given to filler.
          struct fuse_file_info  *fi        )
{
    // Call the fictitious API to get a list of file names.
    name_list *list = get_some_file_names(path, off);

    for (int i = 0; i < list->length; i++)
    {
        // Feed the file names to filler() one at a time.
        if (filler(buf, list->names[i], NULL, off + i + 1)) 
        { 
            break;   // filler() returned 1, requesting a break.
        }
        incr_num_files_listed(list);
    }

    // Return 0 on success (or a negated errno value on failure).
    // Fuse sees that the listing is complete when a call adds no entries.
    return 0;
}

The off (offset) value is not used by the filler function to fill its opaque buffer, buf. The off value is, however, meaningful to the callback as an offset base as it provides file names to filler(). Whatever value was last passed to filler() is what gets passed back to readdir_callback() on its next invocation. filler() itself only cares whether the off value is 0 or not-0.

Indicating "I'm done listing!" to Fuse

No special return value is needed to signal that readdir_callback() is done listing file names in parts. Like the other callbacks in the high-level Fuse API, it returns 0 on success or a negated errno value on failure. Once the last of the list of names has been given to filler(), the next invocation simply adds no entries and returns 0, and Fuse treats that empty result as the end of the directory.

How off Is Used

The off value passed to filler() should be non-0 to perform partial listings. That's its only requirement as far as filler() is concerned. If it is 0, that indicates to Fuse that you're going to do a full listing in one shot (see below).

Although filler() doesn't care what the off value is beyond it being non-0, the value can still be meaningfully used. The code above is using the index of the next item in its own file list as its value. Fuse will keep passing the last off value it received back to the readdir callback on each invocation until the listing is complete (when an invocation adds no more entries).
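
The hand-off of off values between calls can be modelled in plain C without Fuse. Everything in this sketch is a hypothetical stand-in invented for illustration: mock_buf plays the role of the opaque buffer, mock_filler() imitates a filler that returns 1 when full, and list_from() plays the role of the callback, returning the off value that would be handed to the next invocation.

```c
#include <stddef.h>

#define MOCK_MAX 8

/* Hypothetical stand-in for the opaque buffer: holds up to `cap` names. */
struct mock_buf {
    const char *names[MOCK_MAX];
    size_t count;
    size_t cap;   /* must not exceed MOCK_MAX */
};

/* Hypothetical stand-in for filler(): 1 means "buffer full, stop". */
int mock_filler(struct mock_buf *b, const char *name, long next_off)
{
    (void)next_off;   /* stored by the real system, unused in this model */
    if (b->count >= b->cap || b->count >= MOCK_MAX)
        return 1;
    b->names[b->count++] = name;
    return 0;
}

/* Made-up directory contents for the model. */
static const char *const dir_entries[] = {
    "a.txt", "b.txt", "c.txt", "d.txt", "e.txt"
};
#define NUM_ENTRIES (sizeof dir_entries / sizeof dir_entries[0])

/* Model of the callback: offer entries starting at index `off`, passing
 * index + 1 as each entry's off value. Returns the last value accepted
 * by the filler, i.e. where the next call should resume. */
long list_from(long off, struct mock_buf *b)
{
    long next = off;
    for (size_t i = (size_t)off; i < NUM_ENTRIES; i++) {
        if (mock_filler(b, dir_entries[i], (long)i + 1))
            break;
        next = (long)i + 1;
    }
    return next;
}
```

With a buffer that holds two names, successive calls resume at 0, 2 and 4, and a final call starting at 5 adds nothing, which is how the model reaches the end of the listing.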

Listing the files all at once

int readdir_callback(const char  *path,
                           void  *buf,
                fuse_fill_dir_t   filler,
                          off_t   off,
          struct fuse_file_info  *fi        )
{
    name_list *list = get_all_file_names(path);

    for (int i = 0; i < list->length; i++)
    {
        filler(buf, list->names[i], NULL, 0);
    }
    return 0;
}

Listing all the files in one shot, as above, is simpler - but not by much. Note that off is 0 for the full listing. One may wonder, 'why even bother with the first approach of reading the folder contents in parts?'

The in-parts strategy is useful where a set number of buffers for file names is allocated, and the number of files within folders may exceed this number. For instance, the implementation of name_list above may only have 8 allocated buffers (char names[8][256]). Also, buf may fill up and filler() start returning 1 if too many names are given at once. The first approach avoids this.

Aboriginal answered 2/4, 2020 at 20:20 Comment(4)
This answer should not have been downvoted. The information it provides is sound. While the other (selected) answer may "work", it is not correct and unnecessarily pries into a void buffer and gives misleading information regarding it. - Aboriginal
Think of it this way: The last value you passed as off (when the filler reports that its buffer is full) is the value you receive when your function gets called again. This is useful because it allows both your application and the kernel to preserve memory in the case where not all directory entries are sent/received at once (respectively). If your application is for a filesystem that already provides all the files at once, you might as well send them all at once, but for network filesystems, for example, directory entries might arrive in chunks. - Asbestosis
The off parameter is ignored by the kernel. - Asbestosis
Your comments supplement the info on how off and filler() work well. Thanks @Asbestosis - Aboriginal
