tytso/e2fsprogs

Question about de->file_type

mikaku opened this issue · 2 comments

Not an issue, just a question whose exact answer I failed to find on the Internet.

I'm implementing the ext2 filesystem in my hobby OS and I want to know who benefits from the file_type field.

I know that the inclusion of the file_type in the directory entry in revision 0.5 might help to discover the type of the file, but it seems that Linux kernels 2.0, 2.2 and 2.4 (old kernels, I know, but still useful to learn from) only update this field and don't seem to use it.

Hence my question: is the purpose of this field only for the e2fsprogs tools? I mean, basically for fsck, debugfs, etc.?

tytso commented

The kernel passes the file_type back up to userspace. See [1], [2] and [3]. To quote from [4], when describing the d_type field which is returned by readdir:

This field contains a value indicating the file type, making it possible to avoid the expense of calling lstat(2) if further actions depend on the type of the file.
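As a rough illustration of what passing the file_type up to userspace involves, the sketch below translates the on-disk ext2 file type codes into the DT_* values that readdir reports in d_type. The helper name ext2_ft_to_dtype is hypothetical and the code is not taken from the kernel sources linked below; it only assumes the standard EXT2_FT_* numbering and the DT_* constants from <dirent.h>.

```c
#define _DEFAULT_SOURCE   /* expose the DT_* constants on glibc */
#include <dirent.h>
#include <stdint.h>

/* ext2 on-disk file type codes, as stored in each directory entry. */
enum {
    EXT2_FT_UNKNOWN  = 0,
    EXT2_FT_REG_FILE = 1,
    EXT2_FT_DIR      = 2,
    EXT2_FT_CHRDEV   = 3,
    EXT2_FT_BLKDEV   = 4,
    EXT2_FT_FIFO     = 5,
    EXT2_FT_SOCK     = 6,
    EXT2_FT_SYMLINK  = 7,
    EXT2_FT_MAX      = 8
};

/* Hypothetical helper: map an on-disk code to a d_type value.
 * Codes outside the known range are reported as DT_UNKNOWN. */
static unsigned char ext2_ft_to_dtype(uint8_t file_type)
{
    static const unsigned char table[EXT2_FT_MAX] = {
        [EXT2_FT_UNKNOWN]  = DT_UNKNOWN,
        [EXT2_FT_REG_FILE] = DT_REG,
        [EXT2_FT_DIR]      = DT_DIR,
        [EXT2_FT_CHRDEV]   = DT_CHR,
        [EXT2_FT_BLKDEV]   = DT_BLK,
        [EXT2_FT_FIFO]     = DT_FIFO,
        [EXT2_FT_SOCK]     = DT_SOCK,
        [EXT2_FT_SYMLINK]  = DT_LNK,
    };

    return file_type < EXT2_FT_MAX ? table[file_type] : DT_UNKNOWN;
}
```

A filesystem that does not record the type in its directory entries can simply report DT_UNKNOWN for every entry, which tells callers to fall back to lstat(2).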

Examples of utilities that use the file_type field include the find(1) program from GNU. For a command such as "find . -name super.c -print", use of the file_type field reduces the number of system calls by a huge percentage, because it is no longer necessary to call stat(2) or lstat(2) on every single file returned by readdir to see whether it is a directory that needs to be searched recursively.

Now, this extension is not required by POSIX or the Single Unix Specification, so feel free not to bother implementing it in your hobby OS if you don't care about performance for applications that need to recursively search files in a directory tree. NetBSD, FreeBSD, et al. also implement it, so it's not a Linux-specific extension.
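For a userspace view of the saving, here is a minimal sketch of how a directory walker might use d_type and only fall back to lstat(2) when the filesystem reports DT_UNKNOWN. The function name count_subdirs and the fixed 4096-byte path buffer are illustrative assumptions; this is not how find(1) is actually written.

```c
#define _DEFAULT_SOURCE   /* expose d_type and DT_* on glibc */
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: count the subdirectories directly under `path`.
 * Returns -1 if the directory cannot be opened. */
static int count_subdirs(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *de;
    while ((de = readdir(dir)) != NULL) {
        /* Skip the "." and ".." entries. */
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        int is_dir;
        if (de->d_type != DT_UNKNOWN) {
            /* Fast path: the type came back with the entry, no extra syscall. */
            is_dir = (de->d_type == DT_DIR);
        } else {
            /* Slow path: the filesystem gave no type, so ask lstat(2). */
            char full[4096];
            struct stat st;
            int n = snprintf(full, sizeof full, "%s/%s", path, de->d_name);
            if (n < 0 || (size_t)n >= sizeof full || lstat(full, &st) != 0)
                continue;
            is_dir = S_ISDIR(st.st_mode);
        }
        count += is_dir;
    }

    closedir(dir);
    return count;
}
```

On a filesystem that fills in d_type, the loop above issues no stat-family calls at all. On one that always returns DT_UNKNOWN, it costs one lstat(2) per entry, which is the overhead the field exists to avoid.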

[1] https://elixir.bootlin.com/linux/v6.3/source/fs/readdir.c#L256
[2] https://elixir.bootlin.com/linux/v6.3/source/fs/ext4/dir.c#L536
[3] https://elixir.bootlin.com/linux/v6.3/source/include/linux/fs.h#L3136
[4] https://man7.org/linux/man-pages/man3/readdir.3.html

mikaku commented

Oh, I see, the dirent structure in newer Linux kernels includes the d_type field:

    struct dirent {
        ino_t          d_ino;       /* Inode number */
        off_t          d_off;       /* Not an offset; see below */
        unsigned short d_reclen;    /* Length of this record */
        unsigned char  d_type;      /* Type of file; not supported
                                       by all filesystem types */
        char           d_name[256]; /* Null-terminated filename */
    };

In my case, I'm still using the old dirent structure, so user space programs are unable to take advantage of this field.
That clears it up.

@tytso, thank you very much for your explanation.