Addendum

In association with heise online

File system debugger

The debugfs program, which is part of the E2fsprogs package, is a wonderful tool for exploring the intricacies of Ext3. The commands

stat file_name

stat <inode number>

return all the information pertaining to an inode - including the data blocks used by the file.

mi can be used to modify inode contents - sometimes the last resort when trying to rescue a corrupted file. ls serves for exploring directory structures, and stats returns details about the superblock. lsdel outputs a list of deleted files in Ext2. ncheck detects the file name belonging to an inode but doesn't take hard links into account. icheck returns all the inodes pointing to a data block. cat writes the contents of a file to screen, dump to a new file. help returns all the commands debugfs understands.

Debugfs can safely be applied to file systems which are mounted and in use as it opens them for read access only by default. However, if you are using

debugfs -w device

to write to your file system, you should bear in mind that this tool can quickly demolish the administrative structures of Ext3 to a degree which even e2fsck won't be able to fix.
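The read-only commands described above can also be run non-interactively, which is handy for scripting. The following is a minimal sketch, assuming debugfs is installed and that the caller is allowed to read the device; the helper names debugfs_argv and run_debugfs are hypothetical and only serve as illustration. It relies on the -R option of debugfs, which executes a single request and then exits.

```python
import subprocess


def debugfs_argv(device, request):
    """Build a debugfs command line that runs one request and exits.

    No -w is passed, so debugfs opens the file system read-only.
    """
    return ["debugfs", "-R", request, device]


def run_debugfs(device, request):
    """Run one debugfs request and return its standard output as text.

    Hypothetical helper; usually needs root to read the block device.
    """
    result = subprocess.run(
        debugfs_argv(device, request),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example (not executed here): show inode 2, the root directory.
# print(run_debugfs("/dev/sda1", "stat <2>"))
```

Because the sketch never passes -w, it cannot modify the file system it inspects.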

Measuring fragmentation

To determine the amount of fragmentation within a file we need to find out which data blocks the file uses and match the data block sequence with the ideal block sequence as shown.

In unfragmented files, blocks are arranged in the order in which they are read by the file system.
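The comparison between the actual and the ideal block sequence can be sketched in a few lines. This is an illustration under a stated assumption, not the method used by any particular tool: the list is assumed to contain all of a file's blocks in the order in which they are read, and the ideal layout is assumed to be one in which every block directly follows its predecessor. The function names are hypothetical.

```python
def count_fragments(blocks):
    """Return the number of contiguous runs in a list of block numbers.

    Assumption: 'blocks' lists the blocks in read order, and the ideal
    sequence is blocks[0], blocks[0] + 1, blocks[0] + 2, ...
    """
    if not blocks:
        return 0
    fragments = 1
    for prev, cur in zip(blocks, blocks[1:]):
        if cur != prev + 1:
            # The next block is not the direct successor: a new fragment starts.
            fragments += 1
    return fragments


def fragmentation_ratio(blocks):
    """Fraction of block-to-block transitions that break the ideal sequence."""
    if len(blocks) < 2:
        return 0.0
    return (count_fragments(blocks) - 1) / (len(blocks) - 1)


if __name__ == "__main__":
    layout = [100, 101, 200, 201, 50]
    print(count_fragments(layout), fragmentation_ratio(layout))
```

A file stored in one piece yields one fragment and a ratio of 0.0; every break in the sequence adds one fragment.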

A file's data blocks can be determined in three different ways. The direct way is to read the disk sector by sector and interpret the individual Ext3 data structures, which ultimately amounts to reimplementing large parts of the Ext3 driver.

Thankfully, this isn't necessary: the libext2fs library provides routines for directly accessing file system driver functionality. The ext2fs_get_next_inode() and ext2fs_block_iterate() functions iterate over all the inodes used by the file system and over all the blocks belonging to an inode, respectively. An arbitrary function can be passed to ext2fs_block_iterate(); it is called with the block number of each block and can compare the file's actual disk layout with the ideal block sequence. This approach is used by e2fsck and our ext2_frag tool.

ext2_frag needs to be called with the name of the device to be tested (partition or logical volume on LVM systems):

ext2_frag DEVICE [-s|-f|-v|-d] [-i INODE]

By default, the program checks the entire file system and returns whether and to what degree each file is fragmented. Option -f limits the output to fragmented files for better clarity; -s only returns the statistical overviews of the entire file system. So, the fragmentation of the file system at /dev/sda1 can be assessed by calling

ext2_frag -s /dev/sda1

Option -i investigates the fragmentation details of individual files identified by their inode numbers; -v and -d return further details about the file's block layout.

Alternatively, a file's block numbers can be retrieved with the FIBMAP ioctl, although this only works with regular files and not with directories. This is used by the filefrag tool included in the E2fsprogs package and by fragments. Unlike ext2fs_block_iterate(), ioctl(FIBMAP) only returns the numbers of the data blocks, but not the numbers of the indirect blocks; those can only be identified as gaps in the data block sequence.
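As an illustration of the FIBMAP approach, here is a minimal sketch that is not taken from filefrag or fragments. It assumes a Linux system and root privileges, and it uses the request numbers FIBMAP (1) and FIGETBSZ (2) as defined in linux/fs.h. The function names are hypothetical, and treating a gap of exactly one block as a possible indirect block is a simplifying assumption made for this example.

```python
import fcntl
import os
import struct

FIBMAP = 1    # linux/fs.h: map a logical block to a physical block
FIGETBSZ = 2  # linux/fs.h: query the file system block size


def file_blocks(path):
    """Return the physical block numbers of a regular file.

    Needs root on Linux. Holes in sparse files are reported as block 0.
    """
    with open(path, "rb") as f:
        fd = f.fileno()
        buf = fcntl.ioctl(fd, FIGETBSZ, struct.pack("i", 0))
        block_size = struct.unpack("i", buf)[0]
        size = os.fstat(fd).st_size
        n_blocks = (size + block_size - 1) // block_size
        blocks = []
        for logical in range(n_blocks):
            # The kernel overwrites the logical number with the physical one.
            buf = fcntl.ioctl(fd, FIBMAP, struct.pack("i", logical))
            blocks.append(struct.unpack("i", buf)[0])
        return blocks


def classify_jumps(blocks):
    """List the positions where the data block sequence is not consecutive.

    Assumption for this sketch: a step of exactly 2 leaves room for one
    block in between, which may be an indirect block ("gap"); any other
    step is reported as a real discontinuity ("jump").
    """
    jumps = []
    for i in range(1, len(blocks)):
        step = blocks[i] - blocks[i - 1]
        if step == 1:
            continue
        jumps.append((i, "gap" if step == 2 else "jump"))
    return jumps


# Example (not executed here, requires root):
# print(classify_jumps(file_blocks("/var/log/messages")))
```

The classification is only a heuristic: a one-block gap may equally well belong to another file, which FIBMAP alone cannot tell.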

The fragments tool needs to be called with a directory name:

fragments [-r] [-d] [-f] [-b] [-x] DIRECTORY

With option -r, the tool also examines the named directory's subdirectories recursively; -d returns statistics for every subdirectory. Adding -x makes the tool ignore subdirectories where other file systems are mounted.

A whole file system is typically examined with