Most files need only a few extents to describe their logical-to-physical block mapping, which can be accommodated within the inode or a single extent map block. However, some extreme cases, such as sparse files with random allocation patterns, or a very badly fragmented filesystem, are not efficiently represented using extent maps. In addition, allocating blocks in a random access pattern may require inserting an extent map entry in the middle of a potentially very large data representation.
One solution to this problem is to use a tree data structure to store
the extent map, either a B-tree, B+ tree, or some simplified tree
structure as was used for the HTree feature. Alex Tomas's
implementation takes the latter approach, using a constant-depth tree
structure. In this implementation, the extents are expressed using a 12
byte structure, which include a 32-bit logical block number, a 48-bit
physical block number, and a 16-bit extent length. With 4 KB blocksize,
a filesystem can address up to 1024 petabytes, and a maximum file size
of 16 terabytes. A single extent can cover up to 2#2 blocks or
256 MB.
The extent tree information can be stored in the inode's i_data array, which is 60 bytes long. An attribute flag in the inode's i_flags word indicates whether the inode's i_data array should be interpreted using the traditional indirect block mapping scheme, or as an extent data structure. If the entire extent information can be stored in the i_data field, then it will be treated as a single leaf node of the extent tree; otherwise, it will be treated as the root node of inode's extent tree, and additional filesystem blocks serve as intermediate or leaf nodes in the extent tree.
At the beginning of each node, the ext3_ext_header data structure is 12 bytes long, and contains a 16-bit magic number, 2 16-bit integers containing the number of valid entries in the node, and the maximum number of entries that can be stored in the node, a 16-bit integer containing the depth of the tree, and a 32-bit tree generation number. If the depth of the tree is 0, then root inode contains leaf node information, and the 12-byte entries contain the extent information described in the previous paragraph. Otherwise, the root node will contain 12-byte intermediate entries, which consist of 32-bit logical block and a 48-bit physical block (with 16 bits unused) of the next index or leaf block.