CpSc 360

Lecture 18

 

Chapter 9 – The B+ Tree Family and Indexed Sequential File Access

 

Indexed Sequential Access

 

·       Files must often be accessed directly by a key value or sequentially by order of a key.

·       We wish to design files that can be quickly accessed in either manner.

 

 

Maintaining a Sequence Set

 

The Use of Blocks

·       The set of records that we choose to access sequentially is called a sequence set.

·       Re-sorting the entire file every time a record is added is too expensive to be practical.

·       Blocking records is a good way to reduce the number of seeks when going from one record to another sequentially since the entire block will be in RAM.

·       The blocks of the file can be linked together to keep them sequentially ordered.

·       When a block is full and a record is to be added, the block is split: a new block is allocated in the file, half of the records are put into the new block, and half are left in the old block.  The blocks are then linked together to maintain the sequential property.

·       When deletions result in a block being less than half-full, a redistribution of the records can be performed (with an adjacent block) or the block can possibly be concatenated with a neighboring block and the original block can be eliminated from the file.

·       After much file activity (adds and deletes) the file can become somewhat sparse (many blocks can be less than full) and have a low density.
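
The splitting and linking described above can be sketched in a few lines of Python. This is only an illustration held entirely in memory: the names Block, BLOCK_CAPACITY, insert, and scan are hypothetical, the capacity of 4 is arbitrary, and a real file would store block numbers on disk rather than object references.

```python
from bisect import insort

BLOCK_CAPACITY = 4  # hypothetical maximum number of records per block


class Block:
    def __init__(self, keys=None):
        self.keys = keys or []  # records kept sorted within the block
        self.next = None        # link to the logically next block


def insert(head, key):
    """Insert key into the linked sequence set that starts at head."""
    block = head
    # Advance while the key belongs in a later block.
    while block.next is not None and key >= block.next.keys[0]:
        block = block.next
    insort(block.keys, key)
    if len(block.keys) > BLOCK_CAPACITY:
        # Split: the upper half moves to a new block linked after this one.
        mid = len(block.keys) // 2
        new_block = Block(block.keys[mid:])
        block.keys = block.keys[:mid]
        new_block.next = block.next
        block.next = new_block


def scan(head):
    """Yield every key in order by following the block links."""
    block = head
    while block is not None:
        yield from block.keys
        block = block.next
```

The new block can live anywhere in the file; only the links keep the sequence in order, which is why sequential access may still need a seek between blocks.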

 

Choice of Block Size

·       Larger blocks keep more of the sorted records physically adjacent, so sequential access less often has to go to a block that isn’t physically adjacent.

·       Why not make the entire file one big block then?

·       We may not have enough RAM to hold the entire file in a block.  In general we want to use RAM to hold several blocks for buffering, splitting, or concatenations.

·       I/O should be fast.  If the entire file were in one big block then we would always read the entire file in even if we only want access to a single record.

·       The block size should be such that we can access a block without having to bear the cost of a disk seek within the block read or block write operations.

·       A good block size might be a cluster, since a cluster is the minimum number of sectors allocated at a time and its sectors are physically adjacent.  This might be tempered by available RAM.

 

Adding a Simple Index to the Sequence Set

 

·       A non-dense index (one that does not include an entry for each record in the sequence set) can be created so that the key of the largest record in each block is represented in the index.

·       If the index will fit in RAM then binary searching can be used to quickly find a candidate block for a particular key.

·       If the index cannot fit in RAM then this is probably a poor choice of index structures.

·       A B-Tree is a good choice of index structures if the index cannot fit in RAM.

·       A B+Tree is a combination of a B-Tree used for an index (called an index set) and a sequentially organized data file (called a sequence set).
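
A minimal sketch of the in-RAM, non-dense index described above, using Python's standard bisect module for the binary search. The block contents and the function names find_block and lookup are made up for illustration.

```python
from bisect import bisect_left

# Hypothetical sequence set: each block is a sorted list of keys.
blocks = [
    ["ADAMS", "BAIRD", "BERNE"],
    ["BOLEN", "CAGE"],
    ["CAMP", "DUTTON"],
]

# Non-dense index: one entry per block, holding that block's largest key.
index = [block[-1] for block in blocks]


def find_block(index, key):
    """Return the number of the only block that could hold key, or None."""
    i = bisect_left(index, key)  # first block whose largest key >= key
    return i if i < len(index) else None


def lookup(blocks, index, key):
    """Return True if key is present in the sequence set."""
    i = find_block(index, key)
    return i is not None and key in blocks[i]
```

The index only names a candidate block; the block itself must still be read and searched to learn whether the record exists.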

 

The Content of the Index: Separators Instead of Keys

 

·       We don’t actually need the entire key in the index.  We only need that part of the key (a prefix for the key) that leads us to the correct block where the record may reside. 

·       This part of the key is called a separator.

·       One workable choice (other methods will also work) is for the separator to be a prefix of the last key in a block that is not also a prefix of the first key in the next block.
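
As a sketch of one of the "other methods", the function below takes the shortest prefix of the first key in the next block that still sorts after the last key in the current block. Note that this is a different convention from the rule in the bullet above, chosen here as an assumption because it works with ordinary string comparison; the function name is hypothetical and keys are assumed to be distinct strings.

```python
def shortest_separator(last_key, next_first_key):
    """Shortest prefix of next_first_key that sorts after last_key."""
    if not last_key < next_first_key:
        raise ValueError("keys must be distinct and in ascending order")
    for length in range(1, len(next_first_key) + 1):
        prefix = next_first_key[:length]
        if prefix > last_key:
            return prefix
    return next_first_key  # not reached when last_key < next_first_key
```

With this convention a search key that compares less than the separator belongs in the left block, and a key greater than or equal to the separator belongs in the right block. For example, the separator between blocks ending in BERNE and starting with BOLEN is the two-character string BO.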

 

The Simple Prefix B+Tree

 

·       The B-Tree used to store the index set actually stores separators, not keys to data records.  These separators are managed like data in any other B-Tree discussed in Chapter 8.

·       Techniques also exist for reducing the size of separators whose first several characters are repeated.  These techniques are called front and rear key compression techniques.

 

Simple Prefix B+Tree Maintenance

 

·       Adding or deleting records in the sequence set of a B+Tree results in changes to the index set only if blocks are split, redistributed, or concatenated (so that a block is eliminated).

·       The index set (the actual B-Tree) is maintained as described in Chapter 8.

 

Index Set Block Size

 

·       Normally the index set block size is the same as the sequence set block size because:

·       The block size for the sequence set is usually related to disk drive characteristics and available RAM.  These same factors are used to control block sizes for index sets.

·       A common block size makes virtual buffering easier and more efficient.

·       Often the index set and sequence set are co-mingled in the same file to avoid seeks.  The file structure is usually simpler if they share a common block size.

 

Internal Structure of Index Set Blocks: A Variable-order B-Tree

 

·       Since the separators are in general of differing lengths, the B-Tree containing the separators must accommodate variable length records.

·       If the block containing the separators is large, a binary search through the separators will be much more efficient than a sequential search.

·       A binary search of a list of variable length entries can only be performed if there is a fixed length list of pointers to the variable length separators.

·       A possible node structure that will accommodate a binary search is:
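
One layout that fits the description above can be sketched as follows. It is not necessarily the layout the course text uses: the class name IndexNode and its fields are assumptions. The separators are packed end to end, and a fixed-length list of offsets makes each one addressable, so a binary search can jump straight to the middle entry.

```python
class IndexNode:
    """Hypothetical index set block holding variable-length separators."""

    def __init__(self, separators, children):
        # A node with n separators points to n + 1 child blocks.
        assert len(children) == len(separators) + 1
        self.data = "".join(separators)  # separators packed end to end
        self.offsets = []                # fixed-length entries: start of each separator
        pos = 0
        for s in separators:
            self.offsets.append(pos)
            pos += len(s)
        self.offsets.append(pos)         # sentinel: end of the last separator
        self.children = children         # child block numbers

    def separator(self, i):
        """Recover separator i from the packed data using the offsets."""
        return self.data[self.offsets[i]:self.offsets[i + 1]]

    def child_for(self, key):
        """Binary search on the offsets; return the child block number for key."""
        lo, hi = 0, len(self.offsets) - 1  # hi starts at the separator count
        while lo < hi:
            mid = (lo + hi) // 2
            if key < self.separator(mid):
                hi = mid
            else:
                lo = mid + 1
        return self.children[lo]
```

The search assumes that keys less than a separator go to the child on its left and all other keys go to the right.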

 


Loading a Simple Prefix B+Tree

 

·       Loading with randomly ordered data will result in much overhead in splitting.

·       Loading with sorted data can be very fast if the load routine relaxes the B-Tree requirement that every node be at least half full.

·       The index set node in the B-Tree with the largest keys can contain no separators at certain times and thus can grow to capacity before being promoted (moved up) in the B-Tree.

·       Advantages:

·       The output can be written sequentially.

·       No blocks need to be reorganized as we proceed.

·       No blocks are physically out-of-order unless the clusters are out-of-order (spatial locality).

·       We make only one pass over the data, rather than the many passes associated with random order insertions.

·       All sequence set blocks can be 100% full (except possibly the last one).

·       We can intentionally control the density of sequence set blocks depending on the expected volatility of the data.
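
The one-pass load and the density control listed above might look like the sketch below. It packs already-sorted keys into blocks and collects one separator per block boundary; building the upper levels of the index set is left out. The names bulk_load, capacity, and fill are hypothetical, keys are assumed to be distinct, and the separator rule (shortest prefix of the next block's first key that sorts after the previous block's last key) is one possible convention rather than the only one.

```python
def shortest_separator(last_key, next_first_key):
    """Shortest prefix of next_first_key that sorts after last_key."""
    for length in range(1, len(next_first_key) + 1):
        if next_first_key[:length] > last_key:
            return next_first_key[:length]
    return next_first_key


def bulk_load(sorted_keys, capacity, fill=1.0):
    """Pack sorted keys into blocks in one pass; return (blocks, separators)."""
    per_block = max(1, int(capacity * fill))  # records placed in each block
    blocks = [sorted_keys[i:i + per_block]
              for i in range(0, len(sorted_keys), per_block)]
    # One separator for each boundary between adjacent blocks.
    separators = [shortest_separator(left[-1], right[0])
                  for left, right in zip(blocks, blocks[1:])]
    return blocks, separators
```

Calling bulk_load with fill=1.0 gives completely full blocks, while a smaller value such as 0.5 leaves room in each block for later insertions in a volatile file.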

 

B+Trees

 

·       B+Trees use the entire key in the index sets while Simple Prefix B+Trees use separators as entries in the index set.

·       Advantages of Simple Prefix B+Trees over B+Trees are:

·       The index set is shallower (the height of the tree is less) and searches require fewer seeks.

·       Less disk space is required – especially if keys are very large.

·       Advantages of B+Trees over Simple Prefix B+Trees are:

·       B+Trees avoid the overhead required to maintain and use variable-length structures.

·       Some key sets don’t lend themselves to much compression and would require more overhead with little payoff.

 

B-Trees, B+Trees, and Simple Prefix B+Trees in Perspective

 

·       Don’t use any of these if you can get the entire index in RAM.  Use a binary search in RAM if possible.

·       Hashing (next chapter) is a better alternative in some specialized applications.

·       All three have the following characteristics:

·       They are all paged index structures, which means that they bring entire blocks of information into RAM at once.  As a consequence, it is possible to choose between a great many alternatives (e.g., the keys for hundreds of thousands of records) with just a few seeks out to disk storage.  The shape of these trees tends to be broad and shallow.

·       All three approaches maintain height-balanced trees.  The trees do not grow in an uneven way, which would result in some potentially long searches for certain keys.

·       In all cases the trees grow from the bottom up. 

·       With all three structures it is possible to obtain greater storage efficiency through the use of two-to-three splitting and of redistribution in place of block splitting when possible.

·       All three approaches can be implemented as virtual tree structures in which the most recently used blocks are held in RAM.

·       Any of these approaches can be adapted for use with variable-length records.

·       Specific characteristics of each of the three approaches can be found earlier in these notes and at the end of Chapter 9 in the text.
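
To make the "broad and shallow" point above concrete, the rough calculation below counts how many index levels are needed above a given number of sequence set blocks. It assumes every index block is completely full at a fixed fan-out, which real trees are not, so treat the result as a lower bound; the function name and the numbers used with it are hypothetical.

```python
def index_height(num_blocks, fanout):
    """Index levels needed for a single root to reach num_blocks blocks."""
    height = 0
    reach = 1  # number of blocks reachable from the root so far
    while reach < num_blocks:
        reach *= fanout
        height += 1
    return height
```

For instance, with a fan-out of 100, three index levels are enough to reach one million sequence set blocks, so a lookup costs only a handful of seeks.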