CpSc 360
Chapter 8 (continued)
A B* tree of order m is a tree with the following properties:
1. Every page has a maximum of m descendants.
2. Every node, except for the root and the terminal nodes, has at least ⌈(2m – 1)/3⌉ children.
3. The root, unless the tree only has one node, has at least two children.
4. All terminal nodes appear on the same level, i.e., are the same distance from the root.
5. A nonterminal node with K children contains K – 1 records.
6. A leaf page contains at least ⌊(2m – 1)/3⌋ keys and no more than m – 1 keys.
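As a quick check on the arithmetic in properties 1, 2, and 6, the following sketch computes the occupancy limits for a given order m. The function name `bstar_bounds` and the dictionary keys are illustrative choices, not part of the course material.

```python
def bstar_bounds(m: int) -> dict:
    """Occupancy limits for a B* tree of order m, following the listed properties."""
    return {
        "max_children": m,                               # property 1
        "min_children_internal": -(-(2 * m - 1) // 3),   # property 2: ceiling of (2m - 1)/3
        "min_keys_leaf": (2 * m - 1) // 3,               # property 6: floor of (2m - 1)/3
        "max_keys_leaf": m - 1,                          # property 6
    }


if __name__ == "__main__":
    print(bstar_bounds(10))
```

For m = 10, (2m – 1)/3 = 19/3, so an interior node needs at least 7 children, and a leaf holds between 6 and 9 keys.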
· Keep as many of the top levels of the tree in RAM as possible.
· Create a page buffer to hold some number of B-tree pages. If a needed page is already in the buffer, then don’t read it from disk.
· A virtual B-tree is one that resides partly in RAM and partly on disk.
· Reading a page from disk because it is not in the page buffer is called a page fault.
· Two causes of page faults are:
   · We have never used the page.
   · It was once in the buffer but has since been replaced with a new page.
· We can’t do anything about the first cause.
· The second cause can be minimized through buffer management.
· The question is: Which page should be replaced when room for a new page is needed?
· A Least Recently Used (LRU) algorithm doesn’t simply choose the oldest page for replacement but rather chooses the page whose most recent request is oldest.
· The assumption is made that we are more likely to need a page that we have used recently than we are to need a page that we have never used or one that we used some time ago. The term for this kind of assumption is temporal locality.
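The page buffer and LRU replacement described above might be sketched as follows. This is a minimal illustration, not the course's implementation: the class name `LRUPageBuffer`, the `read_page` callable (a stand-in for a real disk read), and the `faults` counter are all assumptions made for the example.

```python
from collections import OrderedDict


class LRUPageBuffer:
    """Holds up to `capacity` pages and evicts the least recently requested one."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page   # hypothetical callable: page number -> page contents
        self.pages = OrderedDict()   # ordered from least to most recently requested
        self.faults = 0              # number of page faults so far

    def get(self, page_no):
        if page_no in self.pages:
            # Already buffered: no disk read, just record the new request.
            self.pages.move_to_end(page_no)
            return self.pages[page_no]
        # Page fault: the page must be read from disk.
        self.faults += 1
        if len(self.pages) >= self.capacity:
            # Evict the page whose most recent request is oldest.
            self.pages.popitem(last=False)
        page = self.read_page(page_no)
        self.pages[page_no] = page
        return page


if __name__ == "__main__":
    buffer = LRUPageBuffer(2, lambda n: f"contents of page {n}")
    for n in [1, 2, 1, 3, 2]:
        buffer.get(n)
    print(buffer.faults)  # 4
```

With a two-page buffer and the request sequence 1, 2, 1, 3, 2, the first requests for pages 1, 2, and 3 are faults of the first kind (never used). The final request for page 2 is a fault of the second kind: page 2 was evicted to make room for page 3 because page 1 had been requested more recently.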
Variable-length Records and Keys
· If we have variable-length records, then we cannot hold to the principle that every page has the same capacity in keys.
· Rather than use a maximum and minimum number of keys per page, we need to use a maximum and minimum number of bytes.
· We might want to change the key promotion (center of TOOBIG) policy to bias towards the shortest variable-length keys rather than longer keys. The idea is that we want to have pages with the largest numbers of descendants up high in the tree, rather than at the leaf level.
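One way such a biased promotion policy could look is sketched below. Instead of always promoting the center key of the overfull page, it considers a few keys around the center and promotes the shortest. The function name `choose_promotion_index` and the `window` parameter are hypothetical, and key length is measured in UTF-8 bytes as a stand-in for however the keys are actually stored.

```python
def choose_promotion_index(keys, window=1):
    """Pick the index of the key to promote from an overfull, sorted page.

    Looks at keys within `window` positions of the center and returns the
    shortest one, preferring the key closest to the center on ties.
    """
    if len(keys) < 3:
        raise ValueError("need at least three keys to split a page")
    center = len(keys) // 2
    # Stay away from the two ends so that neither half ends up empty.
    lo = max(1, center - window)
    hi = min(len(keys) - 2, center + window)
    return min(
        range(lo, hi + 1),
        key=lambda i: (len(keys[i].encode("utf-8")), abs(i - center)),
    )


if __name__ == "__main__":
    keys = ["adams", "baker", "carmichael", "doe", "evans"]
    i = choose_promotion_index(keys)
    print(i, keys[i])  # 3 doe
```

With `window=0` the function reduces to the ordinary center-key policy. A larger window gives shorter separators in the upper levels at the cost of less evenly split pages.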