data structure – Are the keys in a node of a B-tree in non-descending or ascending order?

Question:

The CRLS (Introduction to the Algorithms, Cormen et al) defines that the keys of a node in a B-tree must be in non-descending order (K1 <= K2 <= … <= Kn). However, as there is no repetition of keys, this order should be strictly increasing (so, too, assert Bayer and McGreight, in their 1972 paper). Is this right?

Answer:

The B-tree is a kind of generalization of the binary tree, where nodes are grouped in pages and each one of them can point to several others. Its properties make it ideal for working with disks, as it minimizes the most costly operations (rotating the disk and moving the head, that is, starting reading a block).

When Bayer and McGreight wrote their paper they weren't interested in what types of data would be stored by these trees. The objective was to define the tree theoretically, show access, write and delete times and present the important calculations and algorithms. But after 1972 it began to be used and over the years the term was being applied a little outside the original definition.

B-trees are widely used in databases and other ways of storing large amounts of data on a disk. In some circumstances having repeated keys can be an advantage as the time to clear them can be very high. But this causes several problems, especially in the deletion.

Imagine that the same key is repeated several times and you want to delete one of the elements that was stored with that key. You'll have to walk through every occurrence of that key until you find the specific element, which will be pretty inefficient.

Usually when a B-tree contains repeated keys one more attribute is added to make a composite key so that it is unique. Thus, some operations may use the normal key and others the composite key. Another solution is to maintain a separate index that manages all repeating keys.

Anyway, in the article it was better to consider that the keys are unique, because that way the whole theory would be more elegant, treating cases of repeated keys would not be interesting. But in practice there are cases where it is better to use a B-tree with non-unique keys allowed and solve the resulting problems in other ways. That's not why people will stop calling it B-tree.

Therefore, Cormem's book preferred to define the B-tree that way, not to exclude those cases where the key is not unique and that can be used in practice.

Scroll to Top
AllEscort