MultiWay Search Trees
K08 Δομές Δεδομένων και Τεχνικές Προγραμματισμού
Κώστας Χατζηκοκολάκης
Motivation
 We keep the ordering idea of BSTs
 Fast search, by excluding whole subtrees
 And add more than two children for each node
 Gives more flexibility in restructuring the tree
 And news ways to keep it balanced
Multiway search trees

$d$node: a node with $d$ children

Each internal $d$node stores $d1$ ordered values $k_1 < \ldots < k_{d1}$
 No duplicate values in the whole tree

All values in a subtree lie inbetween the corresponding node values
 For all values $l$ in the $i$th subtree: $k_{i1} < l < k_i$
 Convention: $k_0 = \infty, k_d = +\infty$

$m$way search tree: all nodes have at most $m$ children
 A BST is a 2way search tree
Example multiway search tree
$m=3$
Searching in a multiway search tree
 Simple adaptation of the algorithm for BSTs
 Start from the root, traverse towards the leaves
 In each node, there is a single subtree that can possibly contain a value $l$
 The subtree $i$ such that $k_{i1} < l < k_i$
 Continue in that subtree
Example multiway search tree
Search for value 12
Unsuccessful search
Search for value 24
Insertion in a multiway search tree
 Again, simple adaptation of BSTs
 But: we don't always need to create a new node
 We can insert in an existing one if there is space
 Start with a search for the value $l$ we want to insert
 If found, stop (no duplicates)
 If not found, insert at the leaf we reached
 If full, create an $i$th child, such that $k_{i1} < l < k_i$
Insert value 28
$m$ = 3
Value 28 inserted
Insert value 32
Value 32 inserted
Insert value 12
Value 12 inserted
Deletion from a multiway search tree
Left as an exercise.
Complexity of operations
 We need to traverse the tree from the root to a leaf
 The time spent at each node is constant
 Eg. find $i$ such that $k_{i1} < l < k_i$
 Assuming $m$ is fixed!
 So as usual all complexities are $O(h)$
Balanced multiway search trees
 Similarly to BSTs we need to keep the tree balanced
 AVL where a kind of balanced BSTs
 We will study two kinds of balanced multiway search trees:
 23 trees
 234 trees (also known as (2,4) trees)
23 trees
 A 23 tree is a 3way search tree which has the following
properties
 Size property
 Each node contains 1 or 2 values
(so each internal node contains 2 or 3 children)
 Depth property
 All leaves have the same depth (lie on the same level)
Example of 23 tree
Height of 23 trees
 All nodes at all levels except the last one are internal
 And each internal node has at least 2 children
 So at level $i$ we have at least $2^i$ nodes
 Hence $n \ge 2^h$, in other words $h = O(\log n)$
 So we can search for an element in time $O(\log n)$
 Using the standard algorithm for $m$way trees
Search for L
Insertion in 23trees
 We can start by following the generic algorithm for $m$way trees
 Search for the value $l$ we want to insert
 If found, stop (no duplicates)
 If not found, insert at the leaf we reached
Example: insert B
Example: insert B
Example: result
Insertion in 23trees

But what if there is no space at the leaf (overflow)?

The standard algorithm will insert a child at the leaf
 But this violates the depth property!
 The new leaf is not at the same level

Different strategy
 split the overflowed node into two nodes
 pass the middle value to the parent (separator of the two nodes)

The middle value might overflow the parent
 Same procedure: split and send the middle value up
Example: insert M
Example: insert M
M overflows this node.
Example: insert M
Example: insert M
Example: insert M
Example: result
Example: insert Q
Example: insert Q
Example: result
Example: insert R
R is inserted in the node with Q where there is space.
Insertion in 23trees
 The root might also overflow
 Same procedure
 Split it
 The middle value moves up, creating a new root
 This is the only operation that increases the tree's height
 It increases the depth of all nodes simultaneously
 23trees grow at the root, not at the leaves!
Example: insert S
S overflows this node
Example: insert S
Example: insert S
Example: insert S
Example: insert S
Example: result
Complexity of insertion
 We traverse the tree
 From the root to a leaf when searching
 From the leaf back to the root while splitting
 Each split takes constant time
 We do at most $h+1$ of them
 So in total $O(h) = O(\log n)$ steps
 Recall, the tree is balanced
(2,4) trees
 A (2,4) tree (or 234 tree) is a 4way search tree with 2 extra properties
 Size property
 Each node contains between 1 and 3 values
(so each internal node contains between 2 and 4 children)
 Depth property
 All leaves have the same depth (lie on the same level)
 Such trees are balanced
 $h = O(\log n)$
 Proof: exercise
Insertion in (2,4) trees
 Same as for 23trees
 Search for the value
 Insert at a leaf
 In case of an overflow (5node)
 Split it into a 3node and a 2node
 Move the separator value $k_3$ to the parent
Overflow at a 5node
The separating value is sent to the parent node
Node replaced with a 3node and a 2node
Example: insert 4
Example: insert 6
Example: insert 12
Example: insert 15  overflow
Creation of new root node
Split
Example: insert 3
Example: insert 5  overflow
5 is sent to the parent node
Split
Example: insert 10
Example: insert 8
Example
Inserted 11, 13 and 14.
Example: insert 17  overflow
Split and send 15 to the parent node
The root overflows
Creation of new root
Split
Final tree
Complexity
 Same as for 23trees
 At most $h$ splits
 Each split is constant time
 $O(\log n)$
 Because the tree is balanced
Fixing underflows
Two strategies for fixing an underlow at $\nu$

Is there an immediate sibling $w$ with a “spare” value? (2 or 3 values)

If so, we do a transfer operation
 Move a value of $w$ to its parent $u$
 Move a value of the parent $u$ to $\nu$

If not, we do a fusion operation
 Merge $\nu$ and $w$, creating a new node $\nu'$
 Move a value from the parent $u$ to $\nu'$
 This might underflow the parent, continue the same procedure there
Initial tree
Remove 4
Transfer
After the transfer
Remove 12
Remove 12
Fusion of and
After the fusion
Remove 13
After the removal of 13
Remove 14  underflow
Fusion
Underflow at
Fusion
Remove the root
Final tree
Readings

T. A. Standish. Data Structures, Algorithms and Software Principles in
C. Section 9.9

M. T. Goodrich, R. Tamassia and D. Mount. Data Structures and
Algorithms in C++. Section 10.4

R. Sedgewick. Αλγόριθμοι σε C. 3η Αμερικανική Έκδοση. Εκδόσεις
Κλειδάριθμος. Section 13.3