Data Structure and Algorithms - B+ Trees


The B+ trees are extensions of B trees designed to make the insertion, deletion and searching operations more efficient.

The properties of B+ trees are similar to the properties of B trees, except that the B trees can store keys and records in all internal nodes and leaf nodes while B+ trees store records in leaf nodes and keys in internal nodes. One profound property of the B+ tree is that all the leaf nodes are connected to each other in a single linked list format and a data pointer is available to point to the data present in disk file. This helps fetch the records in equal numbers of disk access.

Since the size of main memory is limited, B+ trees act as the data storage for the records that couldn’t be stored in the main memory. For this, the internal nodes are stored in the main memory and the leaf nodes are stored in the secondary memory storage.

b plus tree

Properties of B+ trees

  • Every node in a B+ Tree, except root, will hold a maximum of m children and (m-1) keys, and a minimum of $\mathrm{\left \lceil \frac{m}{2}\right \rceil}$ children and $\mathrm{\left \lceil \frac{m-1}{2}\right \rceil}$ keys, since the order of the tree is m.

  • The root node must have no less than two children and at least one search key.

  • All the paths in a B tree must end at the same level, i.e. the leaf nodes must be at the same level.

  • A B+ tree always maintains sorted data.

Basic Operations of B+ Trees

The operations supported in B+ trees are Insertion, deletion and searching with the time complexity of O(log n) for every operation.

They are almost similar to the B tree operations as the base idea to store data in both data structures is same. However, the difference occurs as the data is stored only in the leaf nodes of a B+ trees, unlike B trees.

Insertion

The insertion to a B+ tree starts at a leaf node.

Step 1 − Calculate the maximum and minimum number of keys to be added onto the B+ tree node.

calculate number of keys

Step 2 − Insert the elements one by one accordingly into a leaf node until it exceeds the maximum key number.

Insert elements

Step 3 − The node is split into half where the left child consists of minimum number of keys and the remaining keys are stored in the right child.

node split

Step 4 − But if the internal node also exceeds the maximum key property, the node is split in half where the left child consists of the minimum keys and remaining keys are stored in the right child. However, the smallest number in the right child is made the parent.

insert into b plus tree

Step 5 − If both the leaf node and internal node have the maximum keys, both of them are split in the similar manner and the smallest key in the right child is added to the parent node.

b_plus_tree_order_4

Deletion

The deletion operation in a B+ tree, we need to consider the redundancy in the data structure.

Case 1 − If the key is present in a leaf node which has more than minimal number of keys, without its copy present in the internal nodes, simple delete it.

delete 7 deleted 7

Case 2 − If the key is present in a leaf node with exactly minimal number of keys and a copy of it is not present in the internal nodes, borrow a key from its sibling node and delete the desired key.

Case 3 − If the key present in the leaf node has its copy in the internal nodes, there are multiple scenarios possible −

  • More than minimal keys present in both leaf node and internal node: simply delete the key and add the inorder successor to the internal node only.

delete 5 deleted 5
  • Exact minimum number of keys present in the leaf node: delete the node and borrow a node from its immediate sibling and replace its value in internal node as well.

delete 4 deleted 4
  • If the copy of the leaf node to be delete is in its grandparent, delete the node and remove the empty space. The grandparent is filled with the inorder successor of the deleted node.

delete 3

Case 4 − If the key to be deleted is in a node violating the minimum keys property, both its parent and sibling have minimum number of keys, delete the key and merge its sibling with its parent.

deleted 3

C++ Implementation

Following is the C++ implementation of B+ Trees −

#include<iostream>
using namespace std;
struct BplusTree {
   int *d;
   BplusTree **child_ptr;
   bool l;
   int n;
}
*r = NULL, *np = NULL, *x = NULL;
BplusTree* init() { //to create nodes 
   int i;
   np = new BplusTree;
   np->d = new int[6];//order 6
   np->child_ptr = new BplusTree *[7];
   np->l = true;
   np->n = 0;
   for (i = 0; i < 7; i++) {
      np->child_ptr[i] = NULL;
   }
   return np;
}

void traverse(BplusTree *p) { //traverse tree 
   cout<<endl;
   int i;
   for (i = 0; i < p->n; i++) {
      if (p->l == false) {
         traverse(p->child_ptr[i]);
      }
      cout << " " << p->d[i];
   }
   if (p->l == false) {
      traverse(p->child_ptr[i]);
   }
   cout<<endl;
}

void sort(int *p, int n) { //sort the tree 
   int i, j, t;
   for (i = 0; i < n; i++) {
      for (j = i; j <= n; j++) {
         if (p[i] >p[j]) {
            t = p[i];
            p[i] = p[j];
            p[j] = t;
         }
      }
   }
}

int split_child(BplusTree *x, int i) {
   int j, mid;
   BplusTree *np1, *np3, *y;
   np3 = init();
   np3->l = true;
   if (i == -1) {
      mid = x->d[2];
      x->d[2] = 0;
      x->n--;
      np1 = init();
      np1->l = false;
      x->l = true;
      for (j = 3; j < 6; j++) {
         np3->d[j - 3] = x->d[j];
         np3->child_ptr[j - 3] = x->child_ptr[j];
         np3->n++;
         x->d[j] = 0;
         x->n--;
      }
      for (j = 0; j < 6; j++) {
         x->child_ptr[j] = NULL;
      }
      np1->d[0] = mid;
      np1->child_ptr[np1->n] = x;
      np1->child_ptr[np1->n + 1] = np3;
      np1->n++;
      r = np1;
   } else {
      y = x->child_ptr[i];
      mid = y->d[2];
      y->d[2] = 0;
      y->n--;
      for (j = 3; j <6 ; j++) {
         np3->d[j - 3] = y->d[j];
         np3->n++;
         y->d[j] = 0;
         y->n--;
      }
      x->child_ptr[i + 1] = y;
      x->child_ptr[i + 1] = np3;
   }
   return mid;
}

void insert(int a) {
   int i, t;
   x = r;
   if (x == NULL) {
      r = init();
      x = r;
   } else {
      if (x->l== true && x->n == 6) {
         t = split_child(x, -1);
         x = r;
         for (i = 0; i < (x->n); i++) {
            if ((a >x->d[i]) && (a < x->d[i + 1])) {
            i++;
            break;
         } else if (a < x->d[0]) {
            break;
         } else {
            continue;
         }
      }
      x = x->child_ptr[i];
   } else {
      while (x->l == false) {
         for (i = 0; i < (x->n); i++) {
            if ((a >x->d[i]) && (a < x->d[i + 1])) {
               i++;
               break;
            } else if (a < x->d[0]) {
               break;
            } else {
               continue;
            }
         }
         if ((x->child_ptr[i])->n == 6) {
            t = split_child(x, i);
            x->d[x->n] = t;
            x->n++;
            continue;
         } else {
            x = x->child_ptr[i];
         }
      }
   }
}
   x->d[x->n] = a;
   sort(x->d, x->n);
   x->n++;
}

int main() {
   int i, n, t;
   insert(10);
   insert(20);
   insert(30);
   insert(40);
   insert(50);
   cout<<"B+ tree:\n";
   traverse(r);
}

Output

B+ tree:
10 20 30 40 50
Advertisements