Unit VI: Trees

Data Structures and Algorithms — Complete Chapter Notes (Java)

Duration: 7 Hours

Lesson Plan (7 Hours)
6.1 Introduction
6.2 Tree Terminology
6.3 Binary Tree
- Strictly Binary Tree
- Complete Binary Tree
- Extended Binary Tree
6.4 Binary Tree Representation
- Array Representation
- Linked List Representation
6.5 Create a Binary Tree
6.6 Traversal of a Binary Tree
- Preorder, Inorder, Postorder (+ Level order)
6.7 Binary Search Tree (BST)
- Insertion, Search, Deletion
6.8 Tree Height, Level and Depth
6.9 Balanced Tree: AVL Balanced Trees
6.10 The Huffman Algorithm
6.11 B-Tree
Quick Revision & Exam Pointers

6.1 Introduction

So far we have studied linear data structures — arrays, stacks, queues, linked lists — where elements are arranged one after another in a single sequence. In a linear structure each element has exactly one predecessor and one successor (except the ends).

A tree is a non-linear, hierarchical data structure. Instead of a straight line, data branches out like an upside-down real tree: the root is at the top and leaves at the bottom.

            (root)
            /  |  \
          o    o    o      <- branches
         / \        |
        o   o       o      <- leaves

Why do we need trees?

Hierarchy is natural. File systems (folders within folders), organization charts, family trees, HTML/XML documents, and the Java class hierarchy are all trees.
Fast searching. A balanced Binary Search Tree searches n items in O(log n) time, far better than O(n) of a linked list.
Efficient sorting & priority handling. Heaps (a kind of tree) power priority queues and Heap Sort.
Compression. Huffman coding (Section 6.10) uses a tree to compress data.
Databases & file systems. B-Trees and B+ Trees (Section 6.11) index huge datasets on disk.

Formal definition

A tree is a finite set T of one or more nodes such that:

There is one specially designated node called the root.
The remaining nodes are partitioned into n ≥ 0 disjoint sets T₁, T₂, …, Tₙ, where each Tᵢ is itself a tree. These are the subtrees of the root.

This definition is recursive — a tree is made of smaller trees. That is why almost every tree algorithm is naturally recursive.

6.2 Tree Terminology

Consider this tree. We will refer to it throughout the section:

                 A            <- Level 0  (root)
               / | \
              B  C  D          <- Level 1
             / \     \
            E   F     G        <- Level 2
               /
              H                <- Level 3

Term	Definition	Example (above)
Node	A basic unit holding data + links to children	A, B, C … H
Root	The topmost node, has no parent	A
Edge	A link/line connecting a parent to a child	A–B, F–H
Parent	A node having one or more children	B is parent of E, F
Child	A node directly below another	E, F are children of B
Siblings	Nodes that share the same parent	B, C, D are siblings
Leaf (terminal/external)	A node with no children	E, H, C, G
Internal (non-terminal) node	A node with at least one child	A, B, D, F
Degree of a node	Number of children of that node	degree(A)=3, degree(B)=2, degree(E)=0
Degree of a tree	Maximum degree among all its nodes	3 (because of A)
Path	A sequence of nodes connected by edges	A → B → F → H
Ancestor	Any node on the path from root to that node	A, B, F are ancestors of H
Descendant	Any node in the subtree below a node	E, F, H are descendants of B
Level	Distance (in edges) from the root; root is level 0	H is at level 3
Depth of a node	Number of edges from the root down to the node	depth(H)=3
Height of a node	Number of edges on the longest path from the node down to a leaf	height(B)=2 (B→F→H)
Height of the tree	Height of the root	3
Subtree	A node together with all its descendants	B,E,F,H form a subtree
Forest	A set of zero or more disjoint trees	remove A → {B-subtree, C, D-subtree}

Depth vs Height (common exam trap):

Depth is measured downward from the root to a node.

Height is measured upward from the deepest leaf to a node.

For the whole tree, height = depth of the deepest node.

6.3 Binary Tree

A Binary Tree is a tree in which every node has at most two children, referred to as the left child and the right child.

So the degree of any node is 0, 1, or 2.

Here A has a left child B and right child C, and so on.

Properties of a binary tree

For a binary tree:

Maximum number of nodes at level L = 2^L (level starts at 0).
Maximum number of nodes in a tree of height h = 2^(h+1) − 1.
Minimum height of a binary tree with n nodes = ⌊log₂ n⌋.
A binary tree with n nodes has exactly n − 1 edges.
If a binary tree has n₀ leaf nodes and n₂ nodes of degree 2, then n₀ = n₂ + 1.

Java node definition (used everywhere below):

// A single node of a binary tree
class Node {
    int data;          // the value stored
    Node left;         // reference to left child  (null if none)
    Node right;        // reference to right child (null if none)

    Node(int data) {
        this.data = data;
        this.left = null;
        this.right = null;
    }
}

▪ Strictly Binary Tree (Full / Proper Binary Tree)

A binary tree is strictly binary if every node has either 0 or 2 children — no node has exactly one child.

A has 2 children, B has 2 children, C, D, E have 0 children. ✔ Strictly binary.

B has only one child (D). ✘ NOT strictly binary.

Property: A strictly binary tree with n leaf nodes always has 2n − 1 total nodes.

▪ Complete Binary Tree

A binary tree is complete if all levels are completely filled except possibly the last level, and the last level has all its nodes as far left as possible.

            A                 Level 0: full
          /   \
         B     C              Level 1: full
        / \   /
       D   E F                Level 2: filled left-to-right, no gaps

This is complete: the last level fills left first, with no holes.

            A
          /   \
         B     C
        / \     \
       D   E     F            <- gap before F (C has no left child)

This is NOT complete — there is a missing left child of C while a right child exists.

Why it matters: Complete binary trees can be stored compactly in an array with no wasted slots (this is exactly how a binary heap works). For a node at index i:

left child = 2i + 1
right child = 2i + 2
parent = (i − 1) / 2

Perfect Binary Tree (bonus term): all internal nodes have 2 children and all leaves are at the same level. A perfect tree of height h has exactly 2^(h+1) − 1 nodes. Every perfect tree is also complete and strictly binary.

▪ Extended Binary Tree (2-Tree)

When we convert an ordinary binary tree into a strictly binary tree by adding special "external" nodes (usually drawn as squares ▢) wherever a child is missing, we get an extended binary tree.

Original nodes become internal nodes (circles ○).
Added dummy nodes become external nodes (squares ▢) — they have no children.

Before extension:

After extension (▢ = added external node):

            A
          /   \
         B     C
        / \   / \
       ▢   D ▢   ▢
          / \
         ▢   ▢

Properties of extended binary tree:

Every internal node has exactly 2 children → it is strictly binary.
If there are n internal nodes, there are exactly n + 1 external nodes.
Used to compute path length: Internal Path Length (I) and External Path Length (E) are related by E = I + 2n where n = number of internal nodes. This is central to analyzing Huffman trees and search costs.

6.4 Binary Tree Representation

A binary tree can be stored in memory in two ways: array (sequential) representation and linked list representation.

▪ Array Representation of a Binary Tree

We number the nodes level by level, left to right, starting the root at index 1 (using index 0 is also possible — formulas shift slightly). Then:

For a node at index i:
- Left child is at index 2i
- Right child is at index 2i + 1
- Parent is at index i / 2 (integer division)

Example tree:

              A(1)
            /      \
         B(2)      C(3)
         /  \         \
      D(4)  E(5)      G(7)

Array (1-indexed):

Index :  0    1    2    3    4    5    6    7
Value : [ -    A    B    C    D    E    -    G ]

Index 6 is empty because C has no left child — the slot is still reserved.

Java — array-based binary tree:

public class ArrayBinaryTree {
    private char[] tree;       // index 0 is unused
    private int maxSize;

    public ArrayBinaryTree(int height) {
        // a tree of height h needs 2^(h+1) - 1 nodes; +1 because we skip index 0
        maxSize = (int) Math.pow(2, height + 1);
        tree = new char[maxSize];
    }

    public void setRoot(char value) { tree[1] = value; }

    public void setLeft(char value, int parentIndex)  { tree[2 * parentIndex] = value; }
    public void setRight(char value, int parentIndex) { tree[2 * parentIndex + 1] = value; }

    public char getLeft(int i)   { return tree[2 * i]; }
    public char getRight(int i)  { return tree[2 * i + 1]; }
    public char getParent(int i) { return tree[i / 2]; }

    public void printInorder(int i) {
        if (i >= maxSize || tree[i] == '\u0000') return; // empty slot
        printInorder(2 * i);                 // left
        System.out.print(tree[i] + " ");     // node
        printInorder(2 * i + 1);             // right
    }

    public static void main(String[] args) {
        ArrayBinaryTree t = new ArrayBinaryTree(2);
        t.setRoot('A');
        t.setLeft('B', 1);  t.setRight('C', 1);
        t.setLeft('D', 2);  t.setRight('E', 2);
        System.out.print("Inorder: ");
        t.printInorder(1);   // Output: D B E A C
    }
}

Advantages of array representation

No pointers needed; simple index math.
Direct random access to parent/child via formulas.
Perfect for complete trees and heaps (no wasted space).

Disadvantages

Huge memory waste for skewed / sparse trees (a right-skewed tree of n nodes needs an array of size 2ⁿ!).
Fixed size — hard to grow; insertion/deletion of arbitrary nodes is costly.

▪ Linked List Representation of a Binary Tree

Each node is an object holding data plus two references (left and right). A missing child is simply null. The tree is reached through a single root reference.

        ┌──────┬──────┬──────┐
 root ->│ left │ data │ right │
        └──┬───┴──────┴───┬───┘
           v              v
       (left child)   (right child)

For the tree

the memory picture is:

 root → [•|A|•]
          /   \
   [•|B|•]     [∅|C|∅]
     /  \
[∅|D|∅] ∅

(• = a reference, ∅ = null)

Java — linked representation:

class Node {
    char data;
    Node left, right;
    Node(char data) { this.data = data; }
}

public class LinkedBinaryTree {
    Node root;

    public static void main(String[] args) {
        LinkedBinaryTree t = new LinkedBinaryTree();
        t.root        = new Node('A');
        t.root.left   = new Node('B');
        t.root.right  = new Node('C');
        t.root.left.left = new Node('D');
        // C and D's children stay null automatically
    }
}

Advantages of linked representation

No wasted memory for missing children (just null).
Easy to grow; insertion/deletion only rewires a few references.
Natural fit for recursion.

Disadvantages

Extra memory for two reference fields per node.
No direct access to the parent (unless we also store a parent pointer).
To reach a node you must traverse from the root.

Feature	Array Representation	Linked Representation
Memory for sparse trees	Wasteful	Efficient
Access parent/child	O(1) via index math	Need pointers / traversal
Insertion/Deletion	Costly (shifting)	Cheap (rewire pointers)
Best for	Complete trees, heaps	General / dynamic trees
Size	Fixed	Dynamic

6.5 Create a Binary Tree

"Creating" a binary tree means allocating nodes and wiring up the left/right links. We show two common ways.

Method 1 — Manual construction

class Node {
    int data;
    Node left, right;
    Node(int data) { this.data = data; }
}

public class CreateTree {
    public static void main(String[] args) {
        /* Build this tree:
                    1
                   / \
                  2   3
                 / \
                4   5
        */
        Node root      = new Node(1);
        root.left      = new Node(2);
        root.right     = new Node(3);
        root.left.left = new Node(4);
        root.left.right= new Node(5);
        System.out.println("Tree created. Root = " + root.data);
    }
}

Method 2 — Build by inserting into a queue (level-order build)

This is how you build a tree from a sequence such as 1 2 3 4 5 6 so that it becomes a complete binary tree (each new node fills the next left-to-right slot).

import java.util.*;

class Node {
    int data; Node left, right;
    Node(int d){ data = d; }
}

public class BuildLevelOrder {
    static Node insertLevelOrder(int[] arr) {
        if (arr.length == 0) return null;
        Node root = new Node(arr[0]);
        Queue<Node> q = new LinkedList<>();
        q.add(root);
        int i = 1;
        while (i < arr.length) {
            Node cur = q.poll();
            if (i < arr.length) {                 // attach left child
                cur.left = new Node(arr[i++]);
                q.add(cur.left);
            }
            if (i < arr.length) {                 // attach right child
                cur.right = new Node(arr[i++]);
                q.add(cur.right);
            }
        }
        return root;
    }

    public static void main(String[] args) {
        int[] arr = {1, 2, 3, 4, 5, 6};
        Node root = insertLevelOrder(arr);
        // Produces:
        //        1
        //       / \
        //      2   3
        //     / \  /
        //    4  5 6
        System.out.println("Built. Root = " + root.data);
    }
}

Algorithm (level-order build):

1. If array empty, return null.
2. Create root from arr[0]; enqueue root; set i = 1.
3. While i < length:
      cur = dequeue()
      if i < length: cur.left  = node(arr[i]); enqueue(cur.left);  i++
      if i < length: cur.right = node(arr[i]); enqueue(cur.right); i++
4. Return root.

6.6 Traversal of a Binary Tree

Traversal = visiting every node of the tree exactly once in some order. For binary trees there are three classic depth-first orders plus one breadth-first order.

The three depth-first orders differ only in WHEN we process the current node (Root) relative to its left (L) and right (R) subtrees:

Traversal	Order	Mnemonic
Preorder	Root → Left → Right	process root first
Inorder	Left → Root → Right	root in the middle
Postorder	Left → Right → Root	root last

We use this sample tree for all traces:

                 A
               /   \
              B     C
             / \     \
            D   E     F
                     /
                    G

▪ Preorder Traversal (Root, Left, Right)

Algorithm:

preorder(node):
    if node == null: return
    visit(node)            // 1. process root
    preorder(node.left)    // 2. go left
    preorder(node.right)   // 3. go right

Java:

void preorder(Node n) {
    if (n == null) return;
    System.out.print(n.data + " ");   // Root
    preorder(n.left);                  // Left
    preorder(n.right);                 // Right
}

Dry run / tracing (call stack shown):

preorder(A) -> print A | go left
  preorder(B) -> print B | go left
    preorder(D) -> print D | left null | right null  (return)
    preorder(E) -> print E | left null | right null  (return)
  (back to B done) go right of A
  preorder(C) -> print C | left null | go right
    preorder(F) -> print F | go left
      preorder(G) -> print G | (return)

Output: A B D E C F G

Use: Preorder produces a copy of the tree / prefix expression of an expression tree; it visits a node before its descendants, useful for making a copy or serializing a tree.

▪ Inorder Traversal (Left, Root, Right)

Algorithm:

inorder(node):
    if node == null: return
    inorder(node.left)     // 1. left subtree
    visit(node)            // 2. process root
    inorder(node.right)    // 3. right subtree

Java:

void inorder(Node n) {
    if (n == null) return;
    inorder(n.left);                   // Left
    System.out.print(n.data + " ");    // Root
    inorder(n.right);                  // Right
}

Tracing on the sample tree:

inorder(A)
  inorder(B)
    inorder(D) -> left null, print D, right null
    print B
    inorder(E) -> left null, print E, right null
  print A
  inorder(C)
    left null
    print C
    inorder(F)
      inorder(G) -> print G
      print F

Output: D B E A C G F

Use (very important): Inorder traversal of a Binary Search Tree visits nodes in sorted ascending order. This is the single most exam-relevant fact about traversals.

▪ Postorder Traversal (Left, Right, Root)

Algorithm:

postorder(node):
    if node == null: return
    postorder(node.left)   // 1. left subtree
    postorder(node.right)  // 2. right subtree
    visit(node)            // 3. process root last

Java:

void postorder(Node n) {
    if (n == null) return;
    postorder(n.left);                 // Left
    postorder(n.right);                // Right
    System.out.print(n.data + " ");    // Root
}

Tracing on the sample tree:

postorder(A)
  postorder(B)
    postorder(D) -> print D
    postorder(E) -> print E
    print B
  postorder(C)
    postorder(F)
      postorder(G) -> print G
      print F
    print C
  print A

Output: D E B G F C A

Use: Postorder is used to delete/free a tree (children freed before parent) and to evaluate postfix expressions from an expression tree.

▪ Level-order Traversal (Breadth-First) — bonus

Visit nodes level by level using a queue.

import java.util.*;
void levelOrder(Node root) {
    if (root == null) return;
    Queue<Node> q = new LinkedList<>();
    q.add(root);
    while (!q.isEmpty()) {
        Node cur = q.poll();
        System.out.print(cur.data + " ");
        if (cur.left  != null) q.add(cur.left);
        if (cur.right != null) q.add(cur.right);
    }
}

Output for sample tree: A B C D E F G

Summary of all four on the sample tree

Traversal	Result
Preorder	A B D E C F G
Inorder	D B E A C G F
Postorder	D E B G F C A
Level-order	A B C D E F G

Iterative inorder (using an explicit stack)

Recursion uses the call stack internally. We can do it manually:

import java.util.*;
void inorderIterative(Node root) {
    Stack<Node> stack = new Stack<>();
    Node cur = root;
    while (cur != null || !stack.isEmpty()) {
        while (cur != null) {        // go as far left as possible
            stack.push(cur);
            cur = cur.left;
        }
        cur = stack.pop();           // process
        System.out.print(cur.data + " ");
        cur = cur.right;             // then go right
    }
}

Reconstructing a tree from traversals (classic exam question)

Inorder + Preorder uniquely determines a tree.
Inorder + Postorder uniquely determines a tree.
Preorder + Postorder alone do NOT (ambiguous), unless the tree is full.

Example: Given Inorder = D B E A C and Preorder = A B D E C:

Preorder's first element A is the root.
In inorder, everything left of A (D B E) is the left subtree, right of A (C) is the right subtree.
Recurse: next preorder element B is root of left subtree; in inorder D is left of B, E is right of B.
Result:

6.7 Binary Search Tree (BST)

A Binary Search Tree is a binary tree that satisfies the BST property at every node:

Left subtree values < node value < Right subtree values (and both subtrees are themselves BSTs, with no duplicate keys in the basic version).

Example BST:

                 50
               /    \
             30      70
            /  \    /  \
          20   40  60   80

Everything left of 50 (30,20,40) is < 50; everything right (70,60,80) is > 50. ✔
This holds recursively at 30, 70, etc.

Key consequence: An inorder traversal of a BST yields sorted order: 20 30 40 50 60 70 80.

Why BST? Searching is like binary search on an array, but the structure is dynamic:

Average case search/insert/delete: O(log n) (balanced).
Worst case (skewed tree, e.g. inserting sorted data): O(n) — this is why AVL trees in 6.9 exist.

Java node + class skeleton:

class Node {
    int key;
    Node left, right;
    Node(int key) { this.key = key; }
}

public class BST {
    Node root;
}

▪ Insertion

Idea: Start at the root. If the new key is smaller, go left; if larger, go right. When you reach a null link, place the new node there.

Algorithm:

insert(node, key):
    if node == null:            return new Node(key)   // found the spot
    if key < node.key:          node.left  = insert(node.left, key)
    else if key > node.key:     node.right = insert(node.right, key)
    // key == node.key -> duplicate, ignore (or handle as you wish)
    return node

Java:

Node insert(Node node, int key) {
    if (node == null) return new Node(key);
    if (key < node.key)       node.left  = insert(node.left, key);
    else if (key > node.key)  node.right = insert(node.right, key);
    return node;                 // unchanged node pointer goes back up
}
void insert(int key) { root = insert(root, key); }

Tracing — insert sequence 50, 30, 70, 20, 40, 60:

Insert 50: tree empty -> 50 becomes root
        50

Insert 30: 30 < 50 -> go left (null) -> place
        50
       /
      30

Insert 70: 70 > 50 -> go right (null) -> place
        50
       /  \
      30    70

Insert 20: 20 < 50 left -> 20 < 30 left (null) -> place
        50
       /  \
      30    70
     /
    20

Insert 40: 40 < 50 left -> 40 > 30 right (null) -> place
        50
       /  \
      30    70
     /  \
    20   40

Insert 60: 60 > 50 right -> 60 < 70 left (null) -> place
        50
       /  \
      30    70
     /  \   /
    20   40 60

▪ Search

Algorithm:

search(node, key):
    if node == null:        return false   // not found
    if key == node.key:     return true    // found
    if key < node.key:      return search(node.left, key)
    else:                   return search(node.right, key)

Java (recursive + iterative):

boolean search(Node node, int key) {
    if (node == null) return false;
    if (key == node.key) return true;
    return key < node.key ? search(node.left, key)
                          : search(node.right, key);
}

boolean searchIterative(int key) {
    Node cur = root;
    while (cur != null) {
        if (key == cur.key) return true;
        cur = (key < cur.key) ? cur.left : cur.right;
    }
    return false;
}

Tracing — search 40 in the tree above:

40 vs 50 -> 40 < 50 -> go left to 30
40 vs 30 -> 40 > 30 -> go right to 40
40 vs 40 -> equal -> FOUND  (3 comparisons)

Tracing — search 45:

45 vs 50 -> left to 30
45 vs 30 -> right to 40
45 vs 40 -> right to null -> NOT FOUND

▪ Deletion

Deletion has three cases, in increasing difficulty.

Case 1 — Node is a leaf (no children): simply remove it (set parent link to null).

Delete 20:
     30            30
    /     -->        \
  20  40             40

Case 2 — Node has one child: replace the node with its single child.

Delete 30 (it has one child 40 after 20 removed):
     50               50
    /        -->     /
  30               40
     \
      40

Case 3 — Node has two children: This is the tricky one.

Find the node's inorder successor = smallest value in its right subtree (go right once, then keep going left). (Equivalently, use the inorder predecessor = largest in left subtree.)
Copy the successor's value into the node.
Delete the successor from the right subtree (it has at most one child, so it reduces to Case 1 or 2).

Delete 50 (two children) from:
            50                         60
          /    \                     /    \
        30      70      ===>       30      70
       /  \    /  \               /  \       \
      20  40  60   80            20  40       80

Inorder successor of 50 = 60 (smallest in right subtree).
Copy 60 into root, then delete original 60 (a leaf).

Algorithm:

delete(node, key):
    if node == null: return null
    if key < node.key:        node.left  = delete(node.left, key)
    else if key > node.key:   node.right = delete(node.right, key)
    else:                      // found the node
        if node.left == null:  return node.right     // Case 1/2
        if node.right == null: return node.left       // Case 2
        // Case 3: two children
        successor = minValue(node.right)
        node.key  = successor
        node.right = delete(node.right, successor)
    return node

minValue(node):
    while node.left != null: node = node.left
    return node.key

Java:

Node delete(Node node, int key) {
    if (node == null) return null;

    if (key < node.key) {
        node.left = delete(node.left, key);
    } else if (key > node.key) {
        node.right = delete(node.right, key);
    } else {
        // node to be deleted found
        if (node.left == null)  return node.right;   // 0 or 1 (right) child
        if (node.right == null) return node.left;    // 1 (left) child

        // two children: get inorder successor (smallest in right subtree)
        int successor = minValue(node.right);
        node.key = successor;                        // copy value
        node.right = delete(node.right, successor);  // delete successor
    }
    return node;
}

int minValue(Node node) {
    int min = node.key;
    while (node.left != null) { node = node.left; min = node.key; }
    return min;
}
void delete(int key) { root = delete(root, key); }

Complete runnable BST program

public class BSTDemo {
    static class Node {
        int key; Node left, right;
        Node(int k){ key = k; }
    }
    Node root;

    void insert(int key){ root = insert(root, key); }
    Node insert(Node n, int key){
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    boolean search(int key){
        Node c = root;
        while (c != null){
            if (key == c.key) return true;
            c = key < c.key ? c.left : c.right;
        }
        return false;
    }

    void delete(int key){ root = delete(root, key); }
    Node delete(Node n, int key){
        if (n == null) return null;
        if (key < n.key) n.left = delete(n.left, key);
        else if (key > n.key) n.right = delete(n.right, key);
        else {
            if (n.left == null) return n.right;
            if (n.right == null) return n.left;
            int s = minValue(n.right);
            n.key = s;
            n.right = delete(n.right, s);
        }
        return n;
    }
    int minValue(Node n){ int m = n.key; while(n.left!=null){n=n.left; m=n.key;} return m; }

    void inorder(Node n){ if(n==null)return; inorder(n.left); System.out.print(n.key+" "); inorder(n.right); }

    public static void main(String[] args){
        BSTDemo t = new BSTDemo();
        int[] data = {50,30,70,20,40,60,80};
        for (int x : data) t.insert(x);

        System.out.print("Inorder (sorted): "); t.inorder(t.root); System.out.println();
        System.out.println("Search 40: " + t.search(40));   // true
        System.out.println("Search 45: " + t.search(45));   // false

        t.delete(50);   // delete root (two children)
        System.out.print("After deleting 50: "); t.inorder(t.root); System.out.println();
    }
}
/* Expected output:
   Inorder (sorted): 20 30 40 50 60 70 80
   Search 40: true
   Search 45: false
   After deleting 50: 20 30 40 60 70 80
*/

Operation	Average	Worst (skewed)
Search	O(log n)	O(n)
Insert	O(log n)	O(n)
Delete	O(log n)	O(n)

6.8 Tree Height, Level and Depth

These three measures are often confused. Use this tree:

                 A          <- Level 0 / Depth 0
               /   \
              B     C       <- Level 1 / Depth 1
             / \
            D   E           <- Level 2 / Depth 2
               /
              F             <- Level 3 / Depth 3

Level of a node = number of edges from the root to that node. Root is level 0.
- level(A)=0, level(B)=1, level(D)=2, level(F)=3.
Depth of a node = same as level — edges from the root down to the node.
- depth(F)=3.
Height of a node = number of edges on the longest downward path from the node to a leaf.
- height(F)=0 (it is a leaf), height(B)=2 (B→E→F), height(A)=3.
Height of the tree = height of the root = 3 here.
Depth of the tree = max depth of any node = 3 (same number; just measured from opposite ends).

Memory hook: Depth counts down from the top; Height counts up from the bottom. For a single empty tree height is usually defined as −1, and a single node has height 0.

Java — height (recursive):

int height(Node n) {
    if (n == null) return -1;                  // empty tree: -1 (some texts use 0)
    int lh = height(n.left);
    int rh = height(n.right);
    return 1 + Math.max(lh, rh);
}

Tracing height(A):

height(A) = 1 + max(height(B), height(C))
height(B) = 1 + max(height(D), height(E))
height(D) = 1 + max(-1,-1) = 0
height(E) = 1 + max(height(F), -1)
height(F) = 1 + max(-1,-1) = 0
height(E) = 1 + max(0,-1) = 1
height(B) = 1 + max(0,1)  = 2
height(C) = 1 + max(-1,-1)= 0
height(A) = 1 + max(2,0)  = 3   ✔

Java — depth of a target node (edges from root):

int depth(Node n, int key, int d) {
    if (n == null) return -1;
    if (n.key == key) return d;
    int left = depth(n.left, key, d + 1);
    if (left != -1) return left;
    return depth(n.right, key, d + 1);
}
// call: depth(root, key, 0)

Useful relations for n nodes:

Maximum nodes at level L = 2^L.
A tree of height h has at most 2^(h+1) − 1 nodes.
Minimum possible height for n nodes = ⌈log₂(n+1)⌉ − 1 (i.e. a balanced tree); maximum = n − 1 (a skewed tree).

6.9 Balanced Tree: AVL Balanced Trees

The problem with plain BSTs

If we insert sorted data 10, 20, 30, 40, 50 into a BST, it degenerates into a right-skewed list:

Search now costs O(n) — no better than a linked list. We need to keep the tree balanced.

What is an AVL Tree?

An AVL tree (named after inventors Adelson-Velsky and Landis, 1962) is a self-balancing BST. For every node it maintains the Balance Factor (BF) invariant:

BF(node) = height(left subtree) − height(right subtree), and BF ∈ {−1, 0, +1}.

If after an insertion or deletion any node's BF becomes +2 or −2, the tree is rebalanced by rotations.

   Balanced (BF shown)             Unbalanced (BF = +2 at 30)
        20(0)                          30(+2)
       /    \                          /
    10(0)   30(0)                   20(+1)
                                     /
                                  10(0)

The four rotation cases

There are exactly four imbalance patterns, named by the path from the unbalanced node to the newly inserted node.

1. Left-Left (LL) → single Right rotation

Inserted into the left subtree of the left child. BF of node = +2, BF of its left child = +1.

      z(+2)                          y(0)
     /                              /   \
    y(+1)        Right Rotate     x      z
   /             ----------->
  x

Right rotation about z: y becomes the new root, z becomes y's right child, and y's old right subtree (T2) becomes z's left child.

Node rightRotate(Node z) {
    Node y = z.left;
    Node T2 = y.right;
    y.right = z;          // rotate
    z.left  = T2;         // hand over the orphaned subtree
    updateHeight(z);      // update z first (now lower)
    updateHeight(y);
    return y;             // new subtree root
}

2. Right-Right (RR) → single Left rotation

Inserted into the right subtree of the right child. Mirror of LL.

   z(-2)                              y(0)
      \                              /   \
       y(-1)       Left Rotate      z     x
        \          ---------->
         x

Node leftRotate(Node z) {
    Node y = z.right;
    Node T2 = y.left;
    y.left  = z;
    z.right = T2;
    updateHeight(z);
    updateHeight(y);
    return y;
}

3. Left-Right (LR) → Left rotate child, then Right rotate node

Inserted into the right subtree of the left child.

    z(+2)            z(+2)               x(0)
   /                /                   /   \
  y(-1)  L(y)      x         R(z)      y     z
   \      -->     /          -->
    x            y

First left-rotate y, converting it to an LL case, then right-rotate z.

4. Right-Left (RL) → Right rotate child, then Left rotate node

Mirror of LR: first right-rotate the right child, then left-rotate the node.

Choosing the rotation (decision rule)

After inserting key, walk back up; at the first node with |BF| = 2:
  if BF = +2 and key < node.left.key   -> LL  (rightRotate)
  if BF = +2 and key > node.left.key   -> LR  (leftRotate left child, then rightRotate)
  if BF = -2 and key > node.right.key  -> RR  (leftRotate)
  if BF = -2 and key < node.right.key  -> RL  (rightRotate right child, then leftRotate)

Complete AVL insertion program (Java)

public class AVLTree {
    static class Node {
        int key, height;
        Node left, right;
        Node(int k){ key = k; height = 1; }   // new node height = 1 (leaf)
    }
    Node root;

    int height(Node n){ return n == null ? 0 : n.height; }
    int max(int a,int b){ return a>b?a:b; }
    void updateHeight(Node n){ n.height = 1 + max(height(n.left), height(n.right)); }
    int balanceFactor(Node n){ return n == null ? 0 : height(n.left) - height(n.right); }

    Node rightRotate(Node z){
        Node y = z.left, T2 = y.right;
        y.right = z; z.left = T2;
        updateHeight(z); updateHeight(y);
        return y;
    }
    Node leftRotate(Node z){
        Node y = z.right, T2 = y.left;
        y.left = z; z.right = T2;
        updateHeight(z); updateHeight(y);
        return y;
    }

    Node insert(Node node, int key){
        // 1. normal BST insert
        if (node == null) return new Node(key);
        if (key < node.key)      node.left  = insert(node.left, key);
        else if (key > node.key) node.right = insert(node.right, key);
        else return node;                       // no duplicates

        // 2. update height of this ancestor
        updateHeight(node);

        // 3. get balance factor and rebalance
        int bf = balanceFactor(node);

        if (bf > 1 && key < node.left.key)   return rightRotate(node);           // LL
        if (bf < -1 && key > node.right.key) return leftRotate(node);            // RR
        if (bf > 1 && key > node.left.key){                                      // LR
            node.left = leftRotate(node.left);
            return rightRotate(node);
        }
        if (bf < -1 && key < node.right.key){                                    // RL
            node.right = rightRotate(node.right);
            return leftRotate(node);
        }
        return node;                            // already balanced
    }
    void insert(int key){ root = insert(root, key); }

    void preorder(Node n){ if(n==null)return; System.out.print(n.key+" "); preorder(n.left); preorder(n.right); }

    public static void main(String[] args){
        AVLTree t = new AVLTree();
        int[] keys = {10, 20, 30, 40, 50, 25};
        for (int k : keys) t.insert(k);
        System.out.print("Preorder of balanced AVL: ");
        t.preorder(t.root);     // 30 20 10 25 40 50
    }
}

Full tracing — insert `10, 20, 30, 40, 50, 25`

Insert 10, 20: balanced so far.

   10
     \
      20

Insert 30: node 10 now has BF = −2 (RR case). Left-rotate about 10:

   10(-2)                  20
     \         RR          /  \
      20       -->       10    30
        \
         30

Insert 40: goes right of 30. Tree still balanced.

Insert 50: node 30 gets BF = −2 (RR). Left-rotate about 30:

        20                       20
       /  \                     /  \
     10    30(-2)     RR      10    40
             \         -->          /  \
              40                  30    50
                \
                 50

Insert 25: goes left of 30. Now node 20 has BF = −2 and 25 < 40 (its right child path goes left first) → RL case.

        20(-2)
       /    \
     10      40(+1)
            /    \
          30      50
         /
       25

Step A — right-rotate the right child (40):

        20(-2)
       /    \
     10      30
               \
                40
               /  \
             25*   50      (* 25 attaches under 40's left after rotation arranging)

Step B — left-rotate about 20 → final balanced tree:

            30
          /    \
        20      40
       /  \       \
     10    25      50

Preorder = 30 20 10 25 40 50. ✔ matches program output. Every node now has BF in {−1,0,+1}.

AVL properties & performance

Height of an AVL tree with n nodes is always O(log n) (≈ 1.44 log₂ n in the worst case).
Search, insert, delete are all guaranteed O(log n).
Each insertion needs at most one single or double rotation; deletion may need rotations at multiple levels up to the root.
More rigidly balanced than Red-Black trees → faster lookups, but slightly more rotations on insert/delete.

6.10 The Huffman Algorithm

Huffman coding is a greedy, lossless data-compression technique invented by David Huffman (1952). It assigns shorter binary codes to frequently occurring characters and longer codes to rare ones, minimizing the total number of bits.

It produces a prefix code: no code is a prefix of another, so the encoded stream can be decoded unambiguously without separators.

The greedy idea

Count the frequency of each character.
Treat each character as a leaf node carrying its frequency.
Repeatedly take the two nodes with the smallest frequencies, merge them into a new internal node whose frequency is their sum, and put it back.
Continue until a single tree (the Huffman tree) remains.
Assign 0 to every left edge and 1 to every right edge; the code of a character is the path from root to its leaf.

Algorithm (pseudocode)

buildHuffman(chars[], freq[]):
    create a min-priority-queue PQ of leaf nodes (key = frequency)
    while PQ has more than one node:
        left  = PQ.extractMin()
        right = PQ.extractMin()
        merged = new Node('$', left.freq + right.freq)   // internal node
        merged.left  = left
        merged.right = right
        PQ.insert(merged)
    root = PQ.extractMin()
    generateCodes(root, "")        // left += "0", right += "1"

Worked example — string `"BCAADDDCCACACAC"`

Step 1 — frequencies:

A : 5
B : 1
C : 6
D : 3
Total characters = 15

Step 2 — build the tree (always merge two smallest):

Initial leaves (sorted):  B(1)  D(3)  A(5)  C(6)

Merge B(1)+D(3) = N1(4):
        N1(4)
        /   \
      B(1)  D(3)
Queue now: N1(4)  A(5)  C(6)

Merge N1(4)+A(5) = N2(9):
            N2(9)
           /     \
        N1(4)    A(5)
        /   \
      B(1)  D(3)
Queue now: C(6)  N2(9)

Merge C(6)+N2(9) = ROOT(15):
                 ROOT(15)
                /        \
             C(6)        N2(9)
                        /     \
                     N1(4)    A(5)
                     /   \
                   B(1)  D(3)

Step 3 — assign codes (left=0, right=1):

                 (15)
              0 /    \ 1
            C(6)     (9)
                  0 /   \ 1
                (4)      A(5)
              0 /  \ 1
            B(1)   D(3)

Char	Freq	Code	Bits
C	6	`0`	1
A	5	`11`	2
B	1	`100`	3
D	3	`101`	3

Step 4 — total cost:

Bits = 6*1 + 5*2 + 1*3 + 3*3 = 6 + 10 + 3 + 9 = 28 bits.
Fixed-length code (2 bits/char for 4 symbols) = 15 * 2 = 30 bits... 
With 4 symbols actually need ceil(log2 4)=2 bits -> 30 bits.
ASCII (8 bits/char) = 15 * 8 = 120 bits.
Huffman = 28 bits  =>  big saving over ASCII.

Decoding "100" → walk from root: 1→right(9), 0→left(4), 0→left → B. Prefix property guarantees a unique decode.

Java implementation

import java.util.*;

public class HuffmanCoding {
    static class Node {
        char ch; int freq;
        Node left, right;
        Node(char ch, int freq){ this.ch = ch; this.freq = freq; }
        Node(int freq, Node l, Node r){ this.ch='$'; this.freq=freq; left=l; right=r; }
    }

    static void generateCodes(Node node, String code, Map<Character,String> map){
        if (node == null) return;
        if (node.left == null && node.right == null) {   // leaf
            map.put(node.ch, code.isEmpty() ? "0" : code);
            return;
        }
        generateCodes(node.left,  code + "0", map);
        generateCodes(node.right, code + "1", map);
    }

    public static void main(String[] args){
        String text = "BCAADDDCCACACAC";

        // 1. count frequencies
        Map<Character,Integer> freq = new HashMap<>();
        for (char c : text.toCharArray())
            freq.merge(c, 1, Integer::sum);

        // 2. min-heap by frequency
        PriorityQueue<Node> pq = new PriorityQueue<>((a,b) -> a.freq - b.freq);
        for (var e : freq.entrySet())
            pq.add(new Node(e.getKey(), e.getValue()));

        // 3. build tree
        while (pq.size() > 1){
            Node left = pq.poll();
            Node right = pq.poll();
            pq.add(new Node(left.freq + right.freq, left, right));
        }
        Node root = pq.poll();

        // 4. generate + print codes
        Map<Character,String> codes = new HashMap<>();
        generateCodes(root, "", codes);

        System.out.println("Char | Freq | Code");
        for (var e : freq.entrySet())
            System.out.println("  " + e.getKey() + "  |  " + e.getValue() + "   | " + codes.get(e.getKey()));

        // 5. encoded length
        int bits = 0;
        for (char c : text.toCharArray()) bits += codes.get(c).length();
        System.out.println("Encoded length = " + bits + " bits (ASCII would be " + text.length()*8 + ")");
    }
}

Complexity: O(n log n) where n is the number of distinct characters (each heap operation is log n).

Applications: ZIP/GZIP, JPEG, MP3, PNG, and as the entropy-coding stage of many compression formats.

6.11 B-Tree

A B-Tree is a self-balancing search tree generalised so that a node can hold many keys and have many children. It is designed for systems that read and write large blocks of data — databases and file systems — where minimizing disk accesses matters more than CPU comparisons.

While a BST/AVL node has 1 key and ≤2 children, a B-Tree node packs many keys, so the tree stays very short (shallow) even for millions of records → fewer disk reads.

Properties of a B-Tree of order `m`

A B-Tree of order m (max children = m) satisfies:

Every node has at most m children and at most m − 1 keys.
Every non-root internal node has at least ⌈m/2⌉ children (i.e. at least ⌈m/2⌉ − 1 keys).
The root has at least 1 key (at least 2 children unless it is a leaf).
Keys inside a node are kept in sorted order.
A node with k keys has exactly k + 1 children.
All leaf nodes are at the same level → the tree is always height-balanced.

Keys act as separators: child i holds all keys between key[i−1] and key[i].

 Order-3 (2-3 tree) B-Tree, keys sorted, leaves on one level:

                 [ 17 | 35 ]
              /      |       \
        [ 8 12 ]  [ 22 30 ]  [ 40 52 ]

Keys < 17 go to the first child; between 17 and 35 to the second; > 35 to the third.

Insertion (with node splitting)

Insertion always happens at a leaf. If the leaf overflows (gets m keys), it splits: the median key moves up to the parent, and the node divides into two.

Algorithm:

insert(key):
    find the leaf where key belongs (descend like a BST using ranges)
    insert key into that leaf in sorted position
    if leaf now has m keys (overflow):
        split: median key goes up to parent; left/right halves become two nodes
        if parent overflows too, split recursively up (root may split -> tree grows taller)

Worked example — build an order-3 B-Tree by inserting `10, 20, 5, 6, 12, 30, 7, 17`

Order 3 ⇒ max 2 keys per node; a 3rd key forces a split (median goes up).

Insert 10:        [10]

Insert 20:        [10 20]

Insert 5:  -> [5 10 20] overflow! median=10 goes up, split:
                   [10]
                  /    \
                [5]    [20]

Insert 6:  6 < 10 -> left leaf -> [5 6]
                   [10]
                  /    \
                [5 6]  [20]

Insert 12: 12 between 10,? -> right leaf -> [12 20]
                   [10]
                  /    \
                [5 6]  [12 20]

Insert 30: right leaf -> [12 20 30] overflow! median=20 up:
                   [10 20]
                 /   |    \
              [5 6] [12]  [30]

Insert 7:  7 between 5,6? -> 7>6 -> [5 6 7] overflow! median=6 up to root:
                   [6 10 20]              overflow root! (3 keys, order 3)
                 /  |    |   \
              [5] [7] [12]  [30]
   Root has 3 keys -> split, median=10 becomes new root:
                     [10]
                    /    \
                [6]       [20]
               /   \     /    \
             [5]   [7] [12]   [30]

Insert 17: 17 between 10,20 -> child [12] -> [12 17]
                     [10]
                    /    \
                [6]       [20]
               /   \     /    \
             [5]   [7] [12 17] [30]

Notice all leaves remain on the same level at every step — the hallmark of a B-Tree.

Searching a B-Tree

search(node, key):
    i = 0
    while i < node.numKeys and key > node.keys[i]: i++
    if i < node.numKeys and key == node.keys[i]: return found
    if node is leaf: return not found
    return search(node.child[i], key)     // descend into correct child

Within a node, a linear or binary scan finds the slot; then we descend into exactly one child. Height is O(log_m n), so searches need very few disk reads.

Deletion (overview)

Deletion is the mirror of insertion and must avoid underflow (a node dropping below ⌈m/2⌉ − 1 keys):

From a leaf with enough keys: just remove it.
Underflow: borrow a key from a sibling (rotation through the parent) if the sibling has spare keys; otherwise merge with a sibling and pull a separator key down from the parent (may cascade upward and shrink the tree's height).
From an internal node: replace the key with its inorder predecessor/successor (from a leaf), then fix any resulting leaf underflow.

B-Tree vs Binary Search Tree

Aspect	BST / AVL	B-Tree
Keys per node	1	up to m−1 (many)
Children per node	≤ 2	up to m
Height for n keys	O(log₂ n)	O(log_m n) — much shorter
Designed for	In-memory	Disk / database blocks
Balance	AVL via rotations	Always balanced via split/merge

B+ Tree (note)

A common variant: in a B+ Tree all actual data records live in the leaves, internal nodes hold only keys for routing, and the leaves are linked together in a list for fast range scans. This is what most relational-database indexes actually use.

Quick Revision & Exam Pointers

Must-remember formulas

Max nodes at level L = 2^L; max nodes in height-h tree = 2^(h+1) − 1.
Strictly binary with n leaves → 2n − 1 total nodes.
Extended tree: n internal nodes → n + 1 external nodes; E = I + 2n.
Leaf/degree relation: n₀ = n₂ + 1.
Array rep (root at 1): left = 2i, right = 2i+1, parent = i/2.

Traversals on A(B(D,E),C(,F(G)))

Pre = A B D E C F G, In = D B E A C G F, Post = D E B G F C A.
Inorder of a BST = sorted order (most-asked fact).

BST deletion cases

Leaf → remove. One child → replace with child. Two children → replace with inorder successor (min of right subtree), then delete that successor.

AVL

Balance Factor = h(left) − h(right) ∈ {−1, 0, +1}.
LL → right rotate; RR → left rotate; LR → left then right; RL → right then left.

Huffman

Greedy: repeatedly merge two smallest frequencies; left = 0, right = 1; prefix code; minimizes total bits.

B-Tree

Order m: ≤ m children, ≤ m−1 keys, ≥ ⌈m/2⌉ children (non-root); all leaves at the same level; insert splits (median up), delete borrows/merges.

Complexity cheat-sheet

Structure	Search	Insert	Delete
BST (avg)	O(log n)	O(log n)	O(log n)
BST (worst, skewed)	O(n)	O(n)	O(n)
AVL	O(log n)	O(log n)	O(log n)
B-Tree	O(log_m n)	O(log_m n)	O(log_m n)

End of Unit VI — Trees.

Part - 6 Tree

Unit VI: Trees

Data Structures and Algorithms — Complete Chapter Notes (Java)

Table of Contents

6.1 Introduction

Why do we need trees?

Formal definition

6.2 Tree Terminology

6.3 Binary Tree

Properties of a binary tree

▪ Strictly Binary Tree (Full / Proper Binary Tree)

▪ Complete Binary Tree

▪ Extended Binary Tree (2-Tree)

6.4 Binary Tree Representation

▪ Array Representation of a Binary Tree

▪ Linked List Representation of a Binary Tree

6.5 Create a Binary Tree

Method 1 — Manual construction

Method 2 — Build by inserting into a queue (level-order build)

6.6 Traversal of a Binary Tree

▪ Preorder Traversal (Root, Left, Right)

▪ Inorder Traversal (Left, Root, Right)

▪ Postorder Traversal (Left, Right, Root)

▪ Level-order Traversal (Breadth-First) — bonus

Summary of all four on the sample tree

Iterative inorder (using an explicit stack)

Reconstructing a tree from traversals (classic exam question)

6.7 Binary Search Tree (BST)

▪ Insertion

▪ Search

▪ Deletion

Complete runnable BST program

6.8 Tree Height, Level and Depth

6.9 Balanced Tree: AVL Balanced Trees

The problem with plain BSTs

What is an AVL Tree?

The four rotation cases

1. Left-Left (LL) → single Right rotation

2. Right-Right (RR) → single Left rotation

3. Left-Right (LR) → Left rotate child, then Right rotate node

4. Right-Left (RL) → Right rotate child, then Left rotate node

Choosing the rotation (decision rule)

Complete AVL insertion program (Java)

Full tracing — insert 10, 20, 30, 40, 50, 25

AVL properties & performance

6.10 The Huffman Algorithm

The greedy idea

Algorithm (pseudocode)

Worked example — string "BCAADDDCCACACAC"

Java implementation

6.11 B-Tree

Properties of a B-Tree of order m

Insertion (with node splitting)

Worked example — build an order-3 B-Tree by inserting 10, 20, 5, 6, 12, 30, 7, 17

Searching a B-Tree

Deletion (overview)

B-Tree vs Binary Search Tree

B+ Tree (note)

Quick Revision & Exam Pointers

Full tracing — insert `10, 20, 30, 40, 50, 25`

Worked example — string `"BCAADDDCCACACAC"`

Properties of a B-Tree of order `m`

Worked example — build an order-3 B-Tree by inserting `10, 20, 5, 6, 12, 30, 7, 17`