Introduction to Data Structures and Aglorithms using CPP

Algorithms and Data Structures in C++: A Discrete Mathematics Perspective

In this post, we explore fundamental algorithms and data structures in C++ through the lens of discrete mathematics. We’ll implement core logic structures in C++ and use formal mathematical tools—like induction, recurrence relations, and set theory—to justify their behavior and efficiency.

Target Audience: Students and developers seeking to ground their algorithmic thinking in formal math, especially for competitive programming, software engineering, or academic study.


1. Binary Search and Proof of Correctness

C++ Implementation

int binarySearch(const std::vector<int>& arr, int target) {
    int low = 0, high = arr.size() - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (arr[mid] == target) return mid;
        else if (arr[mid] < target) low = mid + 1;
        else high = mid - 1;
    }
    return -1;
}

Loop Invariant Proof

Let’s define the loop invariant:

\[\text{At the beginning of each iteration, if the target is in the array, then it must be in } [\text{low}, \text{high}]\]

Proof Outline:

  • Initialization: Initially, low = 0 and high = n-1. All elements are in scope.
  • Maintenance: Each iteration halves the interval while preserving the invariant.
  • Termination: When low > high, the interval is empty and the element is not found.

This satisfies the requirements for partial correctness and termination, proving correctness.


2. Merge Sort and Recurrence Relation

C++ Implementation

void mergeSort(std::vector<int>& arr) {
    if (arr.size() <= 1) return;
    int mid = arr.size() / 2;
    std::vector<int> left(arr.begin(), arr.begin() + mid);
    std::vector<int> right(arr.begin() + mid, arr.end());
    mergeSort(left);
    mergeSort(right);
    std::merge(left.begin(), left.end(), right.begin(), right.end(), arr.begin());
}

Recurrence Relation and Solution

Let $begin:math:text$ T(n) $end:math:text$ be the time complexity of merge sort:

\[T(n) = 2T\left(\frac{n}{2}\right) + O(n)\]

This fits the Master Theorem where:

  • $begin:math:text$ a = 2 $end:math:text$, $begin:math:text$ b = 2 $end:math:text$, $begin:math:text$ f(n) = O(n) $end:math:text$
  • $begin:math:text$ f(n) = \Theta(n^{\log_b a}) = \Theta(n) $end:math:text$

Therefore:

\[T(n) = \Theta(n \log n)\]

3. Depth-First Search and Tree Properties

C++ Implementation

void dfs(int node, std::vector<std::vector<int>>& adj, std::vector<bool>& visited) {
    visited[node] = true;
    for (int neighbor : adj[node]) {
        if (!visited[neighbor]) dfs(neighbor, adj, visited);
    }
}

Tree Property from Graph Theory

Let \(G = (V, E)\) be an undirected graph.

Claim: If \(G\) is a tree, then:

\[|E| = |V| - 1\]

Proof by Induction:

Base case: For one node, $$ V = 1\(and\) E = 0$$.
Inductive step: Assume the property holds for \(k\) nodes. Adding one new node with exactly one edge maintains acyclicity and increases both $$ V \(and\) E $$ by 1.

Hence, DFS traversal of a tree with \(n\) nodes visits every node in \(O(n)\) time and encounters no cycles.


4. Stack and Recursion: Fibonacci Sequence

C++ Recursive Implementation

int fib(int n) {
    if (n <= 1) return n;
    return fib(n - 1) + fib(n - 2);
}

Recurrence Relation

Let $begin:math:text$ F(n) $end:math:text$ be the time complexity of the naive recursive function:

\[T(n) = T(n-1) + T(n-2) + O(1)\]

This solves to exponential time:

\[T(n) = \Theta(2^n)\]

Improved with Dynamic Programming

int fibDP(int n) {
    std::vector<int> dp(n+1);
    dp[0] = 0; dp[1] = 1;
    for (int i = 2; i <= n; ++i)
        dp[i] = dp[i-1] + dp[i-2];
    return dp[n];
}

This reduces time complexity to:

\[T(n) = \Theta(n)\]

5. Set Theory and Hash Tables

C++ Implementation

#include <unordered_set>

std::unordered_set<int> seen;
seen.insert(42);
if (seen.find(99) == seen.end()) {
    seen.insert(99);
}

Mathematical Abstraction

A hash table is a function:

\[h: K \rightarrow \{0, 1, \ldots, m - 1\}\]

Where $begin:math:text$ K $end:math:text$ is the universe of keys, and $begin:math:text$ m $end:math:text$ is the size of the table.

  • We want the hash function $begin:math:text$ h $end:math:text$ to be uniform:
\[\forall x, y \in K,\, x \ne y \Rightarrow \Pr[h(x) = h(y)] \ll 1\]

When collisions occur, we resolve them via chaining or open addressing. Lookup and insert operations are $begin:math:text$ \Theta(1) $end:math:text$ on average.


6. Big-O and Asymptotic Bounds

Discrete math enables formal definitions of complexity:

Let $begin:math:text$ f(n) = 5n^2 + 3n + 7 $end:math:text$

We say:

\[f(n) = O(n^2)\]

If:

\[\exists\, c > 0,\, \exists\, n_0 > 0 \text{ such that } \forall n \ge n_0,\, f(n) \le c \cdot n^2\]

This ensures we can compare growth rates of algorithms using asymptotic notation.


Conclusion

Discrete mathematics gives us the language to reason rigorously about algorithms. Whether it’s proving correctness via loop invariants, analyzing time via recurrence, or structuring data using sets and graphs, math turns C++ code into certifiable logic.

Stay tuned for Part II, where we model graph theory algorithms like Dijkstra’s and minimum spanning trees with proofs of optimality.


Tags: #Algorithms #C++ #DiscreteMathematics #Proofs #DataStructures #CSTheory

Introduction to Data Structures and Aglorithms using CPP
Introduction to Data Structures and Aglorithms using CPP