Binary Search

Binary search is an efficient algorithm for finding a specific element in a sorted array or list. The key principle behind binary search is to divide the search interval in half with each step, significantly reducing the time it takes to find the target element compared to linear search, especially in large datasets. Here's a step-by-step explanation using an example with 10 elements:

  1. Array: First, we need a sorted array. Let's say our array is [1, 3, 4, 7, 9, 11, 12, 14, 17, 19], and we want to search for the number 11.

  2. Initial Variables: We start with two pointers, usually named low and high. Initially, low is set to the first index (0) and high is set to the last index (9 in this case).

  3. Middle Element: We find the middle element of the array. The middle index, mid, is calculated as (low + high) / 2. In our case, it's (0 + 9) / 2 = 4.5, which we round down to 4. The middle element is the one at index 4, which is 9.

  4. Comparison: We compare the middle element with the target value (11). Since 9 is less than 11, we discard the left half of the array including the middle element. Our new search space is [11, 12, 14, 17, 19].

  5. Update Pointers: We update low to be mid + 1 (which is now 5) and keep high as it is (9).

  6. Repeat Process: We repeat the process - find the middle of the new range, compare it with the target, and decide which half to keep. The new middle element is at index (5 + 9) / 2 = 7, which is 14.

  7. Comparison Again: Since 14 is greater than 11, we discard the right half of the current array. Our search space is now reduced to just [11].

  8. Final Step: We repeat the process. The middle of our new range [11] is 11, which matches our target.

  9. Conclusion: Since we've found the target, we conclude the search. If at any point our low exceeds high, it means the element isn't in the array.

Let's take another example of an array with 20 elements. Consider the sorted array:

[2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40]

Suppose we want to search for the number 26.

Initial Array:
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40]
 ^                                ^                                       ^
low                              mid                                     high

Step 1: Since 26 > 20 (element at mid), discard left half including mid
[22, 24, 26, 28, 30, 32, 34, 36, 38, 40]
 ^                ^                  ^
low              mid                high

Step 2: Since 26 < 30 (element at mid), discard right half including mid
[22, 24, 26, 28]
 ^    ^       ^
low  mid     high

Step 3: Since 26 > 24 (element at mid), discard left half including mid
[26, 28]
  ^
 mid

Finally, we find 26 at mid.

Binary search halves the search space at each step, significantly reducing the number of comparisons needed to find the target or determine its absence. This makes binary search very efficient, especially for large datasets.

Let's write the pseudo-code for binary search and explain each step. The algorithm is applied on a sorted array.

Pseudo Code

function binarySearch(array, target)
    low = 0
    high = length(array) - 1

    while low <= high
        mid = (low + high) / 2

        if array[mid] == target
            return mid  // Target found

        if array[mid] < target
            low = mid + 1  // Search in the right half

        else
            high = mid - 1  // Search in the left half

    return -1  // Target not found

Explanation

  1. Initialize Variables:

    • low and high are pointers to track the current search interval. Initially, low is set to the start of the array (index 0), and high is set to the end of the array (index length(array) - 1).
  2. Binary Search Loop:

    • The while loop continues as long as low is less than or equal to high, meaning there's still a range to search.
  3. Calculate Midpoint:

    • The midpoint mid is calculated as the average of low and high. This is the index we check against the target.
  4. Check Midpoint:

    • If array[mid] is equal to the target, the function returns mid. This means the target has been found.

    • If array[mid] is less than the target, the target must be in the right half of the array. So, we set low to mid + 1.

    • If array[mid] is greater than the target, the target must be in the left half. So, we set high to mid - 1.

  5. Target Not Found:

    • If the loop exits without returning, it means the target is not in the array. We return -1 or some other indicator of failure.

This algorithm is efficient for large datasets because it halves the search space with each step, having a time complexity of O(log n), where n is the number of elements in the array. However, it's crucial that the array is sorted before applying binary search.

Below is an example of a binary search algorithm implemented in C.

#include <stdio.h>

int binarySearch(int arr[], int size, int target) {
    int low = 0;
    int high = size - 1;

    while (low <= high) {
        int mid = low + (high - low) / 2;

        if (arr[mid] == target) {
            return mid;  // Target found
        }

        if (arr[mid] < target) {
            low = mid + 1;  // Search in the right half
        } else {
            high = mid - 1;  // Search in the left half
        }
    }

    return -1;  // Target not found
}

int main() {
    int arr[] = {2, 3, 4, 10, 14, 23, 31, 45, 56, 65};
    int size = sizeof(arr) / sizeof(arr[0]);
    int target = 23;
    int result = binarySearch(arr, size, target);

    if (result != -1) {
        printf("Element is found at index %d\n", result);
    } else {
        printf("Element not found in the array\n");
    }

    return 0;
}

Explanation

  1. Include Directive: #include <stdio.h> is used to include the Standard Input Output header file, which is required for using the printf function.

  2. The binarySearch Function:

    • It takes three parameters: the array arr, its size, and the target element to search for.

    • Variables low and high are initialized to mark the beginning and end of the array.

    • A while loop runs as long as low is less than or equal to high.

    • Inside the loop, mid is calculated as the average of low and high. To avoid potential overflow, it's calculated as low + (high - low) / 2 instead of (low + high) / 2.

    • If arr[mid] is equal to the target, the function returns the index mid.

    • If arr[mid] is less than the target, the search continues in the right half by setting low to mid + 1.

    • If arr[mid] is greater than the target, the search continues in the left half by setting high to mid - 1.

    • If the target is not found, the function returns -1.

  3. The main Function:

    • Defines an array arr and its size size.

    • Specifies the target element to search for.

    • Calls the binarySearch function and stores the result in result.

    • Prints the index where the element is found or a message if it's not found.

  4. Return Statement: The main function returns 0, indicating successful program termination.

Complexity analysis of the binary search algorithm involves examining both its time complexity and space complexity:

Time Complexity

  1. Best Case: O(1)

    • The best-case scenario occurs when the target element is at the midpoint of the array during the first iteration. In this case, only one comparison is needed, and the algorithm terminates immediately.
  2. Average and Worst Case: O(log n)

    • In each step of binary search, the array is effectively halved. This means that the time to search the array grows logarithmically with the size of the array.

    • Mathematically, at most (\log_2 n + 1) comparisons are needed, where ( n ) is the number of elements in the array. This logarithmic growth makes binary search very efficient for large datasets.

Space Complexity

  1. Iterative Approach (as in the provided C code): O(1)

    • The iterative version of binary search, like the one you have in the C code, has a constant space complexity because it uses a fixed amount of space (for variables low, high, mid, etc.) regardless of the input size.

    • There are no additional data structures used that grow with input size.

  2. Recursive Approach: O(log n)

    • If binary search is implemented using recursion, the space complexity can become O(log n) due to the use of the call stack. Each recursive call adds a layer to the stack, and there can be up to (\log_2 n) calls in the worst case.

Binary search is highly efficient in terms of time complexity, especially compared to linear search, which has a time complexity of O(n). The space efficiency is also excellent in the iterative approach, making it a preferred method for searching in sorted arrays or lists.