Merge Sort is one of the most efficient and widely used sorting algorithms. It follows the divide-and-conquer approach, which means it breaks down a problem into smaller subproblems, solves them, and then combines the results. Merge Sort is particularly useful because it guarantees a time complexity of O(n log n)
in all cases, making it a reliable choice for sorting large datasets.
In this article, we will briefly touch upon how Merge Sort works, go through it’s implementation in python and understand the implementation.
How Merge Sort Works
Merge Sort works by recursively dividing the input array into smaller subarrays until each subarray contains only one element. Then, it keeps merging these subarrays in a sorted order. Here’s a step-by-step breakdown:
- Divide: Split the array into two halves.
- Conquer: Recursively sort each half.
- Combine: Merge the two sorted halves into a single sorted array.
Refer to these articles for better understanding on how merge sort algorithm works:
- Introduction to Merge Sort Algorithm
- Merging two sorted arrays into a single sorted array
- How Merge Sort Works: Step-by-Step Explanation
Implementation of Merge Sort Algorithm in Python
Here’s the implementation of Merge Sort Algorithm in Python programming language:
def merge_sort(arr, left, right):
"""
Recursively sorts the array using the merge sort algorithm.
Parameters:
arr (list): The array to be sorted.
left (int): The starting index of the array segment to be sorted.
right (int): The ending index of the array segment to be sorted.
"""
if left < right:
# Find the middle point to divide the array into two halves
mid = (left + right) // 2
# Recursively sort the left half of the array
merge_sort(arr, left, mid)
# Recursively sort the right half of the array
merge_sort(arr, mid + 1, right)
# Merge the two sorted halves
merge(arr, left, mid, right)
def merge(arr, left, mid, right):
"""
Merges two sorted subarrays into a single sorted array.
Parameters:
arr (list): The array containing the two subarrays to be merged.
left (int): The starting index of the first subarray.
mid (int): The ending index of the first subarray.
right (int): The ending index of the second subarray.
"""
# Calculate the sizes of the two subarrays
n1 = mid - left + 1 # Size of the left subarray
n2 = right - mid # Size of the right subarray
# Create temporary arrays to hold the two subarrays
L = arr[left : mid + 1] # Left subarray
R = arr[mid + 1 : right + 1] # Right subarray
# Initialize indices for the subarrays and the main array
i = 0 # Index for the left subarray (L)
j = 0 # Index for the right subarray (R)
k = left # Index for the main array (arr)
# Merge the two subarrays back into the main array
while i < n1 and j < n2:
if L[i] <= R[j]:
# If the current element in L is smaller, add it to the main array
arr[k] = L[i]
i += 1
else:
# If the current element in R is smaller, add it to the main array
arr[k] = R[j]
j += 1
k += 1
# Copy any remaining elements from the left subarray (if any)
while i < n1:
arr[k] = L[i]
i += 1
k += 1
# Copy any remaining elements from the right subarray (if any)
while j < n2:
arr[k] = R[j]
j += 1
k += 1
# Example run for array [4, 1, 2, 3]
if __name__ == "__main__":
# Define the array to be sorted
arr = [4, 1, 2, 3]
# Print the original array
print("Original array:", arr)
# Call the merge_sort function to sort the array
merge_sort(arr, 0, len(arr) - 1)
# Print the sorted array
print("Sorted array:", arr)
Explanation of the Implementation
Let’s break down the merge sort implementation step by step to understand how it works.
The merge_sort
Function
The merge_sort
function is the main function that performs the sorting. It takes three parameters:
arr
: The array to be sorted.left
: The starting index of the segment of the array to be sorted.right
: The ending index of the segment of the array to be sorted.
Here’s how it works:
- Base Condition: The function first checks if
left < right
. This condition ensures that the array has more than one element. Ifleft
is not less thanright
, it means the array has either one or zero elements, and no sorting is needed. - Divide the Array: If the array has more than one element, the function calculates the middle index (
mid
) using the formula(left + right) // 2
. This divides the array into two halves. - Recursive Calls: The function then calls itself recursively to sort the left half (
merge_sort(arr, left, mid)
) and the right half (merge_sort(arr, mid + 1, right)
). This process continues until the array is divided into single-element subarrays. - Merge the Sorted Halves: Once the two halves are sorted, the
merge
function is called to merge them back into a single sorted array.
The merge
Function
The merge
function is responsible for merging two sorted subarrays into a single sorted array. It takes four parameters:
arr
: The array containing the two subarrays to be merged.left
: The starting index of the first subarray.mid
: The ending index of the first subarray.right
: The ending index of the second subarray.
Here’s how it works:
-
Calculate Subarray Sizes: The sizes of the two subarrays are calculated using
n1 = mid - left + 1
(size of the left subarray) andn2 = right - mid
(size of the right subarray). -
Create Temporary Arrays: Two temporary arrays,
L
andR
, are created to hold the elements of the left and right subarrays, respectively. -
Merge Process: The function then merges the two subarrays back into the main array (
arr
). It uses three indices:i
: Index for the left subarray (L
).j
: Index for the right subarray (R
).k
: Index for the main array (arr
).
The function compares the elements of
L
andR
one by one. The smaller element is placed into the main array, and the corresponding index (i
orj
) is incremented. This process continues until all elements from eitherL
orR
are placed into the main array. -
Copy Remaining Elements: If there are any remaining elements in
L
orR
, they are copied directly into the main array.
Example Run
Let’s walk through an example with the array [4, 1, 2, 3]
:
- The
merge_sort
function is called withleft = 0
andright = 3
. - The array is divided into two halves:
[4, 1]
and[2, 3]
. - Each half is further divided and sorted recursively:
[4, 1]
becomes[4]
and[1]
, which are merged into[1, 4]
.[2, 3]
becomes[2]
and[3]
, which are merged into[2, 3]
.
- Finally, the two sorted halves
[1, 4]
and[2, 3]
are merged into the final sorted array[1, 2, 3, 4]
.