Sorting
Introduction to Sorting
- Sorting puts the items of a list into a specific order for further use.
Definition of Sorting
- Sorting is the process of rearranging data items (stored in memory) into some specific order (such as ascending, descending, alphabetical, or any other user-defined order) so that retrieval of information becomes easier.
Characteristics of Sorting
- Sorting algorithms can be characterized in the following two ways:
(i) Simple algorithms use on the order of n² comparisons (written as O(n²)) to sort n items.
(ii) Sophisticated algorithms use O(n log₂ n) comparisons to sort n items.
- The performance of a sorting algorithm can also depend on the degree of order already present in the data.
Types of Sorting
- Many sorting algorithms are available for sorting data, and they vary in their sorting mechanisms. Different environments require different sorting methods.
- There are several categories of sorting (Comparison and Non-Comparison, In-Place and Not-in-Place, Adaptive and Non-Adaptive, and Stable and Non-Stable), but two basic categories of sorting methods: Internal Sorting and External Sorting.
(A.) Comparison-based and Non-Comparison-based Sorting
(a) Comparison-based Sorting:
- Comparison-based sorting algorithms are a class of sorting algorithms that sort the elements of an array or list by comparing them pairwise with some defined comparison operator (usually less than or greater than).
- Comparison-based sorting algorithms are widely used due to their simplicity and effectiveness.
- The most well-known sorting algorithms, including some of the fundamental ones, fall into this category, such as Bubble Sort, Selection Sort, Insertion Sort, Quick Sort, Merge Sort, and Heap Sort.
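As a small illustrative sketch (not part of the original text): any data that supports a pairwise comparison can be sorted this way. In Python, such a comparison function can be supplied explicitly via the standard library's functools.cmp_to_key:

```python
from functools import cmp_to_key

def compare(a, b):
    """Pairwise comparison operator: negative if a < b, zero if equal, positive if a > b."""
    return (a > b) - (a < b)

# A comparison-based sort only ever needs this pairwise test.
print(sorted([3, 1, 2], key=cmp_to_key(compare)))        # [1, 2, 3]
print(sorted(["pear", "fig"], key=cmp_to_key(compare)))  # ['fig', 'pear']
```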
(b) Non-Comparison-based Sorting:
- Non-comparison-based sorting algorithms are a class of sorting algorithms that do not rely on pairwise comparisons between elements to determine their relative order; many of them run in linear or near-linear time.
- These algorithms work by exploiting certain properties of the data, such as integer keys or specific patterns, to sort the elements more efficiently than comparison-based algorithms.
- They can achieve better time complexity than comparison-based algorithms in some cases, particularly when the range of input values is limited or when the distribution of data exhibits certain patterns.
- These algorithms are not as versatile as comparison-based algorithms, as their applicability is restricted by the nature of the data.
- For example: Counting Sort, Bucket Sort, Radix Sort, etc.
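As an illustration, here is a minimal counting sort sketch in Python, assuming non-negative integer keys with a small known maximum:

```python
def counting_sort(items, max_value):
    """Sort non-negative integers in O(n + k) time, where k = max_value."""
    counts = [0] * (max_value + 1)
    for x in items:                          # tally how often each key occurs
        counts[x] += 1
    result = []
    for value, count in enumerate(counts):   # emit each key 'count' times, in order
        result.extend([value] * count)
    return result

print(counting_sort([4, 2, 2, 8, 3, 3, 1], max_value=8))  # [1, 2, 2, 3, 3, 4, 8]
```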
(B.) Adaptive and Non-Adaptive Sorting
(a) Adaptive Sorting:
- A sorting algorithm is adaptive if it takes advantage of already-sorted elements present in the list that is to be sorted.
- It does not re-order elements that are already in sorted order.
(b) Non-Adaptive Sorting:
- A non-adaptive algorithm does not take into account elements that are already sorted.
- It forces every element to be re-ordered in the course of its sorting process.
(C.) In-Place and Not-in-Place Sorting
(a) In-Place Sorting:
- In in-place sorting algorithms, there is no requirement for extra space beyond the temporary storage of a few data elements during the sorting process; the sorting takes place within the array itself.
- Bubble sort is an example of in-place sorting.
(b) Not-In-Place Sorting:
- In not-in-place sorting algorithms, the program requires extra space greater than or equal to the number of elements being sorted.
- Merge Sort is an example of not-in-place sorting.
(D.) Stable and Non-Stable Sorting
(a) Stable Sorting:
- A stable sorting algorithm does not change the relative order of equal values after sorting: elements with equal keys appear in the output in the same order in which they appeared in the input.
(b) Non-Stable Sorting:
- A non-stable sorting algorithm may change the relative order of equal values after sorting.
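To illustrate the difference, Python's built-in sorted is a stable sort; in this small sketch (the records are made up for the example), ties on the numeric key keep their original order:

```python
records = [("banana", 1), ("apple", 2), ("date", 1), ("cherry", 2)]

# A stable sort keeps "banana" before "date" and "apple" before "cherry",
# because that is the order in which the equal-keyed records appeared.
by_key = sorted(records, key=lambda record: record[1])
print(by_key)  # [('banana', 1), ('date', 1), ('apple', 2), ('cherry', 2)]
```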
(E.) Internal and External Sorting
(a) Internal Sorting:
- When the entire collection of data to be sorted fits inside primary (main) memory, i.e. RAM, the sorting is called internal sorting.
- A limitation of internal sorts is that they can only process relatively small lists due to memory constraints.
- The time required to read or write is not considered to be significant in evaluating the performance of internal sorting methods.
- These algorithms fall into two groups: comparison-based sorting algorithms, which compare elements of the input array to determine their relative order (e.g., Insertion, Bubble, Selection, Quick, Heap, Tree, and Merge Sort), and non-comparison sorting algorithms, which sort elements without directly comparing them, often by leveraging properties of the input data (e.g., Counting, Radix, and Bucket Sort).
- Examples of Internal sorting are :
- Insertion sort
- Bubble sort
- Selection sort
- Quick sort
- Heap sort
- Tree sort
- Merge sort
- Counting sort
- Radix sort
- Bucket sort
- Shell sort
(a.) Insertion Sort
- This is a type of Internal sort.
- It is a naturally occurring sorting method, similar to the way a person sorts playing cards in hand.
- Insertion Sort is efficient for small datasets or nearly sorted data but becomes inefficient for large datasets.
- In this sorting, at every step an item is inserted into its proper place in an already-ordered list. Insertion sort does not exchange elements; rather, each element is inserted at the appropriate sorted position. The list is divided into two parts: a sorted sub-list and an unsorted sub-list. In each step, the first element of the unsorted sub-list is picked up and moved into the sorted sub-list by inserting it at a suitable position. For ‘n’ elements, n-1 steps are needed to sort the list.
- Best Case Time Complexity: O(n); the best case occurs when the array is already sorted.
- Average Case Time Complexity: O(n²); this occurs when the elements are in random order.
- Worst Case Time Complexity: O(n²); the worst case occurs when the array is sorted in reverse order.
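A minimal Python sketch of insertion sort along the lines described above (ascending order assumed):

```python
def insertion_sort(arr):
    """In-place insertion sort: O(n) best case, O(n^2) average/worst case."""
    for i in range(1, len(arr)):        # arr[:i] is the sorted sub-list
        key = arr[i]                    # first element of the unsorted sub-list
        j = i - 1
        while j >= 0 and arr[j] > key:  # shift larger elements one slot right
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key                # insert the key at its proper place
    return arr

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```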
(b.) Bubble Sort
- Bubble sort is a type of Internal Sort.
- Bubble sort is a simple, comparison-based sorting algorithm that repeatedly steps through the list to be sorted, compares adjacent elements, and swaps them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, at which point the list is sorted.
- Bubble sort performs its sorting operation by exchanging (swapping) elements.
- Bubble sort gets its name because elements gradually rise to their proper positions in the array, like bubbles rising in water.
- Bubble sort is one of the easiest sorting techniques.
- We can sort both numeric as well as string values using it.
- It is very simple to implement.
- While conceptually simple, Bubble Sort is inefficient for larger datasets.
- It gives accurate results but is among the slowest sorting algorithms.
- The best-case time complexity of bubble sort is O(n), which is possible only when the array is already sorted and the implementation stops early when a pass makes no swaps.
- Bubble Sort is an in-place sorting algorithm, meaning it requires only a constant amount of additional memory beyond the input array (for a few variables like the loop counters and a temporary variable for swapping).
- Algorithm of Bubble Sort:
Step 1: Start with the first element (with index 0) of the list.
Step 2: Compare the current element with the next adjacent element (index 1 or more).
- If the current element is greater than the next element, swap them (when sorting in ascending order; the reverse applies for descending order).
- If not, move on to the next pair of elements.
Step 3: Continue comparing and swapping adjacent elements until the end of the list is reached. By the end of one pass, the largest element in the unsorted portion has “bubbled up” to the end of the list.
Step 4: Repeat steps 1-3 for the remaining unsorted portion of the list, excluding the last element, which is already in its correct position after the first pass.
Step 5: Continue these passes through the list, each time with one less element to compare, until no swaps are needed in a pass, which indicates that the entire list is sorted.
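A minimal Python sketch of these steps, including the early exit from Step 5 that gives the O(n) best case (ascending order assumed):

```python
def bubble_sort(arr):
    """In-place bubble sort with early exit when a pass makes no swaps."""
    n = len(arr)
    for i in range(n - 1):              # after pass i, the last i+1 items are in place
        swapped = False
        for j in range(n - 1 - i):
            if arr[j] > arr[j + 1]:     # adjacent pair in the wrong order
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:                 # no swaps: the list is already sorted
            break
    return arr

print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```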
- Best Case Time Complexity: O(n); the best case occurs when the array is already sorted.
- Average Case Time Complexity: O(n²); this occurs when the elements are in random order.
- Worst Case Time Complexity: O(n²); the worst case occurs when the array is sorted in reverse order.
(c.) Selection Sort
- This is a type of Internal sort.
- Selection sort performs its sorting operation by exchanging elements.
- Selection Sort is a simple comparison-based sorting algorithm. It works by repeatedly selecting the smallest (or largest, depending on the sorting order) element from the unsorted part of the array and swapping it with the first unsorted element.
- It thus sorts a list by repeatedly moving the minimum element of the unsorted portion to the end of the growing sorted portion at the beginning of the array.
- Its working principle is that of an ‘in-place comparison-based’ algorithm, in which the list is divided into two parts: the sorted part at the left end and the unsorted part at the right end. Initially, the sorted part is empty and the unsorted part is the entire list. The smallest element is selected from the unsorted part and swapped with its leftmost element, which then becomes part of the sorted part. This process continues, moving the boundary of the unsorted part one element to the right each time. In other words, Selection Sort is an in-place sorting algorithm: it requires only a small, fixed amount of extra space for temporary variables (such as for swapping elements), regardless of the input size.
- It also performs many unnecessary comparisons. However, it has the advantage of being simple to understand and easy to implement.
- This algorithm is not suitable for large data sets because its average- and worst-case time complexity is O(n²), where n is the number of items.
- Algorithm of Selection Sort:
Step 1: Start with the first element of the array list (as the minimum value).
Step 2: Iterate through the rest of the array list to find the minimum element in the unsorted portion of the array by comparing each element with the current minimum value.
Step 3: If a smaller element is found, update the current minimum to be that element; otherwise skip it.
Step 4: Swap the minimum element found with the first element of the unsorted portion, increment the index of the sorted portion, and repeat steps 2-4 until the entire array is sorted.
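A minimal Python sketch of these steps (ascending order assumed):

```python
def selection_sort(arr):
    """In-place selection sort: O(n^2) comparisons in every case."""
    n = len(arr)
    for i in range(n - 1):              # arr[:i] is the sorted portion
        min_idx = i                     # assume the first unsorted element is smallest
        for j in range(i + 1, n):       # scan the rest of the unsorted portion
            if arr[j] < arr[min_idx]:
                min_idx = j             # remember the new minimum
        arr[i], arr[min_idx] = arr[min_idx], arr[i]  # move it into place
    return arr

print(selection_sort([64, 25, 12, 22, 11]))  # [11, 12, 22, 25, 64]
```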
- Best Case Time Complexity: O(n²); even when the array is already sorted, Selection Sort still performs the same number of comparisons.
- Average Case Time Complexity: O(n²); this occurs when the array is in random order.
- Worst Case Time Complexity: O(n²); the worst case occurs when the array is sorted in reverse order.
(d.) Quick Sort
- Quick Sort is based on the divide-and-conquer concept. The divide-and-conquer strategy solves a problem by:
(i) Breaking the main problem into several sub-problems that are themselves smaller instances of the original problem.
(ii) Recursively solving these subproblems.
(iii) Appropriately combining these solved subproblems.
- Quick Sort is a divide-and-conquer sorting algorithm. It selects a “pivot” element, partitions the array into two subarrays (one with elements smaller than the pivot and the other with elements larger than the pivot), and then recursively sorts the subarrays.
- Quick Sort is efficient on average with a time complexity of O(n log n), making it one of the fastest sorting algorithms in practice for large datasets. However, in the worst case, Quick Sort degrades to O(n²), which occurs when the pivot is poorly chosen and the array becomes unbalanced.
- The time complexity of Quick Sort depends on how well the pivot divides the array and how balanced the partitioning is.
- Best Case Time Complexity: O(n log n); it occurs when the pivot element divides the array into two nearly equal halves at each step of the recursion.
- Average Case Time Complexity: O(n log n); it occurs when the pivot elements do not perfectly divide the array into equal halves, but the partitioning is reasonably balanced on average.
- Worst Case Time Complexity: O(n²); the worst case occurs when the pivot element is always the smallest or largest element, resulting in the most unbalanced partitions. This happens when the pivot consistently divides the array into one large subarray and one very small subarray (e.g., when the pivot is always the first or last element and the array is already sorted or nearly sorted).
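A minimal Python sketch of quick sort; for clarity this version takes the last element as the pivot and uses extra lists rather than in-place partitioning:

```python
def quick_sort(arr):
    """Divide and conquer: partition around a pivot, then recurse."""
    if len(arr) <= 1:                   # base case: already sorted
        return arr
    pivot = arr[-1]                     # last element as pivot (a common simple choice)
    smaller = [x for x in arr[:-1] if x <= pivot]
    larger = [x for x in arr[:-1] if x > pivot]
    return quick_sort(smaller) + [pivot] + quick_sort(larger)

print(quick_sort([10, 7, 8, 9, 1, 5]))  # [1, 5, 7, 8, 9, 10]
```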
(e.) Heap Sort
- Heap Sort is a popular comparison-based sorting algorithm that uses the structure of a binary heap (specifically, a max-heap or min-heap) to efficiently sort an array of elements.
- It is known for its optimal worst-case and average-case time complexity of O(n log n), making it suitable for sorting large datasets.
- Heap Sort works by transforming the input array into a binary heap and then repeatedly extracting the root element (maximum or minimum, depending on the type of heap) and placing it at the end of the sorted portion of the array.
- Heap sort is completed in two steps: (i) heap creation and (ii) heap sorting.
Heap Sort has the following characteristics:
- Time Complexity: The worst-case, average-case, and best-case time complexity of Heap Sort are all O(n log n), making it efficient for sorting large datasets. The reason all cases have the same time complexity is that Heap Sort performs the same operations regardless of the initial order of the input data.
- In-Place Sorting: Heap Sort requires only a constant amount of additional memory for its operations, so its space complexity is O(1), making it an in-place sorting algorithm.
- Unstable Sort: Heap Sort is generally unstable, meaning that it might change the relative order of equal elements in the input array.
- Not Adaptive: The time complexity of Heap Sort is not affected by whether the input is partially sorted or not. It performs the same operations regardless of the initial order of elements.
- Heap Sort is widely used in various applications due to its predictable time complexity and in-place nature.
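A minimal Python sketch of heap sort using a max-heap, following the two steps noted above (heap creation, then heap sorting):

```python
def heap_sort(arr):
    """In-place heap sort: build a max-heap, then repeatedly extract the root."""
    def sift_down(root, end):
        # Restore the max-heap property for the subtree rooted at 'root',
        # considering only arr[:end].
        while True:
            child = 2 * root + 1                  # left child index
            if child >= end:
                return
            if child + 1 < end and arr[child + 1] > arr[child]:
                child += 1                        # pick the larger child
            if arr[root] >= arr[child]:
                return                            # heap property already holds
            arr[root], arr[child] = arr[child], arr[root]
            root = child

    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):           # step (i): heap creation
        sift_down(i, n)
    for end in range(n - 1, 0, -1):               # step (ii): heap sorting
        arr[0], arr[end] = arr[end], arr[0]       # move the max to the sorted tail
        sift_down(0, end)
    return arr

print(heap_sort([12, 11, 13, 5, 6, 7]))  # [5, 6, 7, 11, 12, 13]
```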
(b) External Sorting:
- External sorting is applied to larger collections of data items that cannot fit entirely in the main memory (RAM) of a computer but instead reside on secondary (external) devices such as hard disks, and it is designed to perform sorting operations efficiently in that setting.
- All external sorts are based on the principle of merging: different parts of the data are sorted separately and then merged together.
- Read and write access times are a major concern in determining the sorting performances of such methods.
- Examples are: External Merge Sort, Polyphase Merge Sort, Replacement Selection Sort, External Quick/Bubble/Insertion Sort, Tape Sort, and K-Way Merge Sort.
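A minimal Python sketch of this merging principle; RUN_SIZE here is a made-up stand-in for how much data fits in main memory, and in a real external sort each sorted run would be written to and read back from disk:

```python
import heapq

RUN_SIZE = 4  # stand-in for the amount of data that fits in main memory

def external_sort(data):
    """Sort the data in runs of RUN_SIZE, then k-way merge the sorted runs."""
    runs = [sorted(data[i:i + RUN_SIZE])          # sort each run "internally"
            for i in range(0, len(data), RUN_SIZE)]
    return list(heapq.merge(*runs))               # merge all sorted runs

print(external_sort([9, 4, 7, 1, 8, 2, 6, 3, 5, 0]))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```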
Merge Sort/Two-Way Merge Sort
- Merge sort is based on the “divide and conquer” principle: it divides the array into smaller subarrays, sorts them, and then merges them back together.
- Two-Way Merge Sort, also known as Merge Sort, is a classic comparison-based sorting algorithm that follows the divide-and-conquer paradigm.
- It works by recursively dividing the input array into smaller subarrays, sorting those subarrays, and then merging them back together to obtain the final sorted array. Two-Way Merge Sort differs from the standard Merge Sort only in how it merges two subarrays.
- Merge Sort has a time complexity of O(n log n), making it efficient for sorting large datasets.
- Two-Way Merge Sort’s consistent time complexity and efficient performance make it a popular choice for sorting applications where stability and predictable behavior are important.
The working mechanism of the Two-Way Merge Sort algorithm is completed in the following steps:
- Divide: The input array is divided into two halves, roughly equal in size. This process continues recursively until each subarray contains only one element (which is already sorted by definition).
- Conquer: The individual subarrays are sorted by comparing elements within each subarray. Since each subarray contains only one element, they are already considered sorted at this point.
- Merge: The sorted subarrays are merged back together in a sorted manner. This involves comparing the elements of both subarrays and placing them in order in a new merged array. The process continues until all elements from both subarrays are merged into a single sorted array.
- Thus, in brief, the basic working mechanism of merge sort is to divide the list into two smaller sub-lists of approximately equal size, repeating this procedure recursively until only one element is left in each sub-list. The sorted sub-lists are then merged to form sorted parent lists, and this continues recursively until the original list is restored in sorted order. The merge sort is thus completed in three major steps: a divide part, a conquer part, and finally a combine part.
- Time Complexity of Merge Sort:
- Best Case: O(n log n)
- Average Case: O(n log n)
- Worst Case: O(n log n)
The time complexity is the same for all cases because Merge Sort always divides the array and merges it in the same way, regardless of the initial order of the elements.
- Merge Sort is not in-place, as it requires additional memory for temporary arrays used during the merge process. The space complexity is O(n) due to these temporary arrays.
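A minimal Python sketch of merge sort along these lines (top-down, ascending order assumed):

```python
def merge_sort(arr):
    """Divide, conquer, combine: O(n log n) in every case, O(n) extra space."""
    if len(arr) <= 1:                    # a one-element list is already sorted
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])         # divide and recursively sort each half
    right = merge_sort(arr[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # '<=' keeps the sort stable
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])              # append whatever remains of either half
    merged.extend(right[j:])
    return merged

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```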
Types of Merge Sort:
Top-Down Merge Sort:
- The standard form of Merge Sort is often referred to as Top-Down Merge Sort.
- It follows the divide-and-conquer approach by recursively dividing the input array into smaller subarrays, sorting them, and then merging them.
- This version of Merge Sort is characterized by its clarity and ease of implementation.
Bottom-Up Merge Sort:
- Bottom-Up Merge Sort is a variation that starts by treating each element as a sorted subarray and then repeatedly merges adjacent subarrays to form larger sorted subarrays.
- This version avoids the recursive approach of the top-down variant and is sometimes considered more efficient in practice due to better cache utilization and reduced function-call overhead.
Two-Way Merge Sort:
- Two-Way Merge Sort is a simplified version of Merge Sort that performs the merging step by comparing elements from two subarrays directly and placing the smaller elements into a temporary array.
- This is in contrast to the standard Merge Sort, which uses an auxiliary array to store merged subarrays.
- Two-Way Merge Sort is straightforward to implement and is used in cases where memory usage is a concern.
In-Place Merge Sort:
- In-Place Merge Sort is an advanced variant that aims to minimize memory usage by performing the merging step in place, without requiring additional memory for temporary arrays.
- It employs techniques like swapping and shifting elements to achieve this.
- In-Place Merge Sort is challenging to implement correctly and efficiently due to the complex nature of in-place merging.
K-Way Merge Sort:
- K-Way Merge Sort generalizes the concept of merging by allowing the algorithm to merge more than two subarrays at a time.
- Instead of merging pairs of subarrays, K-Way Merge Sort merges multiple subarrays using a priority queue or a heap data structure.
- This variant can improve efficiency when dealing with a large number of subarrays.
Efficiency of Sorting Techniques
- The efficiency of a sorting technique is measured by the amount of time required to sort an array of ‘n’ elements by a particular method.
- There are three ways to measure the efficiency of sorting techniques:
- Best case
- Average case
- Worst case
Advantages of Sorting
- Sorting arranges the list of items in a specific sequence, which helps in fast searching.
- It helps in analyzing data to obtain the desired filtered results.
- It helps to optimize the efficiency of other algorithms.
Disadvantages of Sorting
- The sorting process may take a long time, especially for large lists.