Topic 14 - Sorting
Sorting is not a data structure.
Data structures and sorting algorithms are linked.
In this topic we will make some assumptions:
Each algorithm is interchangeable.
Each will receive an array of elements.
Each array positions contain data to be sorted.
The arrays are of size \(N\).
Each element of the array have the operators \(<,>,==\) implemented.
Insertion Sort
Insertion Sort is one of the simplest sorting algorithms.
Algorithm
The idea:
Works in \(N-1\) passes.
For pass \(p=1\) through \(N-1\):
Move \(p\) to the right to have the elements from 0 to \(p\) sorted.
Based on the fact that elements from 0 to \(p-1\) are already sorted.
Example
Below an example of the process:
Implementation
We can do a very quick implementation:
1template <typename Comparable>
2void insertionSort( vector<Comparable> & a )
3{
4 for(int p = 1; p < a.size( ); ++p)
5 {
6 Comparable tmp = a[p];
7
8 int j;
9 for(j = p; j > 0 && tmp < a[j-1]; --j)
10 {
11 a[j] = a[ j - 1 ];
12 }
13 a[j] = tmp;
14 }
15}
Activity
Apply this code to the previous example.
STL Implementation
The STL library is not using a vector.
It is using a pair of iterator, which represents the start and the end of a range.
/*
* The two-parameter version calls the three-parameter version,
* using C++11 decltype
*/
template <typename Iterator>
void insertionSort( const Iterator & begin, const Iterator & end )
{
insertionSort( begin, end, less<decltype(*begin)>{ } );
}
template <typename Iterator, typename Comparator>
void insertionSort( const Iterator & begin, const Iterator & end, Comparator lessThan )
{
if( begin == end )
return;
Iterator j;
for( Iterator p = begin+1; p != end; ++p )
{
auto tmp = std::move( *p );
for( j = p; j != begin && lessThan( tmp, *( j-1 ) ); --j )
*j = std::move( *(j-1) );
*j = std::move( tmp );
}
}
As you can see it is the same algorithm, but more general.
Analysis
Now we will calculate the running time of this algorithm.
Activity
What is the Big Oh running time of Insertion sort?
We can calculate the precise running time.
The number of tests in the inner loop is at most \(p+1\) for each \(p\).
It sums to : \(\sum_{i=2}^{N}i=2+3+4+\dots + N = \Theta(N^2)\)
Heapsort
The algorithm is based on binary heap.
The idea:
We build a binary heap of \(N\) elements.
Then, we perform \(N\) delete operations.
We put these elements in another array.
Building the heap takes \(O(N)\) time.
Each delete takes \(O(\log N)\) time.
Activity
What is the running time?
The only downside is that it double the memory used.
Activity
Sort the following array:
[31,41,59,26,53,58,97]
.Show each step.
Analysis
Building the heap uses less than \(2N\) comparisons.
The \(i\) th delete uses at most less \(2\lfloor \log (N-i+1) \rfloor\) comparisons.
For a total of \(2N\log N - O(N)\) if \(N\geq 2\) comparisons.
This is why we have \(O(N\log N)\).