Wednesday, March 20, 2013

Heap Sort with C++

In my attempt to improve my programming skills, I just read about the heap sort algorithm.
Here are my notes and code where I try to understand and implement the heapsort algorithm in C++.

Step one of the heapsort is to build a valid heap.  When someone says heap sort, we are looking at a binary heap.  A binary heap is a binary tree where all levels of a heap are filled (except possibly the last level) and where every parent is greater than either of its children.

For example, if you have an array or vector with these values:
[ 6,  8, 1 , 5, 7, 3,  2... ]
A binary heap would have this shape:
          6
      8        1
    5   7    3   2
...
and we would say 6 is the parent of 8 and 1, 8 is the parent of 5 and 7, etc ...  However, while this binary heap has the right shape, it fails to meet the criteria where every parent is greater than or equal to either of its children.

In order to transform an invalid heap to a valid one, I wrote the functions heapify() and shiftDown():
void heapify( vector & v, int start, int end )  {
    int midPos = ( end - start ) / 2;
    for ( int pos = midPos; pos >= 0; pos-- ) {
        shiftDown( v, pos, end );
    }
}
void shiftDown( vector & v, int parent_pos, int end_pos ) {

    // Sift the parent down the right child?
    int right_child_pos = ( parent_pos * 2 ) + 2;
    if ( right_child_pos < end_pos &&
         v[ parent_pos ] < v[ right_child_pos ] ) {
         int tmp = v[ parent_pos ];
         v[ parent_pos ] = v[ right_child_pos ];
         v[ right_child_pos ] = tmp;
         shiftDown( v, right_child_pos, end_pos );
     }

    // Sift the parent down the left child?
    int left_child_pos = ( parent_pos * 2 ) + 1;
    if ( left_child_pos < end_pos &&
         v[ parent_pos ] < v[ left_child_pos ] ) {
         int tmp = v[ parent_pos ];
         v[ parent_pos ] = v[ left_child_pos ];
         v[ left_child_pos ] = tmp;
         shiftDown( v, left_child_pos, end_pos );
     }

}
The heapify()  function walks from the parent that is on the lowest level of the heap (sits at n/2) to the parent at the top of the heap (position = 0).
For each parent, we use shiftDown() to shift values down the tree until the rule where any parent is greater than or equal to any child is fulfilled.

Given the above functions, we have a partially sorted list where the larger values are at the top of the heap, which is the start of the array.  To get a sorted array where the values go from smallest to largest, we cherry pick the top of the heap to get the max value then rebuild the heap, as follows:
void heapSort( vector & v )  {
    int numItems = v.size();
    heapify( v, 0, numItems - 1);

    for ( int end_pos = numItems - 1; end_pos >= 0; end_pos-- ) {
        int tmp = v[ 0 ]; 
        v[ 0 ] = v[ end_pos ];
        v[ end_pos ] = tmp;

        heapify( v, 0, end_pos);
    }
}

For the sake of testing I also wrote this function to print the array to look like a heap.
void print_as_heap( const vector & v ) {
    int max_row_size = 1;
    int row_size = 0;
    vector::const_iterator it ;
    for ( it = v.begin(); it != v.end(); it ++ ) {
        std::cout << " " << *it;
        row_size++;

        if ( row_size == max_row_size ) {
            std::cout << std::endl;
            row_size = 0;
            max_row_size *= 2;
        }
    }
    std::cout << std::endl;
}
The end.

No comments:

Post a Comment