Programmers have designed many algorithms for sorting numbers, because one needs this functionality
frequently. One sorting algorithm is called the insertion sort, and it works in a manner similar to a card player
organizing his hand. Each time the algorithm reads a number (card), it places the number in its sorted position
among the numbers (cards) it has already sorted.
On the next page we show the pseudocode for the insertion sort. In this case, we use two variables,
number_index and sorted_index, to keep track of two positions in the list of numbers.
We consider the list as two sets of numbers. We start with only one set of numbers—the numbers we want
to sort. However, immediately the algorithm considers the list to be comprised of two sets of numbers; the first
“set” consists of the first number in the original list, and the second set consists of all the rest of the numbers.
The first set is the set of “sorted” numbers (like the cards already sorted in your hand), and the second set is the
remaining set of unsorted numbers. The sorted set of numbers starts out containing only a single number, but as the
algorithm proceeds, more and more of the unsorted numbers will be moved to their proper position in the sorted set.
The variable number_index keeps track of where we are in the list of unsorted numbers; it starts at 2,
the first number which is “unsorted.” The variable sorted_index keeps track of where we are among the
sorted numbers; it starts at 1, since the first element of the original list starts the set of “sorted” numbers.
The algorithm compares the next number to be inserted into the sorted set against the largest of the sorted
numbers. If the new number is smaller, then the algorithm shifts all the numbers up one position in the list. This
repeats, until eventually the algorithm will find that the new number is greater than the next sorted number, and
the algorithm will put the new number in the proper position next to the smaller number.
It’s also possible that the new number is smaller than all of the numbers in the sorted set. The algorithm
will know that has happened when sorted_index becomes 0. In that case, the algorithm inserts the new
number as the first element in the sorted set.
Insertion_Sort(num_list)
length <-- length of num_list
// At the start, the second element of the original list
// is the first number in the set of "unsorted" numbers.
number_index <-- 2
// We’re done when we have looked at all positions in the list.
while(number_index <= length) {
// newNum is the no. being considered for sorting
newNum <-- num_list[number_index]
// sorted_index marks the end of previously sorted numbers.
sorted_index <-- number_index - 1
// From high to low, look for the place for the new number.
// If newNum is smaller than the previously sorted numbers,
// move the previously sorted numbers up in the num_list.
while newNum < num_list[sorted_index] AND sorted_index > 0 {
num_list[sorted_index + 1] <-- num_list[sorted_index]
sorted_index <-- sorted_index - 1
}
// newNum is not smaller than the number at sorted_index.
// We found the place for the new number, so insert it.
num_list[sorted_index + 1] = newNum
}
end
To repeat, the variable number_index keeps track of where the algorithm is in the unsorted set of numbers.
The algorithm starts with the second number (number_index = 2). Then the algorithm compares the number
to the largest number that has been sorted so far, num_list[sorted_index]. If the number is smaller than the
previously sorted number, the algorithm moves the previously sorted number up one position in num_list, and
checks the new number against the next largest number in the previously sorted elements of num_list. Finally,
the algorithm will encounter a previously sorted number which is smaller than the number being inserted, or it will
find itself past the starting position of num_list. At that point, the number can be inserted into the num_list.
The algorithm completes when all of the positions in the num_list have been sorted.
To analyze the running time of the insertion sort, we note first that the performance will be proportional to
n, the number of elements to be sorted. We also note that each element to be sorted must be compared one or
many times with the elements already sorted. In the best case, the elements will be sorted already, and each element
will require only a single comparison, so the best-case performance of the insertion sort is Θ(n).
In the worst case, the elements to be sorted will be in reverse order, so that every element will require comparison
with every element already sorted. The second number will be compared with the first, the third with the second
and first, the fourth with the third, second, and first, etc. If there were four numbers in reverse order, the number of
comparisons would be six. In general, the number of comparisons in the worst case for the insertion sort will be:
n2/2 - n/2
The number of comparisons will grow as the square of the number of elements to be sorted. The negative
term of -n/2, and the division of n2 by the constant 2, mean that the rate of growth in number of comparisons
will not be the full rate that n2 would imply. However, for very large values of n, those terms other than n2 become relatively insignificant. Imagine the worst case of sorting a million numbers. The n2 term will
overwhelm the other terms of the equation.
Since one usually reports the order of growth for an algorithm as the worst-case order of growth, the insertion
sort has a theta of n2, or Θ(n2). If one computes the average case order of growth for the insertion sort, one also
finds a quadratic equation; it’s just somewhat smaller, since on average each new element will be compared with
only half of the elements already sorted. So we say the performance of the insertion sort is Θ(n2).
No comments:
Post a Comment