Another algorithm for sorting numbers uses recursion, a technique we will discuss in more detail shortly,
to divide the problem into many smaller problems before recombining the elements of the full solution. First,
this solution requires a routine to combine two sets of sorted numbers into a single set.
Imagine two piles of playing cards, each sorted from smallest to largest and lying face up, so that the
smallest card of each pile is showing. The merge routine compares the two cards that are showing, and places the
smaller card face down in what will be the merged pile. Then the routine compares the two cards showing after
the first has been put face down on the merged pile. Again, the routine picks up the smaller card, and puts it
face down on the merged pile. The merge routine continues in this manner until all the cards have been moved
into the sorted merged pile.
Here is pseudocode for the merge routine. It expects to work on two previously sorted lists of numbers, and
it merges the two lists into one sorted list, which it returns. The variable index keeps track of where it is working
in sorted_list.
The routine compares the first (top) numbers in the two original lists, and puts the smaller of the two into
sorted_list. Then it discards the number from the original list, which means that the number that used to
be the second one in the original list becomes the first number in that list. Again the routine compares the first
numbers in the two lists, and again it moves the smaller to sorted_list.
The routine continues this way until one of the original lists becomes empty. At that point, it adds the remaining
numbers (which were in sorted order originally, remember) to sorted_list, and returns sorted_list.
merge(list_A, list_B)
    // index keeps track of where we are in the
    // sorted list
    index <-- 1
    // Repeat as long as there are numbers in both
    // original lists.
    while list_A is not empty AND list_B is not empty
        // Compare the 1st elements of the 2 lists.
        // Move the smaller to the sorted list.
        // "<" means "smaller than."
        if list_A[1] < list_B[1]
            sorted_list[index] <-- list_A[1]
            discard list_A[1]
        else
            sorted_list[index] <-- list_B[1]
            discard list_B[1]
        index <-- index + 1
    // If numbers remain only in list_A, move those
    // to the sorted list
    while list_A is not empty
        sorted_list[index] <-- list_A[1]
        discard list_A[1]
        index <-- index + 1
    // If numbers remain only in list_B, move those
    // to the sorted list
    while list_B is not empty
        sorted_list[index] <-- list_B[1]
        discard list_B[1]
        index <-- index + 1
    // Return the sorted list
    return sorted_list
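For reference, here is a minimal Python sketch of the same merge routine. The function and variable names (merge, list_a, list_b, sorted_list) are our own choices for illustration; the logic follows the pseudocode above, using 0-based indexes instead of discarding the front element of each list.

def merge(list_a, list_b):
    # Merge two already-sorted lists into one sorted list.
    sorted_list = []
    i = j = 0
    # Repeat as long as there are numbers left in both original lists.
    while i < len(list_a) and j < len(list_b):
        # Move the smaller of the two front elements to the sorted list.
        if list_a[i] < list_b[j]:
            sorted_list.append(list_a[i])
            i += 1
        else:
            sorted_list.append(list_b[j])
            j += 1
    # At most one of these two steps copies anything: whatever remains in
    # either list is already in sorted order, so append it to the result.
    sorted_list.extend(list_a[i:])
    sorted_list.extend(list_b[j:])
    return sorted_list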
The performance of merge is related to the lengths of the lists on which it operates, that is, to the total
number of items being merged. The real work of the routine is in moving the appropriate elements of the original
lists into the sorted list. Since the total number of such moves is equal to the total number of elements in the
two lists, merge has a theta of nA + nB, or Θ(nA + nB), where nA and nB are the lengths of the two original lists.
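As a quick illustration of that count, using the Python sketch above with two example lists of lengths 3 and 4:

a = [2, 5, 9]          # nA = 3
b = [1, 4, 6, 8]       # nB = 4
print(merge(a, b))     # [1, 2, 4, 5, 6, 8, 9] -- 7 moves, and nA + nB = 7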
The merge_sort will use the merge routine, but first the merge_sort will divide the problem up into
smaller and smaller sorting tasks. Then merge_sort will reassemble the small sorted lists into one fully sorted list.
In fact, merge_sort divides the list of numbers until each sublist consists of a single number, which can be
considered a sorted list of length 1. Then the merge_sort uses the merge procedure to join the sorted sublists.
The technique used by merge_sort to divide the problem into subproblems is called recursion. The
merge_sort repeatedly calls itself until the recursion “bottoms out” with lists whose lengths are one. Then
the recursion “returns,” reassembling the numbers in sorted order as it does. Here is pseudocode for the merge
sort. It takes the list of numbers to be sorted, and it returns a sorted list of those numbers.
merge_sort(num_list)
    length <-- length of num_list
    // if there is more than 1 number in the list,
    if length > 1
        // divide the list into two lists half as long
        shorter_list_A <-- first half of num_list
        shorter_list_B <-- second half of num_list
        // Perform a merge sort on each shorter list
        result_A <-- merge_sort(shorter_list_A)
        result_B <-- merge_sort(shorter_list_B)
        // Merge the results of the two sorted sublists
        sorted_list <-- merge(result_A, result_B)
        // Return the sorted list
        return sorted_list
    else
        // If there’s only 1 number in the list,
        // just return it
        return num_list
    end
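A matching Python sketch of merge_sort, again with names of our own choosing, relying on the merge function sketched earlier; slicing at the midpoint is one reasonable reading of "first half" and "second half":

def merge_sort(num_list):
    # Sort a list of numbers by recursively sorting and merging its halves.
    length = len(num_list)
    # A list of 0 or 1 numbers is already sorted; the recursion bottoms out here.
    if length <= 1:
        return num_list
    # Divide the list into two lists half as long.
    middle = length // 2
    shorter_list_a = num_list[:middle]
    shorter_list_b = num_list[middle:]
    # Perform a merge sort on each shorter list.
    result_a = merge_sort(shorter_list_a)
    result_b = merge_sort(shorter_list_b)
    # Merge the results of the two sorted sublists and return.
    return merge(result_a, result_b)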
Let’s follow the execution of merge_sort when one calls it with this list of numbers:
NUMS = { 1, 6, 4, 2 }
1 First, we call merge_sort passing the list NUMS. This is what we call the “top-level” of recursion,
level 0.
2 merge_sort calls merge_sort again, passing a list of the first two numbers in NUMS. This will
sort the front half of the list. This is level 1 of recursion.
3 Now merge_sort calls merge_sort again, passing only the first number in NUMS. This is level 2.
4 Now merge_sort simply returns; it’s down to one element in the list, merge_sort returns to level 1.
5 Now merge_sort calls merge_sort again, passing only the second of the first two numbers in
NUMS. This is level 2.
6 Again, merge_sort simply returns; it’s down to one element in the list, merge_sort returns to level 1.
7 At level 1 of recursion, merge_sort now has result_A and result_B. merge_sort calls
merge to put those two numbers in order, and then it returns the sorted pair of numbers back to level 0.
The first half of the list is sorted.
8 From level 0, merge_sort calls merge_sort again, passing a list of the last two numbers in NUMS.
This will sort the back half of NUMS. It’s back to level 1 of recursion.
9 merge_sort calls merge_sort again, passing only the first of the last two numbers of NUMS. This is
level 2 of recursion again.
10 Since the list contains only one number, merge_sort simply returns back to level 1.
11 merge_sort calls merge_sort again, passing only the last of the numbers of NUMS. This is level 2
of recursion again.
12 Since the list contains only one number, merge_sort simply returns back to level 1.
13 At level 1 of recursion, merge_sort now has result_A and result_B. merge_sort calls
merge to put the two lists in order, and then it returns the sorted set of two numbers back to level 0.
14 At level 0 of recursion, merge_sort now has result_A and result_B. merge_sort calls merge
to put the two lists of numbers in order, and then it returns the entire set of four numbers in sorted order.
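One way to watch these levels unfold is to add a level parameter to the Python sketch purely for printing; the instrumentation below is only illustrative and is not part of the algorithm.

def merge_sort_traced(num_list, level=0):
    indent = "  " * level
    print(f"{indent}level {level}: sorting {num_list}")
    if len(num_list) <= 1:
        print(f"{indent}level {level}: returning {num_list}")
        return num_list
    middle = len(num_list) // 2
    result_a = merge_sort_traced(num_list[:middle], level + 1)
    result_b = merge_sort_traced(num_list[middle:], level + 1)
    merged = merge(result_a, result_b)
    print(f"{indent}level {level}: merged into {merged}")
    return merged

merge_sort_traced([1, 6, 4, 2])   # prints the level 0, 1, and 2 steps described above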
Aside from being an interesting exercise in recursion, the merge_sort provides attractive performance. The
merge sort has a theta of n(lg n), which for large problems is much better than the theta of n² for the insertion sort.
The recursion in merge_sort divides the problem into many subproblems by repeatedly halving the size
of the list to be sorted. The number of times the list must be divided by two in order to create lists of length one
is equal to the logarithm to the base 2 of the number of elements in the list.
In the case of our 4-element example, the logarithm to the base 2 of 4 is 2, because 2² = 4. This can be written
as log₂ n, but in computer science, because of the ubiquity of binary math, this is usually written as lg n, meaning
the logarithm to the base 2 of n.
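A quick check with Python's math module connects the notation to numbers; the particular examples here are our own:

import math

for n in (4, 8, 16, 1_000_000):
    print(n, math.log2(n))   # 2.0, 3.0, 4.0, and about 19.93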
The total running time T(n) of the merge sort on a list of n numbers consists of the time to recursively solve
two problems of half the size, and then to combine the results. One way of expressing the time required is this:
T(n) = 2T(n/2) + time for merge
Since merge runs in Θ(nA + nB), and since nA + nB = n, we will restate this:
T(n) = 2T(n/2) + Θ(n)
A recursion tree is a way to visualize the time required. At the top level, we have the time required for
merge, Θ(n), plus the time required for the two subproblems:

                 Θ(n)
         T(n/2)        T(n/2)

At the next level, we have the time required for the two merges of the two subproblems, and for the further
subdivision of the two subproblems:

                 Θ(n)
         Θ(n/2)        Θ(n/2)
      T(n/4)  T(n/4)  T(n/4)  T(n/4)

We can continue this sort of expansion until the tree is deep enough for the size of the overall problem:

                 Θ(n)
         Θ(n/2)        Θ(n/2)
      Θ(n/4)  Θ(n/4)  Θ(n/4)  Θ(n/4)
        ...     ...     ...     ...
Adding across each row, we find:

                                             Sum
      Θ(n)                                   Θ(n)
      Θ(n/2)  Θ(n/2)                         Θ(n)
      Θ(n/4)  Θ(n/4)  Θ(n/4)  Θ(n/4)         Θ(n)
        ...                                   ...
For any particular problem, because we repeatedly divide the problem in two, we will have lg n levels. For
instance, our example with four numbers had only two levels of recursion. A problem with eight
numbers will have three levels, and a problem with 16 numbers will have four.
Summing over the whole problem, then, we find the merge sort has a theta of n(lg n). There are (lg n) levels,
each with a theta of n. So the merge sort has an order of growth of Θ(n(lg n)).
This is a very big deal, because for large sets of numbers, n(lg n) is very much smaller than n². Suppose
that one million numbers must be sorted. The insertion sort will require on the order of (10⁶)², or
1,000,000,000,000 units of time, while the merge sort will require on the order of 10⁶(lg 10⁶), or about 10⁶ × 20,
or 20,000,000 units of time. The merge sort will be almost five orders of magnitude faster. If a unit of time is
one millionth of a second, the merge sort will complete in 20 seconds, and the insertion sort will require a week
and a half!
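These estimates are easy to reproduce as a back-of-the-envelope check (not a benchmark):

import math

n = 1_000_000
insertion_steps = n ** 2                # about 1,000,000,000,000 units of time
merge_steps = n * math.log2(n)          # about 20,000,000 units of time

unit = 1e-6                             # one millionth of a second per unit
print(merge_steps * unit)               # about 19.9 seconds
print(insertion_steps * unit / 86_400)  # about 11.6 days -- a week and a half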