Earlier we discussed the sequential search algorithm and found its performance to be Θ(n). One can search
much more efficiently if one knows the list is in order to start with. The improvement in efficiency is akin to the
improved usefulness of a telephone book when the entries are sorted by alphabetical order. In fact, for most
communities, a telephone book where the entries were not sorted alphabetically would be unthinkably inefficient!
If the list to be searched is already ordered from smallest to largest, the binary search algorithm can find
any entry in (lg n) time. If the list contains 1,000,000 entries, that means the binary search will locate the
item after reading fewer than 20 entries. The sequential search, on average, will have to read 500,000 entries.
What a difference!
The binary search works by repetitively dividing the list in half. It starts by comparing the element in the
middle of the list with the item sought. If the search item is smaller than the element in the middle of the list,
the binary search reads the element at the middle of the first half of the list. Then, if the search item is larger
than that element, the binary search next reads the element at the middle of the second half of the front half of
the list. Eventually, the search finds the element sought, or concludes that the element is not present in the list.
Here is pseudocode for a binary search:
BinarySearch(list, search_item)
begin <-- 1
end <-- length of list
match_found <-- false
// Repeat search as long as no match has been found
// and we have not searched the entire list.
while match_found = false AND begin <= end
// Find the item at the midpoint of the list
midpoint <-- (begin + end) / 2
// If it’s the one we’re looking for, we’re done
if list[midpoint] = search_item
match_found = true
// If the search item is smaller, the next
// list item to check is in the first half
else if search_item < list[midpoint]
end <-- midpoint - 1
// Otherwise, the next list item to check
// is in the back half of the list
else
begin <-- midpoint + 1
// Return true or false, depending on whether we
// found the search_item
return match_found
With each iteration, the binary search reduces the size of the list to be searched by a factor of 2. So, the
binary search generally will find the search item, or conclude that the search item is not in the list, when the
algorithm has executed (lg n) iterations or fewer. If there are seven items in the list, the algorithm will complete
in three iterations or fewer. If there are 1,000,000 items in the list, the algorithm will complete in 20 iterations
or fewer.
If the original list happens to be a perfect power of 2, the maximum number of iterations of the binary
search can be 1 larger than (lg n). When the size of the list is a perfect power of 2, there are two items at the (lg n)
level, so one more iteration may be necessary in that circumstance. For instance, if there are eight items in the
list, the algorithm will complete in (3 + 1) iterations or fewer.
In any case, the running time of the binary search is Θ(lg n). This efficiency recommends it as a search
algorithm, and also, therefore, often justifies the work of keeping frequently searched lists in order.
No comments:
Post a Comment