Binary Search
- MOTIVATION: Storing objects in Vectors:
  - Most frequently used first: hopefully only have to access the first n elements
  - Arbitrarily: have to look at all elements, or... need a new approach...
    that is, keep our Vectors sorted!
- The REAL WORLD! E.g., searching for a name in a phone book or a word in a
  dictionary, or guessing a number.
- So, we know how to search efficiently in the real world; how about
  using Vectors? Let's assume that we have a Vector of Strings, which
  is already in alphabetical order.
- Informally:
  - Suppose we are looking for a String s
  - Keep track of two positions, left and right
  - Look at the String in the middle between left and right
  - If the String at the middle comes before s, make the middle position
    the new value for left
  - If the String at the middle comes after s, make the middle position
    the new value for right
  - Repeat until we've narrowed the search down to a single position.
- This example, which can be run in Microsoft Excel, shows an animation of the binary search.
- Formally (in Java):
  - Recall we have a Vector of sorted Strings:
        Vector messages = ...
  - Vectors are indexed from 0 to (size - 1)
  - Define two variables to keep track of left and right:
        // String can be in positions left to right - 1,
        // where left must always be less than right
        int left;   // String cannot come before left
        int right;  // String cannot come after right - 1
Keep searching until left == right - 1:
    while (left != right - 1) {
        ...
    }
So when the search terminates, check the String at position left to see
if it is the String we were searching for:
    String testString = (String) messages.elementAt(left);
    if (s.equals(testString))
        return left;
    else
        return -1;
Need to initialize left and right:
    left = 0;
    right = messages.size();
Note: the String class defines a compareTo method:
    int compareTo(String s)
- if result < 0, then the String object is less than the String argument s
- if result > 0, then the String object is greater than the String argument s
- if result == 0, then the String object is equal to the String argument s
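As a small illustration of the three cases, the sketch below turns the sign of compareTo into words. The class name CompareToDemo and the helper describe are hypothetical names chosen for this example, not part of the notes. Keep in mind that compareTo orders Strings by character code, so uppercase letters sort before lowercase ones.

```java
// Minimal sketch: the sign of compareTo tells us how two Strings are ordered.
// CompareToDemo and describe are hypothetical names, used only for illustration.
final class CompareToDemo {
    static String describe(String a, String b) {
        int result = a.compareTo(b);
        if (result < 0) {
            return a + " comes before " + b;
        }
        if (result > 0) {
            return a + " comes after " + b;
        }
        return a + " equals " + b;
    }

    public static void main(String[] args) {
        System.out.println(describe("apple", "banana"));
        System.out.println(describe("pear", "banana"));
        System.out.println(describe("kiwi", "kiwi"));
    }
}
```

Only the sign of the result matters to the search; the actual magnitude should not be relied on.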
Look at the midway point:
    // look in "middle"
    middle = (left + right) / 2;
    String compareString = (String) messages.elementAt(middle);
    if (compareString.compareTo(s) < 0)
        left = middle;
    else if (compareString.compareTo(s) > 0)
        right = middle;
    else {
        left = middle;
        right = middle + 1;
    }
So, putting it all together:
    int binarySearch(String s) {
        int left = 0;                 // String cannot come before left
        int right = messages.size();  // String cannot come after right - 1
        int middle;
        while (left != right - 1) {
            // look in "middle"
            middle = (left + right) / 2;
            String compareString = (String) messages.elementAt(middle);
            if (compareString.compareTo(s) < 0)
                left = middle;
            else if (compareString.compareTo(s) > 0)
                right = middle;
            else {
                // found it: narrow the range to just this position
                left = middle;
                right = middle + 1;
            }
        }
        // only one candidate position remains
        String testString = (String) messages.elementAt(left);
        if (s.equals(testString))
            return left;
        else
            return -1;
    }
What about an empty Vector, i.e., messages.size() == 0? Precede
all other code in binarySearch with:
    // Empty Vector, so return -1
    if (messages.size() == 0)
        return -1;
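For readers who want to run the search, here is one possible self-contained version with the empty-Vector check in place. It is a sketch under assumptions: the class name WordList, its constructor, and the sample words are hypothetical, and it uses the generic Vector<String> (which removes the casts) rather than the raw Vector shown in the notes.

```java
import java.util.Arrays;
import java.util.Vector;

// Sketch only: WordList and its constructor are hypothetical names.
final class WordList {
    private final Vector<String> messages;  // assumed to be sorted already

    WordList(Vector<String> sortedWords) {
        this.messages = sortedWords;
    }

    // Returns the index of s, or -1 if s is not in the Vector.
    int binarySearch(String s) {
        // Empty Vector, so there is nothing to find.
        if (messages.size() == 0) {
            return -1;
        }
        int left = 0;                 // s cannot come before left
        int right = messages.size();  // s cannot come after right - 1
        while (left != right - 1) {
            int middle = (left + right) / 2;
            int order = messages.elementAt(middle).compareTo(s);
            if (order < 0) {
                left = middle;
            } else if (order > 0) {
                right = middle;
            } else {
                left = middle;
                right = middle + 1;
            }
        }
        // Only one candidate position remains.
        return s.equals(messages.elementAt(left)) ? left : -1;
    }

    public static void main(String[] args) {
        WordList list = new WordList(
            new Vector<>(Arrays.asList("ant", "bee", "cat", "dog", "eel")));
        System.out.println(list.binarySearch("cat"));  // present
        System.out.println(list.binarySearch("cow"));  // absent
        System.out.println(new WordList(new Vector<>()).binarySearch("cat"));
    }
}
```

Storing the compareTo result in a local variable avoids calling it twice per step; the logic is otherwise the same as the method above.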
Why binary search? Suppose we are searching a Vector with 1000 elements.
Keep cutting in half:
- left == 0, right == 1000, middle = 500
- left == 500, right == 1000, middle = 750
- left == 500, right == 750, middle = 625
- left == 500, right == 625, middle = ...
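The halving can be watched step by step with a short sketch. To keep it small, it assumes that element i of the sorted Vector simply holds the value i, so comparing indices stands in for compareTo. The class name HalvingTrace and the target position 510 are arbitrary choices for illustration; any target between 500 and 624 starts with the same steps as the trace above.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: HalvingTrace is a hypothetical name.
final class HalvingTrace {
    // Records {left, right, middle} for each pass through the search loop.
    // Assumes size >= 1 and that element i holds the value i.
    static List<int[]> trace(int size, int target) {
        List<int[]> steps = new ArrayList<>();
        int left = 0;
        int right = size;
        while (left != right - 1) {
            int middle = (left + right) / 2;
            steps.add(new int[] {left, right, middle});
            if (middle < target) {
                left = middle;
            } else if (middle > target) {
                right = middle;
            } else {
                left = middle;
                right = middle + 1;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        for (int[] step : trace(1000, 510)) {
            System.out.println("left == " + step[0]
                + ", right == " + step[1]
                + ", middle = " + step[2]);
        }
    }
}
```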
Efficiency: Suppose we needed to search a Vector of size 1000 for a word.
Could use:
- Sequential search (as we did initially)
- Binary search (as we have just described)
Which is "better"? Well, if the Vector is sorted:
- Sequential search: Have to look at each element until found. For a 1000
  element Vector, on average 500 comparisons.
- Binary search: Well, how many times can you divide 1000 by 2? 1000, 500,
  250, 125, 62, 31, 15, 7, 3, 1: about 10 times
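These two estimates can be checked by counting. The sketch below again assumes that element i holds the value i, and it counts only the elements examined inside the loop (the final equals check in binarySearch would add one more). ProbeCount and its method names are hypothetical.

```java
// Sketch only: ProbeCount and its methods are hypothetical names.
final class ProbeCount {
    // Elements a sequential search examines before finding the item at index target.
    static int sequentialProbes(int target) {
        return target + 1;
    }

    // Passes through the binary search loop; assumes size >= 1
    // and that element i holds the value i.
    static int binaryProbes(int size, int target) {
        int left = 0;
        int right = size;
        int probes = 0;
        while (left != right - 1) {
            int middle = (left + right) / 2;
            probes++;
            if (middle < target) {
                left = middle;
            } else if (middle > target) {
                right = middle;
            } else {
                left = middle;
                right = middle + 1;
            }
        }
        return probes;
    }

    // Average sequential cost when every position is equally likely.
    static double averageSequentialProbes(int size) {
        long total = 0;
        for (int target = 0; target < size; target++) {
            total += sequentialProbes(target);
        }
        return (double) total / size;
    }

    // Largest binary search cost over all positions.
    static int worstBinaryProbes(int size) {
        int worst = 0;
        for (int target = 0; target < size; target++) {
            worst = Math.max(worst, binaryProbes(size, target));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("sequential average: " + averageSequentialProbes(1000));
        System.out.println("binary worst case:  " + worstBinaryProbes(1000));
    }
}
```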
But binary search assumes a sorted Vector. If not sorted, then:
- Search sequentially, or
- Sort the Vector first, or
- Keep the Vector sorted. Stay tuned.
So what's the point? Good algorithm design can have a big impact on
performance! For example, compare the performance of sequential versus
binary search:
| N (number of elements) | log2(N) | N/2    | (N/2)/log2(N) (RATIO) |
| 1000                   | 10      | 500    | 50                    |
| 5000                   | 13      | 2500   | 192                   |
| 10000                  | 14      | 5000   | 357                   |
| 1000000                | 20      | 500000 | 25000                 |
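The table entries can be reproduced with integer arithmetic, assuming log2(N) is rounded up to the next whole number and the ratio is truncated. The sketch below uses hypothetical names (SearchCostTable, ceilLog2, ratio).

```java
// Sketch only: SearchCostTable, ceilLog2 and ratio are hypothetical names.
final class SearchCostTable {
    // Smallest k with 2^k >= n, i.e. log2(n) rounded up.
    static int ceilLog2(int n) {
        int k = 0;
        long power = 1;
        while (power < n) {
            power *= 2;
            k++;
        }
        return k;
    }

    // Average sequential cost (n / 2) divided by binary cost, truncated.
    static int ratio(int n) {
        return (n / 2) / ceilLog2(n);
    }

    public static void main(String[] args) {
        int[] sizes = {1000, 5000, 10000, 1000000};
        for (int n : sizes) {
            System.out.println("| " + n + " | " + ceilLog2(n)
                + " | " + (n / 2) + " | " + ratio(n) + " |");
        }
    }
}
```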
Last modified: 4/11/99