For each of the sample programs do the following:
Other and from the list select
Java Main Class (only need to go throuh Other once).
Class Name: LinearSearch package: miscell
import java.util.*;
miscell,
right-clicking and selecting Run File.
package miscell;
import java.util.*;
public class LinearSearch1 {
public static void main(String[] args) {
Random rand = new Random();
int size = 20;
int A[] = new int[size], key;
for (int i = 0; i < A.length; ++i) {
A[i] = rand.nextInt(size * 2);
}
key = rand.nextInt(size * 2);
System.out.println("A = " + Arrays.toString(A));
System.out.println("key = " + key);
int i;
for (i = 0; i < A.length; ++i) {
if (key == A[i]) {
break;
}
}
boolean found = (i != A.length);
System.out.println("found = " + found);
System.out.println("comparisons: " + (found ? i + 1 : size));
}
}
int binarySearch(int[] a, int key) int binarySearch(int[] a, int fromIndex, int toIndex, int key)The "toIndex" argument name is misleading since the range is really between fromIndex and toIndex 1, inclusive. Thus these two calls are equivalent:
java.util.Arrays.binarySearch(A, key); java.util.Arrays.binarySearch(A, 0, A.length, key);The int return value is supposed to be the position at which the key is found. If not found, a negative value is returned. These algorithm specifications suggest a more general way to write linear search, namely using our own package, util, and our own class MyArrays with two versions:
util.MyArrays.linearSearch(int[] a, int key) util.MyArrays.linearSearch(int[] a, int fromIndex, int toIndex, int key)We should also consider what to do if the latter of these calls is provided with invalid arguments such as these:
linearSearch(A, -3, 2, 15); linearSearch(A, 3, 2, 15);In these cases, we want to throw an Exception. A few tests of Arrays.binarySearch leads us to add this starter code:
if (fromIndex < 0 || toIndex < 0 || fromIndex > toIndex) {
throw new IllegalArgumentException();
}
In Netbeans, create the Java Class:
Class Name MyArrays package utilwith this content:
package miscell;
import java.util.*;
import util.MyArrays;
public class LinearSearch2 {
public static void main(String[] args) {
Random rand = new Random();
int size = 20;
int A[] = new int[size], key;
for (int i = 0; i < A.length; ++i) {
A[i] = rand.nextInt(size * 2);
}
key = rand.nextInt(size * 2);
System.out.println( "A = " + Arrays.toString(A) );
System.out.println( "key = " + key );
int pos = MyArrays.linearSearch(A, key);
System.out.println(
"\n" +
"linear search" + "\n" +
"return = " + pos + "\n" +
"comparisons: " + (pos >= 0 ? pos+1 : size )
);
}
}
public static <T> int linearSearch(T[] A, int fromIndex, int toIndex, T key) {
if (fromIndex < 0 || toIndex < 0 || fromIndex > toIndex) {
throw new IllegalArgumentException();
}
int i;
for (i = fromIndex; i < toIndex; ++i) {
if (key.equals(A[i])) {
return i;
}
}
return -1;
}
public static <T> int linearSearch(T[] A, T key) {
return linearSearch(A, 0, A.length, key);
}
A[i].equals(key)say, for String type, is not a "fixed" cost and likely to be more costly than comparison for primitive types.
T(n) ≤ D1 + D * nwhere D1 represent the initializations and the D factor counts the comparison plus all the variable manipulations.
T(n) = O(f(n)) or T(n) is in O(f(n))
if there are constants C and k, such that
T(n) ≤ C * f(n), for all n ≥ kThe idea of "n ≥ k" means eventually it has this behavior if we ignore some initial finite portion. For example, suppose
T(n) = 100 + nThis indicates that there is some fixed initial cost regardless of the problem size. We can say:
T(n) ≤ 101 * n, for all n ≥ 1but we can also say
T(n) ≤ 2 * n, for all n ≥ 100getting a "better" asymptotic constant, C, in the sense that it is smaller (we can always make it larger). Going back to our worst-case for linear search, write:
T(n) ≤ D1 + D * n = (D + D1/n) * nIf n ≥ D1, then T(n) ≤ (D+1) * n, and so T(n) = O(n), using the the "C" constant C = D+1.
the worst case is O(n); the best case is O(1)
So we assume that it is equally likely to find the key in each of the n positions. Using this notion, the average cost is:
(total cost for all n position) / nThe cost for finding in position i is (i+1), i = 0…n-1;, so we get
1/n * ∑i = 0,…,n-1(i+1) = 1/n * ∑i = 1,…,ni = 1/n * (n*(n+1))/2 = (n+1)/2
n, n/2, n/4, ..., 1How many terms in this sequence? Roughly,
log2nIt comes from the definition of logarithm to the base b:
logbn = x means bx = n, i.e., blogbn = nAll bases differ only by a constant:
blogb2 = 2 (blogb2)x = blogb2*x = 2x blogb2*log2n = 2log2n = nso
logbn = logb2*log2nCommon bases:
log n = log2nAs far as order measures are concerned, all bases are equal.
int binarySearch(int[] A, int fromIndex, int toIndex, int key) {
if (fromIndex == toIndex) {
return - 1;
}
int mid = (fromIndex + toIndex) / 2;
if (key == A[mid]) {
return mid;
}
// else
if (key < A[mid]) {
return binarySearch(A, fromIndex, mid, key);
}
// else // key > A[mid]
return binarySearch(A, mid+1, toIndex, key);
}
The else keywords are optional because of the return statements.
One has to keep in mind that the division used to compute mid
is integer division (i.e., truncation).
As is the case with linearSearch, the toIndex parameter is off the right
edge of the array. Our textbook makes, in my opinion, the unfortunate
decision to make the second parameter position in his array-based
algorithms the rightmost index, so beware.
The execution ofActually we should say something about what pos means when it is a negative value on a failed search, but this is sufficient for now. The proof is by induction on the array size, len = toIndex - fromIndex. It is a good idea to run a few examples by hand. Again, keep in mind that computer integer division is trunctated. In mathematical terms, division of two integers with integer result:int pos = binarySearch(A, fromIndex, toIndex, key)sets pos ≥ 0 if and only if key is present in A at position pos, wherefromIndex ≤ pos < toIndex
a/bis expressed using the "floor" function flr, which simply truncates any decimal part.
mid = (fromIndex + toIndex)/2 = (2*fromIndex + 1)/2 = fromIndexwe see if the key is at fromIndex and then either
mid = (fromIndex + toIndex)/2 = (2*fromIndex + 2*k)/2 = fromIndex + kand thus,
fromIndex < mid = fromIndex + k < fromIndex + 2*k = toIndexThe left-hand side has this many elements:
mid - fromIndex = kand the right-hand side has this many elements:
toIndex - (mid+1) = fromIndex + 2*k - (fromIndex + k + 1) = k - 1Since k ≥ 1, in both cases the values are less than len = 2 * k.
mid = (fromIndex + toIndex)/2 = (2*fromIndex + 2*k + 1)/2 = fromIndex + kand thus,
fromIndex ≤ mid = fromIndex + k < fromIndex + 2*k + 1 = toIndexThe left-hand side has this many elements:
mid - fromIndex = kand the right-hand side has this many elements:
toIndex - (mid+1) = fromIndex + 2*k + 1 - (fromIndex + k + 1) = kBoth sides have k elements which is less than len = 2 * k + 1.
n even => left side n/2, right side n/2 - 1 n odd => both sides have size (n-1)/2 = n/2Let
T(1) = 1 T(n) = 1 + T( n/2 ), n > 1Look at some computed values
T(2) = 2 T(3) = 2 T(4) = 3 T(5) = 3Claim:
T(n) ≤ 1 + log n
T(n) = 1 + T(n/2)
≤ 1 + 1 + log( n/2 ) (induction, since flr(n/2) < n)
= 1 + 1 + (log(n) - 1) (property of log)
= 1 + log(n) (simple algebra)
And so, since we can effectively ignore the "1", we say:
The average time for binary search is O(log n)
Count(h) = ∑i = 0,…,h-1 2i* (i+1)we can derive this relation for the averge time, T(n):
1/n * Count( flr(log n) ) ≤ T(n) ≤ 1/n * Count( 1 + flr(log n) )The value of Count(h) is a well-known series sum:
Count(h) = (h 1) * 2h + 1Thus, in particular, using the left-hand side of the above inequality, we can determine that:
T(n) ≥ 1/n * ( flr(log n) 1 ) * 2flr(log n) + 1Using the fact that flr(log n) > log(n) 1, a substitution gives:
T(n) ≥ 1/n * (log(n) 2) * 2(log(n)1)
= 1/n * (log(n) 2) * ½ n
= ½ log(n) 1
Thus the average time is still logarithmic with multiplicative
constant somewhere between ½ seen here and the worst case, 1,
computed above.
package miscell;
import java.util.*;
//import util.MyArrays;
public class BinarySearch {
public static void main(String[] args) {
Random rand = new Random();
int size = 20;
int A[] = new int[size], key;
for (int i = 0; i < A.length; ++i) {
A[i] = rand.nextInt(size * 2);
}
key = rand.nextInt(size * 2);
/*
for (int i = 0; i < A.length; ++i) {
A[i] = i+1;
}
key = 0;
key = 21;
*/
System.out.println("A (initial) = " + Arrays.toString(A));
Arrays.sort(A);
System.out.println("A (sorted) = " + Arrays.toString(A));
System.out.println("key = " + key);
int pos;
pos = Arrays.binarySearch(A, key);
System.out.println(
"\n" +
"binary search from Arrays class" + "\n" +
"return = " + pos + "\n" +
"found = " + (pos >= 0) + "\n"
);
}
}
return -1;However, we can provide better information which indicates the position that the key should be added. This is done by:
return - fromIndex - 1;We are insured that this is negative under all circumstances. Here's how we can use it:
int insert_pos = - ret + 1;Then if we
| 0 | 10 | 20 | 30 | 40 |
| 0 | 1 | 2 | 3 | 4 |
public static int binarySearch(int[] A, int fromIndex, int toIndex, int key) {
if (fromIndex < 0 || toIndex < 0 || fromIndex > toIndex) {
throw new IllegalArgumentException();
}
The reason is that in the recursive call stack which happens, only the first
call needs to check for invalid arguments; the other calls will never have
this issue. If we wanted to really be super-efficient, a better approach
would be to have the public binarySearch function call a private recursive
function like this:
public static int binarySearch(int[] A, int fromIndex, int toIndex, int key) {
if (fromIndex < 0 || toIndex < 0 || fromIndex > toIndex) {
throw new IllegalArgumentException();
}
rec_binSearch(A, fromIndex, toIndex, key);
}
private static int rec_binarySearch(int[] A, int fromIndex, int toIndex, int key) {
if (fromIndex == toIndex) { return - fromIndex - 1; }
++count;
int mid = (fromIndex + toIndex) / 2;
if (key == A[mid]) {
return mid;
}
if (key < A[mid]) {
return rec_binarySearch(A, fromIndex, mid, key);
}
// else key > A[mid]
return rec_binarySearch(A, mid+1, toIndex, key);
}
Although this approach is probably overkill, the separation of the
public interface function calling a private recursive helper is quite
common in practice.
MyArrays.setCount(0);
pos = MyArrays.binarySearch(A, key);
System.out.println(
"\n" +
"binary search from MyArrays class" + "\n" +
"return = " + pos + "\n" +
"comparisons: " + MyArrays.getCount()
);
public static <T> int binarySearch(T[] A, int fromIndex, int toIndex, T key) {
if (fromIndex == toIndex) {
return - fromIndex - 1;
}
++count;
int mid = (fromIndex + toIndex) / 2;
int comp = ((Comparable) key).compareTo(A[mid]);
if (comp == 0) {
return mid;
}
if (comp < 0) {
return binarySearch(A, fromIndex, mid, key);
}
// else comp > 0
return binarySearch(A, mid+1, toIndex, key);
}
public static <T> int binarySearch(T[] A, T key) {
return binarySearch(A, 0, A.length, key);
}
In particular, we must drop the normal inequality
comparison operations in preference
of the compareTo member function.
Unlike the equals function which is defined
at the Object level, the
compareTo function is only defined for
Comparable objects.
Thus we assume the array element type, T,
is Comparable by forcing a cast, getting this code:
int comp = ((Comparable) key).compareTo( A[mid] );
The value of comp determines one of the three relations:
T(1) = 1 T(n) = 2 * T(n/2) + O(n), n > 1We want to solve this relation since this is a very common one in practice for divide-and-conquer algorithms:
T(n) = 2 * T(n/2) + f(n)where there exist constants C1 and k such that:
f(n) ≤ C1 * n, for all n ≥ kIt is relatively easy to argue that there is a possibly larger constant, C, such that:
f(n) ≤ C * n, for all n ≥ 1Simply choose C ≥ C1 so that this inequality holds for the (finite) list of values n = 1,…,k-1. So we can now write:
T(1) = 1 T(n) ≤ 2 * T(n/2) + C*n, n > 1Let's compare T(n) and n * log n for the first few values of n > 1
| T(n) | n * log n | |
T(2) ≤ 2*T(1) + C = 2+C*2 = (C+1) * 2 T(3) ≤ 2*T(1) + C*3 = 2+3*C < (C+1) * 3 T(4) ≤ 2*T(2) + C*4 = 2*(2*(C+1)) + C*4 < (C+1) * 8 |
2 = 2 * log 2 3 < 3 * log 3 8 = 4 * log 4 |
T(n) ≤ (C+1) * n * log n, n > 1We've verified that this works up to 4, so assume that n > 4 and the inequality holds for all values up to (but not including) n. The induction goes like this:
T(n) ≤ 2 * T(n/2) + C*n
≤ 2 * (C+1) * n * log(n/2) + C*n
= 2 * (C+1) * n/2 * ( log(n) 1 ) + C*n
= (C+1) * n * log(n) (C+1) * n + C*n
= (C+1) * n * log(n) n
< (C+1) * n * log(n)
and the induction is proved.
O(1)
O(log n)
O(n)
O(n * log n)
O(n2)
O(n2 * log n)
O(bn) (each b generates a separate order class)
These order classes are upwardly inclusive, i.e.,
if T(n) = O(log n), then of course, T(n) = O(n). We're usually
interested in the "best fit" in the sense of finding the smallest order
class to which T(n) belongs.
In order to characterize the "best fit" of an order class, we need
two other notions:
T(n) = Ω(f(n)) or T(n) is in Ω(f(n))
if there are constants C and k, such that
T(n) ≥ C * f(n), for all n ≥ k
T(n) = Θ(f(n)) or T(n) is in Θ(f(n))
if T(n) = O(f(n))
and
T(n) = Ω(f(n))
This means that there are constants
C1, C2 and k, such that
C1 * f(n) ≤ T(n) ≤ C2 * f(n), for all n ≥ k
T(n) = 100 * n + 200 * n2 + n3We can igore all but the highest power term. In doing so, we might say:
100 * n + 200 * n2 = o(n3)The latter (exact match) is actually quite important, but unfortunately not explicitly defined in the Weiss textbook although the author uses is it on a number of occasions. For example, on page 5, he expresses these well-known mathematical facts:
N*(N+1)/2 ≈ ½ N2 ∑i=1…N 1/i ≈ ln(N)Authors are particularly keen on knowing the actual constants relative to the number of operations performed. For example, we can say that for linear search, counting the number of comparisons:
linear search average case for successful search (n) = (n+1)/2 ≈ ½ n
limx → ∞ log2(x)/x = limx → ∞log2(e) * ln(x)/x = limx → ∞ log2(e) * ln′(x) / x′ = limx → ∞ log2(e) * 1/x / 1 = 0and so
log(n) = o(n)
n = 2 * n/2, if n even n = 2 * n/2 + 1, if n oddThroughout this section, for convenience, we'll represent integer truncated division by simple division, i.e.,
n/2 is really flr(n/2)Using properties of exponents, we have:
x0 = 1 x1 = xand then we can write either:
n even: xn = (x2)n/2 n odd: xn = x * (x2)n/2
n even: xn = (xn/2)2 n odd: xn = x * (xn/2)2
public static double powA(double x, int n) {
if (n == 0)
return 1;
if (n == 1)
return x;
if (n % 2 == 0)
return powA( x * x, n/2 );
//else
return x * powA( x * x, n/2 );
}
public static double powB(double x, int n) {
if (n == 0)
return 1;
if (n == 1)
return x;
double val = powB(x, n/2);
if (n % 2 == 0)
return val * val;
//else
return x * val * val;
}
public static void main(String[] args) throws Exception {
double x = 2.0;
int n = 10;
System.out.println(
"powA(" + x + "," + n + ") = " + powA(x,n)
+ "\n" +
"powB(" + x + "," + n + ") = " + powB(x,n)
);
}
The proof of correctness is more-or-less obvious for the recursive
functions powA and powB.
If we count each multiplication as "1", then,
for the worst-case time function T(n), we compute:
T(0) = 0 T(1) = 0 T(n) = T(n/2) + 2, n ≥ 2It can be proved by induction that:
T(n) = O( log(n) )Note that we were careful not to write the powB function like this:
if (n % 2 == 0) return powB(x, n/2) * powB(x, n/2); else return x * powB(x, n/2) * powB(x, n/2);The reason is that it would make the recurrence relation this:
T(n) = 2 * T(n) + 2and the 2 * T(n) factor causes T(n) to enter a different order class. It can be proved that
T(n) = O( n * log(n) )
public static double powC(double x, int n) {
double val = 1;
while(n != 0) {
if (n % 2 == 1) val *= x;
n = n/2;
x = x*x;
}
return val;
}
In particular, an iterative algorithm uses
variables which change state in order to control the iteration.
A proof requires that you
establish a loop invariant, which is a logical statement
expressed using the program variables so that:
b = x; p = n;
double val = 1;
while(n != 0) {
if (n % 2 == 1) val *= x;
n = n/2;
x = x*x;
}
The variables
b (base) and p (power)
represent the initial values of n and x,
respectively. We want to argue that after the loop terminates:
val = bpThe additional annotated variables b (base) and p are needed to express the invariant because x and n change state. The invariant we want is this:
val * xn = bpClearly it is true initially. Assume it is true up to a certain point and then consider the next iteration step. Let val′, n′, and x′ be the new values of val, n and x, respectively, then:
x′ = x * x
n′ = n/2
val′ = val * x, if n odd
= val, if n even
We must show:
val′ * (x′)n′ = val * xnThere are two cases:
val′ * (x′)n′ = val * (x2)n/2 = val * x2*(n/2) = val * xn
val′ * (x′)n′ = val * x * (x2)n/2 = val * x2*(n/2) + 1 = val * xn
val * x0 = bp, i.e., val = bp
public static int gcd(int m, int n) {
System.out.println( "gcd(" + m + "," + n + ")" );
if (m==0 || m == n) return n;
// so, m > 0 and m < n
return gcd(n % m, m);
}
public static void main(String[] args) {
int m = Integer.parseInt( javax.swing.JOptionPane.showInputDialog("First") );
int n = Integer.parseInt( javax.swing.JOptionPane.showInputDialog("Second") );
if (m > n) { int tmp=m; m=n; n=tmp; } // swap m & n if m > n
System.out.println( gcd(m,n) ); // guaranteed that m <= n
}
In order to argue various points about the algorithm, we need to
know how the integer division operators ( / and % )
are defined.
Assuming 0 < m,
n/m = q and n % m = r
where q and r satisfy:
n = m * q + r where r < m
m = d * x and n = d * yand so,
r = n m * q = d * x d * y * q = d * ( x y * q )showing that d must be a divisor of r; thus d is a divisor of r,m. The converse argument is similar.
r = n % m ≤ flr(n/2)There are two cases:
x = n mThen
x < n flr(n/2) ≤ flr(n/2) + 1and so,
x ≤ flr(n/2) < mWriting the equation:
n = m * 1 + xsince x < m, x must be the division remainder, r (and the quotient is 1), Thus r ≤ flr(n/2)
gcd( m, n )
= gcd( r, m ) where r = n % m
= gcd( r1,r ) where r1 = m % r
So after two steps, n has been replaced by r
which is less than flr(n/2). Thus we can derive the recurrence
relation:
T(0) = 0 T(n) ≤ 2 + T( flr(n/2) ), n > 0We can prove by induction that:
T(n) ≤ 2 * log n, n ≥ 2So we can say that T(n) = O(log n).
Cworst = 1 / log2φ ≅ 1.44where φ=(1+√5)/2 ≅ 1.62 is the so-called golden mean. Vastly more difficult is the analysis of the average case. According to the textbook, this constant is
Cavg = 12 * (ln 2)2/ π2 ≅ .584One final observations is that to truly analyze GCD, you cannot really count 1 for a division. In fact, the number of operations in a division step is linear in the number of digits in the dividend, i.e., the cost of dividing into n actually O(log n).
static class Answer {
public int fromInd;
public int toInd;
public int sum;
public Answer(int fromInd, int toInd, int sum) {
this.fromInd = fromInd; this.toInd = toInd; this.sum = sum;
}
@Override
public String toString() {
return "sum: " + sum + ", from: " + fromInd + " to: " + toInd;
}
}
static Answer maxsubsum(int[] A) {
int maxSum = 0;
int thisSum = 0;
int i = 0, toInd = 0, fromInd = 0;
for (int j = 0; j < A.length; ++j) {
thisSum += A[j];
if (thisSum > maxSum) {
toInd = j;
fromInd = i;
maxSum = thisSum;
}
else if (thisSum < 0) {
thisSum = 0;
i = j+1;
System.out.println("i change: " + i);
}
}
return new Answer(fromInd, toInd, maxSum);
}
public static void main(String[] args) {
int[] A = { -1, 4, -3, -7, 5, -2, 1, 2, 6, -2 };
System.out.println( java.util.Arrays.toString( A ) );
System.out.println( maxsubsum(A) );
}
sum(p,q) = A[p] + A[p+1] + … + A[q-1] // not including index q maxSeqSum(a,b) = max∀i,j:a≤i≤j≤bsum(i,j)So, we want to prove is that, when the loop terminates,
maxSum = maxSeqSum(0,n)The loop invariant is the conjunction (i.e., the logical and) of these clauses:
thisSum = sum(i,j) ∀ p = i…j, sum(i,p) ≥ 0 maxSeqSum(0,n) = max( maxSeqSum(0,i), maxSeqSum(i,n) ) maxSum = maxSeqSum(0,j)With these variable initializations it is trivial to verify that the invariant is true.
i = j = 0; thisSum = 0; maxSum = 0;Taking the iterative step, use prime (′) to represent the values of variables for the next iteration. In particular, j′ = j+1. There are two cases to consider:
maxSum′ = maxSeqSum(0,j)
maxSeqSum(i,n) = max( maxSeqSum(i,j+1), maxSumSeq(j+1,n) )There are two key points to understand this idea: