Expected time bounds for selection - CiteSeerX


Programming Techniques

G. Manacher, Editor

Expected Time Bounds for Selection

Robert W. Floyd and Ronald L. Rivest
Stanford University

A new selection algorithm is presented which is shown to be very efficient on the average, both theoretically and practically. The number of comparisons used to select the ith smallest of n numbers is $n + \min(i, n-i) + o(n)$. A lower bound within 9 percent of the above formula is also derived.

Key Words and Phrases: selection, computational complexity, medians, tournaments, quantiles

CR Categories: 5.30, 5.39

Let $f(i,n)$ denote the expected number of comparisons required to select $i\,\theta\,X$, the ith smallest element of a set X of n numbers. (We assume throughout that all possible input orderings of the set X are equally likely.) Since a selection algorithm must determine, for every $t \in X$, $t \ne i\,\theta\,X$, whether $t < i\,\theta\,X$ or $i\,\theta\,X < t$, we have as a trivial lower bound

$$f(i,n) \ge n - 1, \quad \text{for } 1 \le i \le n. \tag{1}$$

The best previously published selection algorithm is FIND, by C.A.R. Hoare [3]. Knuth [4] has determined the average number of comparisons used by FIND, thus proving that

$$f(i,n) \le 2\bigl((n+1)H_n - (n+3-i)H_{n-i+1} - (i+2)H_i + n + 3\bigr), \tag{2}$$

where

$$H_n = \sum_{1 \le j \le n} j^{-1}. \tag{3}$$

Communications of the ACM, March 1975, Volume 18, Number 3

[...] $> (2\pi\alpha(1-\alpha)\epsilon^2)^{-1}$. So except for a finite number of comparisons near the end, the probability that any element is $i\,\theta\,X$ is at most $\epsilon$. As $n \to \infty$, these latter comparisons form a negligible proportion of the total number of comparisons made, and their effect on the probability that an average joining comparison will be a key comparison becomes insignificant. We will therefore assume from now on that the probability that either element being compared is $i\,\theta\,X$ is zero.

To derive $F_k(\alpha)$ we need to compute the probability that each joining comparison in which the smaller fragment has at most k elements will turn out to be a key comparison. These comparisons can be divided into two types: those for which both fragments have k or fewer elements, and those for which only one fragment has k or fewer elements. The first type is somewhat simpler to handle so we shall treat it first, by means of an example. Consider the comparison of the smaller of a pair of elements x < z to an isolated element y:

(39) [diagram: the pair x < z and the isolated element y]

As a result of this comparison, we will end up with either

(40) [diagram: the two possible resulting fragments]

The probabilities of these two outcomes are not equal: the first occurs with probability 2/3 while the second occurs with probability 1/3. This happens because the first outcome is consistent with the two permutations x < y < z and x < z < y, whereas the second outcome is only consistent with y < x < z. Since each permutation consistent with the input fragments is equally likely, the probability of each outcome is proportional to the number of permutations consistent with that outcome.

We must now consider each permutation consistent with the input fragments separately, since to determine whether x : y is a key comparison requires knowing the relative order of x, y, $i\,\theta\,X$, and all elements previously compared to either x or y. Let us consider the permutation x < y < z first, consistent with the first outcome. With respect to $i\,\theta\,X$, these three elements may be in one of four positions. That is, $i\,\theta\,X$ may be greater than from zero to three of these three elements. In only two of these cases will x : y turn out to be a key comparison: (i) $i\,\theta\,X < x < y < z$: this will be a key comparison for y, (ii) x [...]

[...] horizontal lines to indicate the relative positions of $i\,\theta\,X$ that make x : y a key comparison:

(41) [diagram: the three consistent permutations, with horizontal lines marking positions of $i\,\theta\,X$]

The total probability that x : y turns out to be a key comparison is thus the average probability that x : y is a key comparison in each of these three cases. This is just (finally!):

P(x : y [...]
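The 2/3 versus 1/3 split for the two outcomes of comparing x with y can be checked by brute-force enumeration. The Python sketch below uses a hypothetical helper name. It lists the orderings of x, y, z that are consistent with the known relation x < z and counts how many give each outcome. It covers only the outcome probabilities, not the key-comparison analysis.

```python
from fractions import Fraction
from itertools import permutations


def outcome_probabilities():
    """Return (P[x < y], P[y < x]) given only that x < z is already known."""
    consistent = []
    for order in permutations(("x", "y", "z")):   # each order lists smallest first
        rank = {name: pos for pos, name in enumerate(order)}
        if rank["x"] < rank["z"]:                 # keep orderings consistent with x < z
            consistent.append(rank)
    x_smaller = sum(1 for rank in consistent if rank["x"] < rank["y"])
    total = len(consistent)
    return Fraction(x_smaller, total), Fraction(total - x_smaller, total)


if __name__ == "__main__":
    print(outcome_probabilities())
```

Three orderings survive the filter (x < y < z, x < z < y, and y < x < z), and x is the smaller element in two of them, which matches the counting argument in the text.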
