I need a clarified and detailed derivation of the mean and variance of a hypergeometric distribution. If a box contains $N$ balls, $a$ of them black and $N-a$ white, and $n$ balls are drawn at random without replacement, then the probability of getting $x$ black balls (and obviously $n-x$ white balls) is given by the following pmf. The pmf is

$$f(x) = \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}.$$

The mean is given by

$$\mu = E(x) = np = \frac{na}{N}$$

and the variance by

$$\sigma^2 = \frac{na(N-a)(N-n)}{N^2(N-1)} = npq\left[\frac{N-n}{N-1}\right],$$

where $q = 1-p = (N-a)/N$. I want the step-by-step procedure to derive the mean and variance. Thank you.

This is a rather old question but it is worth revisiting this computation. Let

$$\Pr[X = x] = \frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}},$$

where I have used $m$ instead of $a$. We can ignore the details of specifying the support if we use the convention that binomial coefficients outside their range evaluate to zero; e.g., $\binom{m}{k} = 0$ if $k \notin \{0, \ldots, m\}$. Then we observe the identity

$$x\binom{m}{x} = \frac{m!}{(x-1)!\,(m-x)!} = \frac{m\,(m-1)!}{(x-1)!\,\bigl((m-1)-(x-1)\bigr)!} = m\binom{m-1}{x-1},$$

whenever both binomial coefficients exist. Thus

$$x\Pr[X = x] = m\,\frac{\binom{m-1}{x-1}\binom{N-m}{n-x}}{\binom{N}{n}},$$

and we see that

$$E[X] = \frac{mn}{N}\sum_x \frac{\binom{m-1}{x-1}\binom{(N-1)-(m-1)}{(n-1)-(x-1)}}{\binom{N-1}{n-1}},$$

and the sum is simply the sum of probabilities for a hypergeometric distribution with parameters $N-1$, $m-1$, $n-1$ and is equal to $1$. Therefore, the expectation is $E[X] = mn/N$.

To get the second moment, consider

$$x(x-1)\binom{m}{x} = m(x-1)\binom{m-1}{x-1} = m(m-1)\binom{m-2}{x-2},$$

which is just an iteration of the first identity we used. Consequently

$$x(x-1)\Pr[X = x] = \frac{m(m-1)\,n(n-1)}{N(N-1)}\cdot\frac{\binom{m-2}{x-2}\binom{(N-2)-(m-2)}{(n-2)-(x-2)}}{\binom{N-2}{n-2}},$$

and again by the same reasoning, we find

$$E[X(X-1)] = \frac{m(m-1)\,n(n-1)}{N(N-1)}.$$

It is now quite easy to see that the "factorial moment" is

$$E[X(X-1)\cdots(X-k+1)] = \prod_{j=0}^{k-1}\frac{(m-j)(n-j)}{N-j}.$$

In fact, we can write this in terms of binomial coefficients as well:

$$E\left[\binom{X}{k}\right] = \frac{\binom{m}{k}\binom{n}{k}}{\binom{N}{k}}.$$

This gives us a way to recover raw and central moments; e.g.,

$$\operatorname{Var}[X] = E[X^2] - E[X]^2 = E[X(X-1) + X] - E[X]^2 = E[X(X-1)] + E[X]\bigl(1 - E[X]\bigr),$$

so

$$\operatorname{Var}[X] = \frac{m(m-1)\,n(n-1)}{N(N-1)} + \frac{mn}{N}\left(1 - \frac{mn}{N}\right) = \frac{mn(N-m)(N-n)}{N^2(N-1)},$$

for example. What is nice about the above derivation is that the formula for the expectation of $\binom{X}{k}$ is very simple to remember.

The trials are not independent, but they are identically distributed, and indeed, exchangeable, so that the covariance between two of them doesn't depend on which two they are. The expected number of black balls on any one trial is $a/N$, so just add that up $n$ times. The variance for one trial is

$$pq = p(1-p) = \frac{a}{N}\left(1 - \frac{a}{N}\right),$$

but you also need the covariance between two trials. The probability of getting a black ball on both of the first two trials is

$$\frac{a(a-1)}{N(N-1)}.$$

So the covariance is

$$\operatorname{cov}(X_1, X_2) = E(X_1 X_2) - (E X_1)(E X_2) = \Pr(X_1 = X_2 = 1) - \bigl(\Pr(X_1 = 1)\bigr)^2 = \frac{a(a-1)}{N(N-1)} - \left(\frac{a}{N}\right)^2.$$

Add up $n$ variances and $n(n-1)$ covariances to get the variance:

$$\operatorname{var}(X_1 + \cdots + X_n) = \sum_{i} \operatorname{var}(X_i) + \sum_{i \ne j} \operatorname{cov}(X_i, X_j).$$

(You'll need to do a bit of routine algebraic simplification.)
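
For readers who want the "routine algebraic simplification" in the covariance argument spelled out, here is one way it can go. This is a sketch under the assumption that $X_i$ denotes the indicator of a black ball on the $i$-th draw (so $X = X_1 + \cdots + X_n$), with $a$ black balls among $N$; the indicator labelling is my reading of the argument rather than something stated explicitly.

$$\begin{aligned}
\operatorname{cov}(X_1, X_2) &= \frac{a(a-1)}{N(N-1)} - \frac{a^2}{N^2} = \frac{a\bigl[N(a-1) - a(N-1)\bigr]}{N^2(N-1)} = -\frac{a(N-a)}{N^2(N-1)}, \\
\operatorname{var}(X) &= n\cdot\frac{a(N-a)}{N^2} - n(n-1)\cdot\frac{a(N-a)}{N^2(N-1)} \\
&= \frac{na(N-a)}{N^2}\left[1 - \frac{n-1}{N-1}\right] = \frac{na(N-a)(N-n)}{N^2(N-1)}.
\end{aligned}$$

The covariance is negative, as one would expect when drawing without replacement: a black ball on one draw makes a black ball on another draw slightly less likely.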

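As a sanity check on the closed forms for the mean, the second factorial moment, and the variance, the short Python sketch below sums over the pmf directly using exact rational arithmetic. The function names `hypergeom_pmf` and `moments` and the parameter values are illustrative choices of mine, not part of the derivation; `m` plays the role of the number of black balls.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, N, m, n):
    """Exact Pr[X = x]: x black balls among n drawn from N balls, m of them black."""
    # Outside the support the binomial coefficients are taken to be zero.
    if x < 0 or x > m or n - x < 0 or n - x > N - m:
        return Fraction(0)
    return Fraction(comb(m, x) * comb(N - m, n - x), comb(N, n))


def moments(N, m, n):
    """Return (mean, variance, E[X(X-1)]) by brute-force summation over the pmf."""
    xs = range(n + 1)
    mean = sum(x * hypergeom_pmf(x, N, m, n) for x in xs)
    fact2 = sum(x * (x - 1) * hypergeom_pmf(x, N, m, n) for x in xs)
    var = sum((x - mean) ** 2 * hypergeom_pmf(x, N, m, n) for x in xs)
    return mean, var, fact2


# Arbitrary illustrative parameters (hypothetical values, chosen only for the check).
N, m, n = 20, 7, 5
mean, var, fact2 = moments(N, m, n)

# Compare the brute-force sums with the closed-form expressions.
print(mean == Fraction(m * n, N))
print(fact2 == Fraction(m * (m - 1) * n * (n - 1), N * (N - 1)))
print(var == Fraction(m * n * (N - m) * (N - n), N**2 * (N - 1)))
print(var == fact2 + mean * (1 - mean))
```

Using `Fraction` rather than floating point means the comparisons are exact, so any mismatch would indicate an error in a formula rather than rounding.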