---start biostat.lec.02.10.97---

there was a prize fight between two big dumb guys when instructor was 9 y.o. the loser was on his knees, yelling "i'm up, i'm up" but he was actually NOT up, he was on his knees, so he lost. okaaaaaayy........

going back to previous lecture: we had our two groups, each n=5, with bar x1 = 5, bar x2 = 4.5, and variances s^2(1) = 1.828, s^2(2) = 0.984 (this is the mastitis example). we ran the t test and got t8 = 0.67, p>.5, so you would NOT reject the null unless you were a real optimist :) [already i have no idea...]

so last time, just as the hour ended, he made a power calculation: using this standard error, to find a signif. diff at this level you would have needed a diff of 1.47x10^5 cells. to get that value we leave the t distribution and go back to the normal, because it would be harder to calculate power using the t distribution. for power calculations we are obliged ONLY to consider the normal. forget t for power calculations.

now, do you understand how he got the power calculation result from last time? (bar x1 minus bar x2)/.75 = 1.47/.75 = 1.96 (.75 = denominator of t test = standard error of the difference = root(2 x 1.406/5)). put another way: if the exp't resulted in the same variance, what difference would you have needed (instead of the actual 0.5) to get a t value of near two? z = y/.75 = 1.96, and therefore the unknown y = 1.96 x .75 = 1.47, which is about 3x the observed difference.

he then showed something about something that only had 10% power. i have no idea what exactly it was. sorry. if anyone who reads this UNDERSTANDS this stuff, please advise.

so, you would want to design an expt that would have decent power at a difference of .5 or greater... so what values would you need, besides 5 and 4.5? we're going to use continuous curves, and we need a single estimate of variability. if we multiply each estimate of variability by its degrees of freedom (4 each), add them up and divide by 8, you get (4 x 1.828 + 4 x 0.984)/8 = 1.406, so that is the estimate of variability inherent in the model. and now you can proceed.
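a minimal python sketch of the arithmetic above, just to check the numbers. it is not from the lecture: the function and variable names are made up, and it assumes 1.828 and 0.984 are the two sample variances (that is the reading that reproduces t = 0.67).

```python
import math

# values from the mastitis example (units of 10^5 cells)
n = 5                       # cows per group
xbar1, xbar2 = 5.0, 4.5     # group means
var1, var2 = 1.828, 0.984   # ASSUMPTION: these are the two sample variances


def pooled_variance(v1, v2, n1, n2):
    """weight each variance by its degrees of freedom, divide by total df."""
    return ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)


def standard_error(var, n_per_group):
    """standard error of the difference of two means (equal n, common variance)."""
    return math.sqrt(2 * var / n_per_group)


s2 = pooled_variance(var1, var2, n, n)   # single estimate of variability
se = standard_error(s2, n)               # denominator of the t test
t = (xbar1 - xbar2) / se                 # observed t on 8 df
needed_diff = 1.96 * se                  # difference that would give z = 1.96

print(f"pooled variance   = {s2:.3f}")
print(f"standard error    = {se:.3f}")
print(f"t (8 df)          = {t:.2f}")
print(f"needed difference = {needed_diff:.2f}")
```

the needed difference comes out near 1.47, roughly three times the observed 0.5.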
we're gonna make assumptions in order to simplify. the standard error as seen in the handout can be complex, so we'll make assumptions to shorten it. assumption one: n1=n2=n. then the variance of the difference between two means, as a known value, would be sigma squared (one) over n plus sigma squared (two) over n. assumption two: sigma squared one = sigma squared two, so you can rewrite the above to be 2(sigma squared)/n. so we're assuming these equalities. now

sigma(barx1-barx2) = root(2sigma^2/n) = standard error

happily, the person next to me (hi scout) has no idea what he's talking about either.

he's making a side by side depiction of these distributions. both are normal looking curves with an overlap area between them. he says that the drawing is the same thing we just turned in as homework, except it has the null and the alternative side by side, using continuous curves instead of histograms. if the two populations are centered around the same mean (i.e. the null is true), your experimental barx1-barx2 will average out to zero over time. but what if the null is not true, e.g. if the means are NOT equal but are in fact DIFFERENT by, say, 0.5?

H0: mean 1 = mean 2
H1: mean 1 - mean 2 = 0.5

in his diagrammed distribution, there is a point on the graph which is undefined but which is important in some obscure way. he's reiterating that this is the same as the homework...personally, i don't see it.

now, we're going to solve the problem. note it is already solved for n=5. in that case beta is like 90% and 1-beta is like 10% - that's a low power and a huge error. there is a separate sample of this in the handout, totally separate from this example. note the labels for which curve we're using: the null curve is designated alpha and the alternative curve is designated beta.
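a rough python sketch of where the "beta is like 90%" at n=5 comes from, using the normal curve as the notes say. this is not the instructor's code: the function name is made up, and the two-sided alpha = 0.05 (cutoff 1.96) is assumed from the earlier 1.96.

```python
import math
from statistics import NormalDist


def power_two_sample(delta, var, n, alpha=0.05):
    """approximate power of a two-sided z test for a difference of two means.

    assumes equal group sizes n and one common variance var, as in the notes.
    """
    z = NormalDist()                      # standard normal
    se = math.sqrt(2 * var / n)           # standard error of barx1 - barx2
    z_crit = z.inv_cdf(1 - alpha / 2)     # about 1.96 for alpha = 0.05
    shift = delta / se                    # where the H1 curve is centered, in z units
    upper = 1 - z.cdf(z_crit - shift)     # chance of landing past the right cutoff
    lower = z.cdf(-z_crit - shift)        # chance of landing past the left cutoff (tiny)
    return upper + lower


power = power_two_sample(delta=0.5, var=1.406, n=5)
beta = 1 - power
print(f"power = {power:.2f}, beta = {beta:.2f}")
```

with the mastitis numbers this lands at roughly 10% power, i.e. beta around 90%, matching the figures quoted above.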
under the null:

Zalpha = (barX1 - barX2) / root[2sigma^2/n]

if the alternative were true:

Zbeta = [(barX1 - barX2) - (mu1 - mu2)] / root[2sigma^2/n]

note that if the means are equal, the (mu1 - mu2) term goes to zero, and you have the Zalpha equation. z values are chosen. the variances are calculated estimates. we need to find n, the sample size, and the difference between the means. now, what's an easy way to solve for this? well, we really need n; we don't care about the diff between the means, we need to know what size samples we need to use. so you rewrite both equations to solve for barX1-barX2, set them equal to each other (using algebraic substitution), and solve for n:

Zalpha x root[2sigma^2/n] = (mu1 - mu2) + Zbeta x root[2sigma^2/n]
n = 2sigma^2 (Zalpha - Zbeta)^2 / (mu1 - mu2)^2

he's using the "approximating two sample equation". so you have an estimate of sigma squared (1.406), you have a supposed difference you want to find (we're using 0.5 as mu1-mu2), and then you choose z values for alpha and beta. we want beta = 0.2 and 1-beta = .8 (arbitrarily chosen). now look at the H1 panel. the z value that gives us .2 in the left tail of the normal curve is on the left of center for that curve, so it must be negative (about -0.84; Zalpha is 1.96 as before). so then solve for n, n = 2(1.406)(1.96 + 0.84)^2/(0.5)^2, and find that you need 88 subjects PER GROUP. this is a lot more than 5!

how do we bring sample size down? well, we can't shrink the variability, it's inherent in the native variability between cows. maybe you can measure more precisely, bringing it down a little, which WOULD bring down sample size. now, you could mess w/ alpha and beta - could make alpha larger or 1-beta smaller. for the moment, if you keep the curves as drawn - what determines the overlap is the spread of the curves, which is a function of standard error. if you raise n even more, curves shrink, and overlap becomes less, power gets higher, beta gets lower, and (if the cutoff stays put) alpha/2 gets still lower. so only thing left to do is look at the denominator of the n equation. say you can't change variance and don't want to mess w/ alpha or beta.
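the sample size calculation above can be sketched in python like this. it is only a sketch under the notes' assumptions (equal n, common variance, normal curve); the function name and defaults are made up, and it uses exact z values rather than the rounded 1.96 and -0.84.

```python
from statistics import NormalDist


def n_per_group(delta, var, alpha=0.05, power=0.80):
    """approximate subjects per group to detect a difference delta.

    normal approximation with equal group sizes and common variance var.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # right-tail cutoff on the null curve
    z_beta = z.inv_cdf(1 - power)        # left tail of the H1 curve, so negative
    return 2 * var * (z_alpha - z_beta) ** 2 / delta ** 2


n = n_per_group(delta=0.5, var=1.406)
print(f"n per group = {n:.1f}")

# loosening alpha or asking for less power both lower the required n
print(f"alpha = 0.10: {n_per_group(0.5, 1.406, alpha=0.10):.1f}")
print(f"power = 0.70: {n_per_group(0.5, 1.406, power=0.70):.1f}")
```

the first result is a little over 88, which rounds to the 88 per group quoted in lecture (always rounding up would give 89). calling the function with a different delta, alpha, or power shows how each knob moves the answer.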
say you also don't want to raise sample size. then the only thing left to do is redefine what's worth finding, and this is in fact done a lot. smaller differences are harder to find. bigger ones are easier to find. if you raise the difference in the denominator from .5 to 1 or to 2, the whole denominator gets bigger, and sample size n gets smaller.

Q: if i design an expt against .5, am i also covered at larger differences?
A: YES! once you pick the mu1-mu2, you are protected against that difference OR A BIGGER DIFFERENCE. but if the true difference were really smaller, you would not be covered.

---end---