---start biostat 1.5.97---

[it's the same old same old. feh. *sigh* and some luser drove really fast through a big puddle right next to me this morning so my legs are all wet. blargh.]

[hmm. no one is here....this is strange. did everyone melt out there?]

HANDOUT 5

biggest of the handouts. we are advised to read it.

instructor says he wants to go over again the last thing we did - the sampling theory thing.

in the election example, when all the votes are finally counted, the ACTUAL piYES and piNO will be figured out. BUT the night before, after polls close, a sample of n = 250 is drawn, and p.yes is estimated to be 33%, and p.no to be 67%.

the confidence interval says: take the observed p.yes, then add and subtract a z value you pick times the estimated std error of p.yes (the estimated std deviation divided by the sq root of n):

p.yes +/- z(std.dev/root n), where std.dev = root of [p.yes(1 - p.yes)]

so +/- 1.96 x root(.222/250)
so +/- about .058

so if you produce the confidence interval it turns out that P[.275 ≤ piYES ≤ .392] is about 95%. loosely, we're 95% sure the REAL pi is somewhere in between .275 and .392. more precisely, 95% of all the samples we could have drawn would give intervals that trap piYES, and 5% would not.

now, this confidence interval falls FAR below the number you'd need to call this a yes vote. you'd need to see an interval including values OVER 0.50 - and that didn't happen here. and this is basically what they do the night of the election. if the interval excludes .5 on the down side, it's a loss. if it excludes .5 on the HIGH side, it's a win.

to make this more precise, you make the sample size larger - that makes you add and subtract a smaller number, and gives you a tighter interval. in this example, we've been precise enough. but if sample results were more like .45 or .47, you'd want to make the interval considerably narrower - so you'd use a bigger sample size.
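a minimal sketch of the interval arithmetic above, in Python. it assumes the 33% is really 1/3 (that's the value that reproduces the .275 and .392 endpoints) and uses the usual normal-approximation interval. the function name proportion_ci is made up for illustration, it isn't from the handout.

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    # estimated standard error of the sample proportion
    std_err = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * std_err
    return p_hat - margin, p_hat + margin

# election example from the notes: n = 250, p.yes taken as 1/3 (assumption)
low, high = proportion_ci(1 / 3, 250)
print(round(low, 3), round(high, 3))

# if the whole interval sits below .5, the vote can be called a loss
calls_it_a_loss = high < 0.5
```

bumping n up (say to 2500) with the same p.yes shrinks the margin, which is the "tighter interval" point.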
and this is why sometimes on election nights, when the %s are really close to each other, they don't call the election early in the evening: their intervals still include 0.50. but later on, as sample size increases, they can exclude .5 and call the election.

re: notation. p = estimated pi. could also write pi with ^ as a hat over it.

now, you could do this differently. you could make null and alternative hypotheses. given n = 250:

Ho: piYES ≥ 0.51
Hi: piYES < 0.51

one of these states of nature will be correct. so you make a plot of Ho - a bell curve.

Z = (p.yes - piYES) / (std.dev of pi / root n)

std dev of pi = root of [pi(1-pi)]

when you hypothesize, you always use the std deviation of the proportion that's been hypothesized.

now, 0.33 was our observed value and n = 250, so....

Z = (.33 - .51) / root of [(.51 x .49)/250]

so Z = about -5.7 (ie, this is way out in the lower tail of the normal curve, indicating a very LOW probability)

p < 1 x 10^-7

ie, the chances are slim that this could be a win, when 33% of your sample is all that's voting yes. you would reject the null here.

so, again...what's the chance of observing 1/3 of folks voting yes in the sample, when in fact more than .5 voted yes? well, it's very SMALL, as we've just shown. if that probability is so small, it shouldn't be surprising that a confidence interval with such a high probability (95%) excluded 0.5 - this is another way of showing that those results are unlikely.

we are now starting handout 5. it may take several hours to get through it, and there is a good chance that our second takehome question will be closely related to this handout. we are not going through the handout in the exact order in which things appear. we will use examples from the handout and from elsewhere.

handout 5 starts out w/summary and philosophy: so far we've discussed hypothesis testing and confidence intervals, and there are two general categories: randomized assignment expt's and random sampling theory.
in the former, you control the study: you assign people/animals at random to be in categories, give them drugs, whatever. in the latter, you don't have that control. you don't assign anyone to anything, you just sample things as they exist in nature.

in either case, we have relied on this assumption, either about the population being sampled or about the one created by randomization:

Xi ~ N(mu, sigma^2) (Xi is distributed as normal w/mean mu and variance sigma^2)
or
bar.X ~ N(mu, sigma^2/n)

see front page of handout. if this is not true, then you can't do this. when sampling or doing a randomized expt, if the original observations are not normal, AND the sample size is so small that you can't assume the sampling distribution is normal, then you cannot use Z or t, because they assume one or the other is true. that's the grounds for the assumptions in this model.

first example is from handout "hypothesis testing", handout 5. the procedure we use is as follows: a population Pop is under consideration. avg is mu, std dev is sigma. we draw a sample of size n from the large Pop. we assume a null:

Ho: mu(actual) = mu(hypothesized)
Hi: mu(actual) ≠ mu(hypothesized)

Z = (bar.X - mu(hyp)) / (sigma/root n)
or
t = (bar.X - mu(hyp)) / (s/root n)

for samples that are very small, stick with t, but if large, use z. they are very close, and there isn't a real reason to be concerned w/large samples. this is a way to test theories based on metrics instead of proportions.

standard deviation of a metric:

s(n-1) = root of { [sum from i = 1 to n of (Xi - bar.X)^2] / (n-1) }

the handout has elaborate explanations, which is good, because this isn't really making any sense.

the example starts in the middle of p 2. in a large herd of cows in central PA, the level of mastitis, measured as avg # cells/mL of milk, has for many years been 3.8 x 10^5 cells/mL. health dept says that's too high, tells farmer to install a new management program.
they do this for a while. this is a big herd, w/thousands and thousands of cows. a year later, someone says "have we helped by bringing this down?" so they sample 400 cows. calculations on bottom of p 2 (everything in units of 10^5 cells/mL):

barX (observed average) = 2.5 x 10^5
s^2(n-1) (observed variance) = 81
s(n-1) (observed std deviation) = 9.0

SO, we can now do a t test. one might mention though - does it matter if you use t or z when you have 399 degrees of freedom? no. but we'll do it anyway..

t(399) = (2.5 - 3.8) / .45 = -2.89

see top of p 3. [yikes. my back hurts :(] note figs 5.1 and 5.2 and 5.3 in handout.

the t distribution is telling us that the p level is less than .01. this means that the chance of observing a sample avg of 2.5 x 10^5 cells/mL, when in fact the population was still centered at the OLD avg (3.8), is very small. so it is very LIKELY that we've lowered the cell count in the milk. see, the t test compared the new value to the old value and told us that a result like this would be very unlikely if the milk still had the old high number of cells in it. it seems as if the health measures have helped. now, a sample like this would still turn up less than 1% of the time even if nothing had changed, but...

---break---

backing up for a minute: what does the -2.89 mean? well, our hypothesized null stated that there was NO CHANGE due to the new health measures. out of our 80,000 cow herd, we tested 400. if there had truly been NO CHANGE, our average would still be centered at 3.8 x 10^5 cells. well, our average seems to be at 2.5 x 10^5 cells, not at 3.8. we calculate the standard error to be .45 (observed variance divided by n, take sq root of the whole thing: root of 81/400, same as 9.0/root 400). then we calculate a t value and we find that the chance of seeing our new avg, if the herd were still at the old avg, is VERY SMALL.

only way to be SURE is to sample all 80,000 cows (hmm. what if you sample 79,999 cows? isn't that statistical "proof"? where do you draw the line?)
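a minimal sketch of the cow arithmetic in Python, just to check the standard error and the t value. numbers are the ones from the notes, in units of 10^5 cells/mL. the function name one_sample_t is made up for illustration, and it only computes the statistic (no p value, since that would need a t table or a stats library).

```python
import math

def one_sample_t(sample_mean, hyp_mean, sample_var, n):
    """t statistic for one sample against a hypothesized mean."""
    # standard error of the mean: sqrt(s^2 / n), same as s / sqrt(n)
    std_err = math.sqrt(sample_var / n)
    t = (sample_mean - hyp_mean) / std_err
    return t, std_err, n - 1  # statistic, std error, degrees of freedom

# mastitis example: barX = 2.5, old avg = 3.8, s^2 = 81, n = 400
t, std_err, df = one_sample_t(2.5, 3.8, 81, 400)
print(round(std_err, 2), round(t, 2), df)
```

the t comes out near -2.9 on 399 degrees of freedom, which is where the "p less than .01" statement comes from.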
now...if you do all these funky calculations, which you can review in the handout, you get:

1.62 ≤ mu(actual) ≤ 3.38 has a probability of 95%.

now, the OLD value was 3.8, which is excluded from this confidence interval. so you know with 95% confidence that your new value is in a lower range than the old value. now, if you need more precision, ie a narrower range, you have to raise sample size - but 400 cows is a lot of cows.

this has been the "one sample" example, brought to you by bessie the milk cow, the letter "m" and the number 80,000.

we're skipping p 4 to come back to it later. we're moving on to the so called "two sample" example. this is just what it says it is. it concerns hypothetical and practical situations where you're drawing TWO separate samples, not comparing one sample to a constant. the two sample example is the one usually encountered in real research. the single sample is more or less a rarity. usually people are testing two or more things against each other. more than two gets complex, ignore that for now.

two things against each other can happen in 2 ways...

one - draw a sample from group a, and from group b, and make a comparison. these are two physically different populations.

other - under randomized assignment, assign subjects into two groups and see if responses to treatment are the same.

they're a bit different by design, but mathematically identical.

TWO GROUPS

Ho: mu1 = mu2
Hi: mu1 ≠ mu2

if the null is true, average responses in both groups are equal. in the alternative, they aren't equal. so....the usual suspects are involved...a sample from pop1 and a sample from pop2, and we calculate bar.X1 and s^2(n1-1) and bar.X2 and s^2(n2-1):

n1/pop1 -----> bar.X1 and s^2(n1-1)
n2/pop2 -----> bar.X2 and s^2(n2-1)

t(df) = Z = [(bar.X1 - bar.X2) - (mu1 - mu2)] / s(bar.x1-bar.x2)

the denominator, s(bar.x1-bar.x2), takes on two forms.
if n1 = n2,

s(bar.x1-bar.x2) = the sq root of [(sq of std error for group one) + (sq of std error for group two)]

when they come from two different populations as in case a, they are obviously totally independent, so the std error of the difference is merely the sum of the two std errors sqd, sq root of.

on occasion the sample sizes will differ, and the equation appears to change substantially. this is in the handout.

s(bar.x1-bar.x2) = square root of: { [ ((n1-1)s1^2 + (n2-1)s2^2) / (n1 + n2 - 2) ] x [ (1/n1) + (1/n2) ] }

now. t or z notwithstanding...if the null is true, and the two separate populations truly have equal averages, if mu1 = mu2, and you check this out a gazillion times, each time drawing a sample and taking the difference between the sample averages, those differences should average out to zero. the alternative is that it is NOT zero...that it is something else. so we go in, draw the samples, observe the difference. if we get a difference that's big compared to its std error, it seems the null isn't true....

this example is not from the notes. this is one from his personal notes. there is an example in the notes, but this one is different.

note: we make these assumptions:

1. for small samples, sigma1^2 = sigma2^2 and there must be joint normality of the samples, in order for the t distribution to be accurate. these are unlikely conditions. but fear not.

2. for larger samples, you don't need equal variances or joint normality. studies indicate that for reasonably large samples, even w/o equal variances, the t thing makes a good approximation. and the joint normality is very likely because the averages from large samples tend to approach the normal anyway. this isn't a concrete rule, of course, but even w/worst case scenario, instead of 30 degrees of freedom you'd need 50 or 60. big deal.

but with small samples and non-normality you can't rely on these procedures, is the point here. however, the t test is invoked over and over again even w/small samples without any concern for these conditions, and in most cases it is valid.
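backing up to the two forms of the std error of the difference given above: a minimal Python sketch that computes both and checks they agree when n1 = n2. the variances and sample sizes here are made up just for the comparison, and the function names are hypothetical, not from the handout.

```python
import math

def se_diff_equal_n(var1, var2, n):
    """Std error of (bar.x1 - bar.x2) when both samples have size n."""
    # sum of the two squared standard errors, then square root
    return math.sqrt(var1 / n + var2 / n)

def se_diff_pooled(var1, n1, var2, n2):
    """Pooled-variance std error, for possibly unequal sample sizes."""
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# made-up variances (4.0 and 9.0), only to compare the two forms
same_n_simple = se_diff_equal_n(4.0, 9.0, 10)
same_n_pooled = se_diff_pooled(4.0, 10, 9.0, 10)
unequal_n = se_diff_pooled(4.0, 10, 9.0, 25)
print(same_n_simple, same_n_pooled, unequal_n)
```

with equal sample sizes the pooled form collapses to the simple one, so the equation only "appears" to change.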
now, if your groups are very small, like 12 and 8, maybe you should be concerned. but over 20, minimal concern, over 30 REALLY minimal concern, and over 60 don't even let worry cross your mind.

back to mastitis. herd1 and herd2 (in the notes, it's on dogs w/periodontal dz). two different mgmt procedures are being used, and farmers wanna know which is better.

Ho: mu1 = mu2
Hi: mu1 ≠ mu2

so we draw samples:

n1 = 5 = n2
bar.x1 = 5 x 10^5; bar.x2 = 4.5 x 10^5
s^2(n1-1) = 1.828, s^2(n2-1) = .984 (variances)

so,

t8 = (5.0 - 4.5) / root of [(1.828/5) + (.984/5)]
t8 = .5/.75 = 0.67

(note: it's t8 because 8 deg of freedom...(5-1) + (5-1) is 8...)

p approximates 0.52 (tt - two tails). so a t value of 0.67 is fairly close to center...each tail has probability of about 0.26, so joint prob is 52%. so here we could not reject the null. the data are consistent with the null...but you couldn't say "it's definitely true".

what if someone suggests that .5 x 10^5 is clinically meaningful, and that we should try for more accuracy? we could have missed this. we observed a clinically meaningful difference but couldn't reject the null..but we used REALLY small sample sizes. we should try again with bigger samples. if we still observe a difference of .5 with a bigger sample, the std error in the denominator goes from dividing by those 5s to dividing by bigger numbers, so denom gets smaller (see formula above), so t value gets BIGGER, and the p value gets SMALLER

---end---