o completely understanding the 2-way
  - via a small example
  - do the beads example
  - see what happens for unequal sample sizes
      follow-ups
      Type I and Type III SS
      MEANS vs. LSMEANS

o small example

                b
             1       2
  a    1    2,4     4,6
       2   12,14    6,8

  all Y's are means below
  Y(11.)=3  Y(12.)=5  Y(21.)=13  Y(22.)=7
  Y(...)=7
  Y(1..)=4  Y(2..)=10  Y(.1.)=8  Y(.2.)=6

  Y(1..)-Y(...) asks how different items in level 1 of A are from all values
      a(1)=-3  a(2)=3     note sum of a's is zero
  Y(.1.)-Y(...) asks how different items in level 1 of B are from all values
      b(1)=1  b(2)=-1     note sum of b's is zero
  Y(11.)-Y(1..)-Y(.1.)+Y(...) = Y(11.) - [(Y(1..)-Y(...)) + (Y(.1.)-Y(...))] - 2Y(...) + Y(...)
                              = Y(11.) - (Y(...) + a(1) + b(1))
  shows how different items in 1,1 are from all values (after A1 effect and B1 effect removed)
      ab(11)= 3 - (7-3+1) = -2    ab(12)= 5 - (7-3-1) =  2    note sum of ab's is zero
      ab(21)=13 - (7+3+1) =  2    ab(22)= 7 - (7+3-1) = -2    note sum of ab's is zero
  Note: the individual effects are not estimable, so this is not the only possible
        solution that minimizes the sum of errors-squared (it is the one with the
        effects summing to zero)
  SSA  = 4*sum of a's-squared  = 72
  SSB  = 4*sum of b's-squared  = 8
  SSAB = 2*sum of ab's-squared = 32
  also SStot = sum of (Y(ijt) - Y(...))-squared = 120

o back to the bead example where we have new replicates in each cell
  - most accepted approach is to first test for interaction
      - view A*B in SAS (or hand calcs); large p-value means no interaction = good
      - then test main effects
      - if interaction is present, ignore SAS printout and do, say, Bonferroni on cell means
  - Caution: do not rerun as a 2-way additive model if you find no interaction

o what are Type I and Type III sums of squares
  - Type I answers this question: what is the effect due to variable 1?
    what is the effect due to variable 1 and variable 2 (so what does variable 2 add)? ... (sequential)
  - Type III answers: what is the effect of variable X when all other variables
    are already taken into account?
  - if variable 1, variable 2, ... are independent then Type I = Type III
      ex: orthogonal, equal sample sizes
  - Note: the last added variable's SS is the same for Type I and Type III
  - note: with unequal sample sizes, the Type III SSA + SSB + SSAB + SSE is not equal to SStot

o how about followups?
  - MEANS (use with equal sample sizes) vs. LSMEANS (always ok)
  - example: see beads with extra (19th) data line
  - let us compute average for large beads by hand
      -1.19 -1.46 -1.82        ---> -1.49
      -.98  -1.82 -.70  -.98   ---> -1.12
      now: (-1.19 + ... + -.98)/7 = -1.28   MEANS   (straight average of all 7 values)
      now: (-1.49 + -1.12)/2      = -1.31   LSMEANS (average of the group averages)

o Usual formulas for followups (see lec8)
  - check SAS commands
  - more complex if sample sizes in cells unequal.....use SAS then
  - look at SAS example for beads, check
  - warning: Bonferroni and Scheffe hold alpha-level for unequal sample sizes.
    Don't know for Tukey and Dunnett

o The theory
  - these formulas and tests are sound and work like for 1-way.
  - recall SS/df is an estimate of sigma2 (the common cell variance).
    Unbiased if all means are the same; has extra stuff if not
  - compare to the honest estimate MSE
  - under assumptions, the ratio is F. If it is big, that is because
    (sigma2 + stuff)/sigma2 is big, i.e. the means are unequal.
  - Example: SSA
      E(SSA) = E(br*sum of (Y(i..)-Y(...))-squared)
      Y(i..) is normal with a mean, and with variance of sigma2/br
      Y(...) serves as estimate of that mean.
      so, when all the A means are equal,
      sum of (Y(i..)-Y(...))-squared/(a-1) has an expectation of sigma2/br, and
      E(SSA) = (a-1)*sigma2
  - our estimates for m+a(i) are from minimizing sum of error-squared...remember,
    Y(...) for m is not unique, not estimable.

/* ch 6 2-way anova model... note additional data */
options ls=72;
DATA beads;
  INPUT sizelet $ sizecat finglet $ finglnth fingcat time;
  htime=log10(tan((time*3.14159/180)/2)-.3);   /* transformed response */
  code=sizecat*10+fingcat;                     /* numeric cell label, e.g. 11 */
  letcode=trim(sizelet)||trim(finglet);        /* letter cell label, e.g. st */
LINES;
s 1 t 64 1 100
s 1 a 72 2 140
.
.
l 3 t 65 1 44
l 3 a 75 2 35
l 3 b 77 3 53
;
proc glm;
  classes sizecat fingcat;
  model htime=sizecat fingcat sizecat*fingcat;
  Means sizecat;
  MEANS sizecat / scheffe CLDIFF ALPHA=0.10;
/* means only ok with equal ss */
/* ch 6 2-way anova */

OBS  SIZELET  SIZECAT  FINGLET  FINGLNTH  FINGCAT  TIME     HTIME  CODE  LETCODE     YPRED         Z    NSCORE
  1  s        1        t        64        1         100  -0.04976    11  st        0.04782  -0.30454  -0.20750
  2  s        1        a        72        2         140   0.38872    12  sa        0.31031   0.24471   0.06873
  3  s        1        b        76        3          93  -0.12276    13  sb       -0.26024   0.42906   0.66375
  4  m        2        t        70        1          85  -0.21019    21  mt       -0.50891   0.93230   1.06324
  5  m        2        a        71        2          45  -0.94228    22  ma       -0.32064  -1.94009  -1.82175
  6  m        2        b        79        3         105   0.00140    23  mb       -0.09877   0.31260   0.35041
  7  l        3        t        70        1          40  -1.19402    31  lt       -1.08844  -0.32952  -0.50090
  8  l        3        a        75        2          37  -1.46099    32  la       -1.63817   0.55297   0.84652
  9  l        3        b        77        3          35  -1.81535    33  lb       -1.25871  -1.73725  -1.34668
 10  s        1        t        63        1         119   0.14540    11  st        0.04782   0.30454   0.20750
 11  s        1        a        72        2         127   0.23190    12  sa        0.31031  -0.24471  -0.06873
 12  s        1        b        79        3          70  -0.39772    13  sb       -0.26024  -0.42906  -0.66375
 13  m        2        t        69        1          49  -0.80764    21  mt       -0.50891  -0.93230  -1.06324
 14  m        2        a        71        2         133   0.30099    22  ma       -0.32064   1.94009   1.82175
 15  m        2        b        80        3          86  -0.19893    23  mb       -0.09877  -0.31260  -0.35041
 16  l        3        t        65        1          44  -0.98286    31  lt       -1.08844   0.32952   0.50090
 17  l        3        a        75        2          35  -1.81535    32  la       -1.63817  -0.55297  -0.84652
 18  l        3        b        77        3          53  -0.70206    33  lb       -1.25871   1.73725   1.34668

General Linear Models Procedure

Dependent Variable: HTIME
                                   Sum of          Mean
Source             DF             Squares        Square   F Value   Pr > F
Model               8           6.8273636     0.8534204      4.40   0.0201
Error               9           1.7453401     0.1939267
Corrected Total    17           8.5727037

R-Square        C.V.     Root MSE    HTIME Mean
0.796407   -82.29960       0.4404       -0.5351

Class      Levels    Values
SIZECAT         3    1 2 3
FINGCAT         3    1 2 3

Number of observations in data set = 18

Dependent Variable: HTIME

Source             DF           Type I SS   Mean Square   F Value   Pr > F
SIZECAT             2           6.0157662     3.0078831     15.51   0.0012
FINGCAT             2           0.0034205     0.0017103      0.01   0.9912
SIZECAT*FINGCAT     4           0.8081769     0.2020442      1.04   0.4374

Source             DF         Type III SS   Mean Square   F Value   Pr > F
SIZECAT             2           6.0157662     3.0078831     15.51   0.0012
FINGCAT             2           0.0034205     0.0017103      0.01   0.9912
SIZECAT*FINGCAT     4           0.8081769     0.2020442      1.04   0.4374

Level of         ------------HTIME------------
SIZECAT     N           Mean              SD
1           6     0.03263161      0.28119500
2           6    -0.30944128      0.47749510
3           6    -1.32843893      0.45201251

/* ch 6 2-way anova model... note additional data */
/* add in one extra 33 let them see T1 and T3 ss */
options ls=72;
DATA beads;
  INPUT sizelet $ sizecat finglet $ finglnth fingcat time;
  htime=log10(tan((time*3.14159/180)/2)-.3);
  code=sizecat*10+fingcat;
  letcode=trim(sizelet)||trim(finglet);
LINES;
s 1 t 64 1 100
s 1 a 72 2 140
s 1 b 76 3 93
m 2 t 70 1 85
m 2 a 71 2 45
.
.
l 3 b 77 3 53
l 3 b 78 3 49
;
proc glm;
  classes sizecat fingcat;
  model htime=sizecat fingcat sizecat*fingcat;
  means sizecat;   /* do not use means on unequal sample sizes -- this is for illustration only */
  lsMeans sizecat;
  lsmeans sizecat fingcat/pdiff=all cl adjust=tukey alpha=.10;
/* add in one extra 33 let them see T1 and T3 ss */
/* means only ok with equal ss */
/* ch 6 2-way anova */

General Linear Models Procedure
Class Level Information

Class      Levels    Values
SIZECAT         3    1 2 3
FINGCAT         3    1 2 3

Number of observations in data set = 19

Dependent Variable: HTIME
                                   Sum of          Mean
Source             DF             Squares        Square   F Value   Pr > F
Model               8           6.7620997     0.8452625      4.49   0.0151
Error              10           1.8809812     0.1880981
Corrected Total    18           8.6430809

R-Square        C.V.     Root MSE    HTIME Mean
0.782371   -78.93716       0.4337       -0.5494

Dependent Variable: HTIME

Source             DF           Type I SS   Mean Square   F Value   Pr > F
SIZECAT             2           5.8536587     2.9268293     15.56   0.0009
FINGCAT             2           0.0202374     0.0101187      0.05   0.9479
SIZECAT*FINGCAT     4           0.8882036     0.2220509      1.18   0.3765

Source             DF         Type III SS   Mean Square   F Value   Pr > F
SIZECAT             2           5.9644815     2.9822408     15.85   0.0008
FINGCAT             2           0.0115865     0.0057932      0.03   0.9698
SIZECAT*FINGCAT     4           0.8882036     0.2220509      1.18   0.3765

Level of         ------------HTIME------------
SIZECAT     N           Mean              SD
1           6     0.03263161      0.28119500
2           6    -0.30944128      0.47749510
3           7    -1.25403899      0.45717634

Least Squares Means

SIZECAT          HTIME
                LSMEAN
1           0.03263161
2          -0.30944128
3          -1.27832036
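
o checking the hand calcs outside SAS (a sketch only)
  - The small 2x2 example and the MEANS vs. LSMEANS values for SIZECAT 3 in the
    19-observation run can be reproduced by hand. The Python below is a minimal
    sketch, not part of the original SAS programs. The names balanced_two_way,
    htime, small and large are made up for illustration. It assumes the
    sum-to-zero solution used in the small example, and it assumes the large-bead
    times are the six from the 18-line data set plus the added 19th line
    (l 3 b 78 3 49).

```python
import math

# Small 2x2 example: cells[(i, j)] holds the replicates for level i of A
# and level j of B.
small = {(1, 1): [2, 4], (1, 2): [4, 6], (2, 1): [12, 14], (2, 2): [6, 8]}


def mean(values):
    return sum(values) / len(values)


def balanced_two_way(cells):
    """Sum-to-zero effects and sums of squares for a BALANCED two-way layout.

    Hypothetical helper for illustration; it is only valid when every cell
    has the same number of replicates.
    """
    a_levels = sorted({i for i, _ in cells})
    b_levels = sorted({j for _, j in cells})
    r = len(next(iter(cells.values())))            # replicates per cell
    grand = mean([y for ys in cells.values() for y in ys])
    cell = {k: mean(v) for k, v in cells.items()}
    a = {i: mean([cell[i, j] for j in b_levels]) - grand for i in a_levels}
    b = {j: mean([cell[i, j] for i in a_levels]) - grand for j in b_levels}
    ab = {(i, j): cell[i, j] - (grand + a[i] + b[j]) for (i, j) in cells}
    ss = {
        "A": len(b_levels) * r * sum(v ** 2 for v in a.values()),
        "B": len(a_levels) * r * sum(v ** 2 for v in b.values()),
        "AB": r * sum(v ** 2 for v in ab.values()),
        "E": sum((y - cell[k]) ** 2 for k, ys in cells.items() for y in ys),
        "tot": sum((y - grand) ** 2 for ys in cells.values() for y in ys),
    }
    return a, b, ab, ss


def htime(time):
    """Same transform as the DATA step (including its 3.14159 for pi)."""
    return math.log10(math.tan((time * 3.14159 / 180) / 2) - 0.3)


# Large beads (sizecat = 3): TIME values by finger category, assumed to be
# the 18-line data plus the extra 19th line (fingcat 3, time 49).
large = {1: [40, 44], 2: [37, 35], 3: [35, 53, 49]}
large_h = {j: [htime(t) for t in times] for j, times in large.items()}

# MEANS: straight average of all 7 observations (weights cells by their n).
raw_mean = mean([h for hs in large_h.values() for h in hs])
# LSMEANS: unweighted average of the three cell means.
ls_mean = mean([mean(hs) for hs in large_h.values()])

if __name__ == "__main__":
    a, b, ab, ss = balanced_two_way(small)
    print("a:", a, " b:", b)
    print("ab:", ab)
    print("SS:", ss)
    print("MEANS   for size 3: %.5f" % raw_mean)
    print("LSMEANS for size 3: %.5f" % ls_mean)
```

  - For the small example this should give SSA=72, SSB=8, SSAB=32 and SStot=120,
    as in the notes. For the large beads the two averages should agree with the
    SAS values above (about -1.254 for MEANS and about -1.278 for LSMEANS). The
    two differ only because the third cell has 3 observations instead of 2.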