Plsc 734
Spring 2005
Final Exam
1) (25 pts)
a) From your area of interest, provide a detailed example of the use of confounding with a 3x3x3 factorial. I am interested in estimating all effects. Indicate what the levels of each factor would be (3 varieties or whatever). Detail the layout, randomization, and AOV. I want to see what treatment combinations are associated with each experimental unit for at least 1 replication that does not just confound a main effect.
b) For each of the following four layouts, indicate what effects, if any, have been confounded with ranges.
1)
Range
0 000 210 020 230 301 111 321 131 202 012 222 032 103 313 123 333
1 100 310 120 330 001 211 021 231 302 112 322 132 203 013 223 033
2 200 010 220 030 101 311 121 331 002 212 022 232 303 113 323 133
3 300 110 320 130 201 011 221 031 102 312 122 332 003 213 023 233
2)
Range
0 000 210 020 230 001 211 021 231 002 212 022 232 003 213 023 233
1 100 310 120 330 101 311 121 331 102 312 122 332 103 313 123 333
2 200 010 220 030 201 011 221 031 202 012 222 032 203 013 223 033
3 300 110 320 130 301 111 321 131 302 112 322 132 303 113 323 133
3)
Range
0 000 100 200 300 021 121 221 321 002 102 202 302 023 123 223 323
1 010 110 210 310 031 131 231 331 012 112 212 312 033 133 233 333
2 020 120 220 320 001 101 201 301 022 122 222 322 003 103 203 303
3 030 130 230 330 011 111 211 311 032 132 232 332 013 113 213 313
4)
Range
0 000 100 200 300 021 121 221 321 032 132 232 332 013 113 213 313
1 010 110 210 310 031 131 231 331 002 102 202 302 023 123 223 323
2 020 120 220 320 001 101 201 301 012 112 212 312 033 133 233 333
3 030 130 230 330 011 111 211 311 022 122 222 322 003 103 203 303
2) (50 pts) Given the attached SAS statements and SAS output, and assuming that I am willing to test at the 10% probability level:
a) Fill in the missing df, MS and F statistics.
b) Can one regression of yield on myield be used for all data? Explain!
c) Do all CIs respond the same to myield? Explain! Which ones might be different from, say, 1.0 or each other or ????
d) Show how to present the information in a report, and provide a VERY brief statement to include.
3) (25 pts) The following information was sent recently to a couple of faculty members in plant science:
Good morning, gentlemen. Have a quick question for you. Am assisting in the analysis of some data for Dr. Carr here in ... thanks and look forward to any and all response. Sincerely, Cp
Proc GLM data=work;
classes year location rep variety;
model bua12 tw12 seedlb prot12 = year location year*location rep(year*location) variety variety*location;
test h=location e=year*location;
test h=year e=rep(year*location);
test h=year*location e=rep(year*location);
lsmeans location/stderr pdiff e=year*location;
lsmeans year*location/stderr pdiff e=rep(year*location);
lsmeans variety variety*location/stderr pdiff;
run;
a) Key out the sources, df and F tests as proposed by Cp.
b) Key out the sources, df and F tests you would recommend.
c) Suggest what might be wrong in the assumptions made by Cp in setting up his analysis (i.e., why your assumptions?).
d) What would you suggest to improve future work in this area, or to handle the real set of data with different varieties in each location/year?
PROC IMPORT OUT= WORK.one
DATAFILE= "E:\urn2004\SD.xls"
DBMS=EXCEL2000 REPLACE;
GETNAMES=YES;
PROC SORT;BY LOC DATE;
DATA THREE;SET ONE;IF CI="FP1095" OR CI="FP1096" OR CI="FP1097" OR CI="FP1098" OR CI="FP1099" THEN DELETE;
PROC MEANS NOPRINT;BY LOC DATE;VAR YIELD;OUTPUT OUT=REG MEAN=MYIELD;
DATA TWO;MERGE THREE REG;BY LOC DATE;
PROC GLM DATA=TWO;MODEL YIELD=MYIELD;
PROC GLM;CLASS CI;MODEL YIELD=CI MYIELD CI*MYIELD/SOLUTION;
PROC GLM;CLASS CI;MODEL YIELD=CI MYIELD(CI)/SOLUTION;
RUN;
SAS output:
The GLM Procedure
Dependent Variable: YIELD YIELD

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                    82694475.19                                 <.0001
Error              376   14114729.42       37539.17
Corrected Total          96809204.61

R-Square   Coeff Var   Root MSE   YIELD Mean
0.854201   11.03201    193.7503   1756.256

Source       DF    Type I SS      Mean Square    F Value    Pr > F
MYIELD             82694475.19                              <.0001

Source       DF    Type III SS    Mean Square    F Value    Pr > F
MYIELD             82694475.19                              <.0001

Parameter Estimate Standard Error t Value Pr > |t|
Intercept -0.000000000 38.72325404 -0.00 1.0000
MYIELD 1.000000000 0.02130611 46.93 <.0001
The GLM Procedure
Class Level Information
Class   Levels   Values
CI      40       CI 389 CI2522 CI2921 CI3096 CI3259 CI3270 CI3296 CI3297 CI3318 CI3327
                 CI3332 CI3353 CI3358 CI3397 CI3399 CI3404 CI3411 CI3423 CI3424 CI3425
                 FP1094 FP2024 FP2044 FP2102 FP2107 FP2112 FP2114 FP2118 FP2119 N0010
                 N2007 N2010 N2010B N2010Y N2014 N305 N320 N323 N325 N9719
Number of observations 378
The GLM Procedure
Dependent Variable: YIELD YIELD

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                    87129042.89                                 <.0001
Error                    9680161.72
Corrected Total    377   96809204.61

R-Square   Coeff Var   Root MSE   YIELD Mean
0.900008   10.26231    180.2325   1756.256

Source       DF    Type I SS      Mean Square    F Value    Pr > F
CI                 4667529.59                               <.0001
MYIELD             80721450.58                              <.0001
MYIELD*CI          1740062.72                               0.0762

Source       DF    Type III SS    Mean Square    F Value    Pr > F
CI                 1630283.41                               0.1270
MYIELD             70985188.11                              <.0001
MYIELD*CI          1740062.72                               0.0762
Parameter Estimate Standard Error t Value Pr > |t|
Intercept 247.6285437 B 202.5515626 1.22 0.2225
CI CI 389 148.5076076 B 285.5318421 0.52 0.6034
CI CI2522 -429.5526375 B 285.5318421 -1.50 0.1335
CI CI2921 -685.2639745 B 285.5318421 -2.40 0.0170
CI CI3096 -283.5539956 B 337.4755132 -0.84 0.4015
CI CI3259 -292.2969653 B 296.4311958 -0.99 0.3249
CI CI3270 -62.3443535 B 429.1387825 -0.15 0.8846
CI CI3296 -292.8125046 B 429.1387825 -0.68 0.4956
CI CI3297 -337.3309699 B 295.7373148 -1.14 0.2549
CI CI3318 -304.6532979 B 429.1387825 -0.71 0.4783
CI CI3327 -149.5784968 B 286.4511670 -0.52 0.6019
CI CI3332 12.7546216 B 429.1387825 0.03 0.9763
CI CI3353 -237.8931342 B 296.4311958 -0.80 0.4229
CI CI3358 -45.7840100 B 296.4311958 -0.15 0.8774
CI CI3397 -435.2179895 B 286.4511670 -1.52 0.1297
CI CI3399 -881.4329535 B 429.1387825 -2.05 0.0409
CI CI3404 -444.0878790 B 296.4311958 -1.50 0.1352
CI CI3411 -429.6075402 B 286.4511670 -1.50 0.1347
CI CI3423 -202.8770546 B 296.4311958 -0.68 0.4943
CI CI3424 -81.0557309 B 286.4511670 -0.28 0.7774
CI CI3425 -504.8586893 B 286.4511670 -1.76 0.0790
CI FP1094 -231.8712486 B 347.5157316 -0.67 0.5051
CI FP2024 422.7024226 B 295.7373148 1.43 0.1540
CI FP2044 -170.6113760 B 286.4511670 -0.60 0.5519
CI FP2102 -230.1674082 B 359.1959950 -0.64 0.5222
CI FP2107 -381.4768045 B 359.1959950 -1.06 0.2891
CI FP2112 -109.2959995 B 285.5318421 -0.38 0.7022
CI FP2114 -586.3033210 B 285.5318421 -2.05 0.0409
CI FP2118 -534.7015999 B 285.5318421 -1.87 0.0621
CI FP2119 -236.5543681 B 285.5318421 -0.83 0.4081
CI N0010 -61.5452902 B 286.4511670 -0.21 0.8300
CI N2007 -289.5237181 B 317.8808187 -0.91 0.3631
CI N2010 -169.5133808 B 317.8808187 -0.53 0.5943
CI N2010B -757.4554914 B 337.4755132 -2.24 0.0255
CI N2010Y -839.6474827 B 337.4755132 -2.49 0.0134
CI N2014 -218.6600408 B 317.8808187 -0.69 0.4921
CI N305 163.3258906 B 317.8808187 0.51 0.6078
CI N320 72.8544734 B 317.8808187 0.23 0.8189
CI N323 -301.6254559 B 317.8808187 -0.95 0.3435
CI N325 -119.0291690 B 317.8808187 -0.37 0.7083
CI N9719 0.0000000 B . . .
MYIELD 0.8615456 B 0.1115983 7.72 <.0001
MYIELD*CI CI 389 -0.1825083 B 0.1562952 -1.17 0.2439
MYIELD*CI CI2522 0.2131881 B 0.1562952 1.36 0.1736
MYIELD*CI CI2921 0.4120246 B 0.1562952 2.64 0.0088
MYIELD*CI CI3096 0.1997460 B 0.1976688 1.01 0.3131
MYIELD*CI CI3259 0.0657280 B 0.1613117 0.41 0.6840
MYIELD*CI CI3270 0.1048935 B 0.2144361 0.49 0.6251
MYIELD*CI CI3296 0.1522544 B 0.2144361 0.71 0.4782
MYIELD*CI CI3297 0.1758160 B 0.1599726 1.10 0.2726
MYIELD*CI CI3318 0.1697239 B 0.2144361 0.79 0.4293
MYIELD*CI CI3327 0.0482491 B 0.1578238 0.31 0.7600
MYIELD*CI CI3332 -0.0545162 B 0.2144361 -0.25 0.7995
MYIELD*CI CI3353 0.1956562 B 0.1613117 1.21 0.2261
MYIELD*CI CI3358 0.0834474 B 0.1613117 0.52 0.6053
MYIELD*CI CI3397 0.2884224 B 0.1578238 1.83 0.0686
MYIELD*CI CI3399 0.4382993 B 0.2144361 2.04 0.0418
MYIELD*CI CI3404 0.2087015 B 0.1613117 1.29 0.1967
MYIELD*CI CI3411 0.2698343 B 0.1578238 1.71 0.0884
MYIELD*CI CI3423 0.1541363 B 0.1613117 0.96 0.3401
MYIELD*CI CI3424 0.1066430 B 0.1578238 0.68 0.4997
MYIELD*CI CI3425 0.2950127 B 0.1578238 1.87 0.0626
MYIELD*CI FP1094 0.1431050 B 0.1871151 0.76 0.4450
MYIELD*CI FP2024 -0.1615169 B 0.1599726 -1.01 0.3135
MYIELD*CI FP2044 0.0613072 B 0.1578238 0.39 0.6980
MYIELD*CI FP2102 0.0712855 B 0.2042852 0.35 0.7274
MYIELD*CI FP2107 0.2230787 B 0.2042852 1.09 0.2757
MYIELD*CI FP2112 0.0887166 B 0.1562952 0.57 0.5707
MYIELD*CI FP2114 0.2517335 B 0.1562952 1.61 0.1083
MYIELD*CI FP2118 0.2340180 B 0.1562952 1.50 0.1354
MYIELD*CI FP2119 0.1132422 B 0.1562952 0.72 0.4693
MYIELD*CI N0010 0.0601058 B 0.1578238 0.38 0.7036
MYIELD*CI N2007 0.1220413 B 0.1803941 0.68 0.4992
MYIELD*CI N2010 0.0603191 B 0.1803941 0.33 0.7383
MYIELD*CI N2010B 0.4893616 B 0.1976688 2.48 0.0139
MYIELD*CI N2010Y 0.5058902 B 0.1976688 2.56 0.0110
MYIELD*CI N2014 0.0864697 B 0.1803941 0.48 0.6320
MYIELD*CI N305 -0.1092757 B 0.1803941 -0.61 0.5451
MYIELD*CI N320 -0.0177345 B 0.1803941 -0.10 0.9218
MYIELD*CI N323 0.2584353 B 0.1803941 1.43 0.1530
MYIELD*CI N325 0.1099526 B 0.1803941 0.61 0.5426
MYIELD*CI N9719 0.0000000 B . . .
NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.
The GLM Procedure
Class Level Information
Class   Levels   Values
CI      40       CI 389 CI2522 CI2921 CI3096 CI3259 CI3270 CI3296 CI3297 CI3318 CI3327
                 CI3332 CI3353 CI3358 CI3397 CI3399 CI3404 CI3411 CI3423 CI3424 CI3425
                 FP1094 FP2024 FP2044 FP2102 FP2107 FP2112 FP2114 FP2118 FP2119 N0010
                 N2007 N2010 N2010B N2010Y N2014 N305 N320 N323 N325 N9719
The GLM Procedure
Dependent Variable: YIELD YIELD

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                    87129042.89                                 <.0001
Error                    9680161.72
Corrected Total          96809204.61

R-Square   Coeff Var   Root MSE   YIELD Mean
0.900008   10.26231    180.2325   1756.256

Source       DF    Type I SS      Mean Square    F Value    Pr > F
CI                 4667529.59                               <.0001
MYIELD(CI)         82461513.30                              <.0001

Source       DF    Type III SS    Mean Square    F Value    Pr > F
CI                 1630283.41                               0.1270
MYIELD(CI)         82461513.30                              <.0001
Parameter Estimate Standard Error t Value Pr > |t|
Intercept 247.6285437 B 202.5515626 1.22 0.2225
CI CI 389 148.5076076 B 285.5318421 0.52 0.6034
CI CI2522 -429.5526375 B 285.5318421 -1.50 0.1335
CI CI2921 -685.2639746 B 285.5318421 -2.40 0.0170
CI CI3096 -283.5539956 B 337.4755132 -0.84 0.4015
CI CI3259 -292.2969653 B 296.4311958 -0.99 0.3249
CI CI3270 -62.3443535 B 429.1387825 -0.15 0.8846
CI CI3296 -292.8125046 B 429.1387825 -0.68 0.4956
CI CI3297 -337.3309699 B 295.7373148 -1.14 0.2549
CI CI3318 -304.6532979 B 429.1387825 -0.71 0.4783
CI CI3327 -149.5784968 B 286.4511670 -0.52 0.6019
CI CI3332 12.7546216 B 429.1387825 0.03 0.9763
CI CI3353 -237.8931342 B 296.4311958 -0.80 0.4229
CI CI3358 -45.7840100 B 296.4311958 -0.15 0.8774
CI CI3397 -435.2179895 B 286.4511670 -1.52 0.1297
CI CI3399 -881.4329535 B 429.1387825 -2.05 0.0409
CI CI3404 -444.0878790 B 296.4311958 -1.50 0.1352
CI CI3411 -429.6075402 B 286.4511670 -1.50 0.1347
CI CI3423 -202.8770546 B 296.4311958 -0.68 0.4943
CI CI3424 -81.0557310 B 286.4511670 -0.28 0.7774
CI CI3425 -504.8586894 B 286.4511670 -1.76 0.0790
CI FP1094 -231.8712486 B 347.5157316 -0.67 0.5051
CI FP2024 422.7024226 B 295.7373148 1.43 0.1540
CI FP2044 -170.6113760 B 286.4511670 -0.60 0.5519
CI FP2102 -230.1674082 B 359.1959950 -0.64 0.5222
CI FP2107 -381.4768045 B 359.1959950 -1.06 0.2891
CI FP2112 -109.2959996 B 285.5318421 -0.38 0.7022
CI FP2114 -586.3033210 B 285.5318421 -2.05 0.0409
CI FP2118 -534.7015999 B 285.5318421 -1.87 0.0621
CI FP2119 -236.5543681 B 285.5318421 -0.83 0.4081
CI N0010 -61.5452902 B 286.4511670 -0.21 0.8300
CI N2007 -289.5237181 B 317.8808187 -0.91 0.3631
CI N2010 -169.5133808 B 317.8808187 -0.53 0.5943
CI N2010B -757.4554914 B 337.4755132 -2.24 0.0255
CI N2010Y -839.6474827 B 337.4755132 -2.49 0.0134
CI N2014 -218.6600408 B 317.8808187 -0.69 0.4921
CI N305 163.3258906 B 317.8808187 0.51 0.6078
CI N320 72.8544734 B 317.8808187 0.23 0.8189
CI N323 -301.6254559 B 317.8808187 -0.95 0.3435
CI N325 -119.0291690 B 317.8808187 -0.37 0.7083
CI N9719 0.0000000 B . . .
MYIELD(CI) CI 389 0.6790373 0.1094259 6.21 <.0001
MYIELD(CI) CI2522 1.0747337 0.1094259 9.82 <.0001
MYIELD(CI) CI2921 1.2735701 0.1094259 11.64 <.0001
MYIELD(CI) CI3096 1.0612916 0.1631526 6.50 <.0001
MYIELD(CI) CI3259 0.9272735 0.1164787 7.96 <.0001
MYIELD(CI) CI3270 0.9664390 0.1831084 5.28 <.0001
MYIELD(CI) CI3296 1.0137999 0.1831084 5.54 <.0001
MYIELD(CI) CI3297 1.0373615 0.1146171 9.05 <.0001
MYIELD(CI) CI3318 1.0312695 0.1831084 5.63 <.0001
MYIELD(CI) CI3327 0.9097946 0.1115983 8.15 <.0001
MYIELD(CI) CI3332 0.8070294 0.1831084 4.41 <.0001
MYIELD(CI) CI3353 1.0572017 0.1164787 9.08 <.0001
MYIELD(CI) CI3358 0.9449930 0.1164787 8.11 <.0001
MYIELD(CI) CI3397 1.1499679 0.1115983 10.30 <.0001
MYIELD(CI) CI3399 1.2998448 0.1831084 7.10 <.0001
MYIELD(CI) CI3404 1.0702471 0.1164787 9.19 <.0001
MYIELD(CI) CI3411 1.1313799 0.1115983 10.14 <.0001
MYIELD(CI) CI3423 1.0156819 0.1164787 8.72 <.0001
MYIELD(CI) CI3424 0.9681886 0.1115983 8.68 <.0001
MYIELD(CI) CI3425 1.1565583 0.1115983 10.36 <.0001
MYIELD(CI) FP1094 1.0046505 0.1501929 6.69 <.0001
MYIELD(CI) FP2024 0.7000287 0.1146171 6.11 <.0001
MYIELD(CI) FP2044 0.9228528 0.1115983 8.27 <.0001
MYIELD(CI) FP2102 0.9328311 0.1711090 5.45 <.0001
MYIELD(CI) FP2107 1.0846242 0.1711090 6.34 <.0001
MYIELD(CI) FP2112 0.9502622 0.1094259 8.68 <.0001
MYIELD(CI) FP2114 1.1132791 0.1094259 10.17 <.0001
MYIELD(CI) FP2118 1.0955636 0.1094259 10.01 <.0001
MYIELD(CI) FP2119 0.9747877 0.1094259 8.91 <.0001
MYIELD(CI) N0010 0.9216513 0.1115983 8.26 <.0001
MYIELD(CI) N2007 0.9835869 0.1417316 6.94 <.0001
MYIELD(CI) N2010 0.9218646 0.1417316 6.50 <.0001
MYIELD(CI) N2010B 1.3509072 0.1631526 8.28 <.0001
MYIELD(CI) N2010Y 1.3674357 0.1631526 8.38 <.0001
MYIELD(CI) N2014 0.9480153 0.1417316 6.69 <.0001
MYIELD(CI) N305 0.7522699 0.1417316 5.31 <.0001
MYIELD(CI) N320 0.8438111 0.1417316 5.95 <.0001
MYIELD(CI) N323 1.1199809 0.1417316 7.90 <.0001
MYIELD(CI) N325 0.9714982 0.1417316 6.85 <.0001
MYIELD(CI) N9719 0.8615456 0.1115983 7.72 <.0001
NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.
observed = mean + regr + dev
Mean = (ΣY)² / n
Regr = (Σxy)² / Σx²
Dev = ΣY² - mean - regr
(Y is the raw observation; x and y are deviations from their means.)
r = Σxy / [Σx² Σy²]½
r² tells us the relative amount of variation the two variables have in common.
An example
| Age | Blood Pressure | x | y | x² | y² | xy |
| --- | --- | --- | --- | --- | --- | --- |
| 35 | 114 | -20 | -27 | 400 | 729 | 540 |
| 45 | 124 | -10 | -17 | 100 | 289 | 170 |
| 55 | 143 | 0 | 2 | 0 | 4 | 0 |
| 65 | 158 | 10 | 17 | 100 | 289 | 170 |
| 75 | 166 | 20 | 25 | 400 | 625 | 500 |
| Sum: 275 | 705 | 0 | 0 | 1000 | 1936 | 1380 |
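The column sums in the table can be checked with a short Python sketch. The variable names are illustrative only; the data are the five age and blood pressure pairs from the table, and the slope and correlation follow the formulas given in these notes (slope = Σxy/Σx², r = Σxy/[Σx²Σy²]½).

```python
# Age and blood pressure values from the example table.
age = [35, 45, 55, 65, 75]
pressure = [114, 124, 143, 158, 166]

n = len(age)
mean_age = sum(age) / n            # 55.0
mean_pressure = sum(pressure) / n  # 141.0

# Lowercase x and y are deviations from the means.
x = [a - mean_age for a in age]
y = [p - mean_pressure for p in pressure]

# Corrected sums of squares and cross products.
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(u * v for u, v in zip(x, y))

slope = sum_xy / sum_x2                 # regression coefficient b
r = sum_xy / (sum_x2 * sum_y2) ** 0.5   # correlation coefficient

print("sum x2, sum y2, sum xy:", sum_x2, sum_y2, sum_xy)
print("slope:", round(slope, 2), " r squared:", round(r ** 2, 4))
```

The slope of 1.38 is the value used in the t test later in the example.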
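Fitted line: b = Σxy / Σx² = 1380/1000 = 1.38, so predicted = 141 + 1.38x.
| Age | Y | predicted | dev | dev squared |
| --- | --- | --- | --- | --- |
| 35 | 114 | 113.4 | 0.6 | 0.36 |
| 45 | 124 | 127.2 | -3.2 | 10.24 |
| 55 | 143 | 141.0 | 2.0 | 4.00 |
| 65 | 158 | 154.8 | 3.2 | 10.24 |
| 75 | 166 | 168.6 | -2.6 | 6.76 |
| Sum | | | 0 | 31.60 |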
S²y.x = 31.6 / 3 = 10.53    S²b = 10.53/1000 = 0.01053    Sb = 0.1026
H: ß = 0
t = 1.38 / 0.1026 = 13.45 *
| Source | df | SS | MS | F |
| --- | --- | --- | --- | --- |
| Total | 5 | 101341 | | |
| Mean | 1 | 99405 | | |
| Corr Tot | 4 | 1936 | | |
| Regress | 1 | 1904.4 | 1904.4 | 180.8 |
| Dev | 3 | 31.6 | 10.53 | |
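As a check on the table, the partition observed = mean + regr + dev can be reproduced from the raw data. This is a minimal sketch with illustrative variable names, not output from any statistical package. It also confirms that the t statistic for H: ß = 0 is the square root of the regression F.

```python
import math

age = [35, 45, 55, 65, 75]
pressure = [114, 124, 143, 158, 166]
n = len(age)
mean_age = sum(age) / n

# Uncorrected total SS and the SS due to the mean.
total_ss = sum(p * p for p in pressure)
mean_ss = sum(pressure) ** 2 / n

# Corrected sum of squares of x and the cross product sum.
sum_x2 = sum((a - mean_age) ** 2 for a in age)
sum_xy = sum((a - mean_age) * p for a, p in zip(age, pressure))

# SS due to regression, and deviations by subtraction.
regr_ss = sum_xy ** 2 / sum_x2
dev_ss = total_ss - mean_ss - regr_ss

dev_df = n - 2                # one df for the mean, one for the slope
ms_dev = dev_ss / dev_df      # S2y.x
f_value = regr_ss / ms_dev

slope = sum_xy / sum_x2
se_slope = math.sqrt(ms_dev / sum_x2)
t_value = slope / se_slope

print("Total, Mean, Regress, Dev SS:", total_ss, mean_ss, regr_ss, round(dev_ss, 2))
print("MS dev:", round(ms_dev, 2), " F:", round(f_value, 1), " t:", round(t_value, 2))
```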
Intercept: a = µy - ßµx
Variance of intercept = Var µy + Var ßµx
= S²y.x(1/n) + µ²x S²y.x (1/Σx²)
= S²y.x(1/n + µ²x/Σx²) = S²a
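Applied to the age and blood pressure example, the intercept and its variance work out as in the sketch below. The inputs (n = 5, mean age 55, mean pressure 141, Σx² = 1000, slope 1.38, S²y.x = 31.6/3) are taken from the example; the variable names are illustrative.

```python
import math

# Quantities from the age / blood pressure example.
n = 5
mean_x = 55.0
mean_y = 141.0
sum_x2 = 1000.0       # corrected sum of squares of age
slope = 1.38
s2_yx = 31.6 / 3      # deviation mean square

# Intercept: a = mean_y - b * mean_x
intercept = mean_y - slope * mean_x

# Variance of the intercept: S2y.x * (1/n + mean_x^2 / sum_x2)
var_intercept = s2_yx * (1 / n + mean_x ** 2 / sum_x2)
se_intercept = math.sqrt(var_intercept)

print("intercept:", round(intercept, 1))
print("variance:", round(var_intercept, 2), " std error:", round(se_intercept, 2))
```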
Test procedures in simple linear regression
| Hypothesis | Statistic | Equation |
| --- | --- | --- |
| a = a0 | t | (a - a0)/Sa |
| ß = ß0 | t | (ß - ß0)/Sß |
| a = a0 and ß = ß0 | F | [n(a - a0)² + 2nµx(a - a0)(ß - ß0) + (ß - ß0)²ΣX²] / (2S²y.x) |
Here ΣX² is the uncorrected sum of squares of X, and the F statistic has 2 and n - 2 df.
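The joint test of intercept and slope can be sketched as a small function. It is written as the quadratic form (estimate - hypothesis)' X'X (estimate - hypothesis) divided by 2 S²y.x, where X'X holds n, ΣX and the uncorrected ΣX². The null values a0 = 60 and ß0 = 1.5 are hypothetical, chosen only to illustrate the arithmetic; the estimates come from the age and blood pressure example.

```python
def joint_f(a, b, a0, b0, n, sum_x, sum_x2_raw, s2_yx):
    """F statistic (2 and n - 2 df) for H: intercept = a0 and slope = b0.

    sum_x and sum_x2_raw are the raw (uncorrected) sum and sum of squares of X.
    """
    da = a - a0
    db = b - b0
    numerator = n * da ** 2 + 2 * sum_x * da * db + sum_x2_raw * db ** 2
    return numerator / (2 * s2_yx)


age = [35, 45, 55, 65, 75]
n = len(age)
sum_x = sum(age)                       # 275
sum_x2_raw = sum(a * a for a in age)   # 16125

# Estimates from the example; a0 and b0 are hypothetical null values.
a_hat, b_hat = 65.1, 1.38
s2_yx = 31.6 / 3

f_value = joint_f(a_hat, b_hat, 60.0, 1.5, n, sum_x, sum_x2_raw, s2_yx)
print("joint F:", round(f_value, 3))
```

Since 2ΣX = 2nµx, the middle term matches the 2nµx(a - a0)(ß - ß0) term in the table.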
An example with 2 groups
| Group 1 X | Group 1 Y | Group 2 X | Group 2 Y |
| --- | --- | --- | --- |
| 30 | 165 | 24 | 180 |
| 27 | 170 | 31 | 169 |
| 20 | 130 | 20 | 171 |
| 21 | 156 | 26 | 161 |
| 33 | 167 | 20 | 180 |
| 29 | 151 | 25 | 170 |
Group 1 ß1 = 1.995 SS dev = 566.83
Group 2 ß2 = -0.852 SS dev = 200.95
H: ß1 = ß2
t = [1.995 - (-0.852)]/{[(566.83+200.95)/(6+6-4)][1/133.33+1/85.33]}½
= 2.096
t8,.05 = 2.306
t8,.20 = 1.397
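The comparison of the two slopes can be recomputed from the raw data in the two-group table. The sketch below assumes the usual pooled test, with the deviation sums of squares pooled over n1 + n2 - 4 df (here 6 + 6 - 4 = 8, since each group has six observations). The function name `fit` is illustrative.

```python
import math

group1_x = [30, 27, 20, 21, 33, 29]
group1_y = [165, 170, 130, 156, 167, 151]
group2_x = [24, 31, 20, 26, 20, 25]
group2_y = [180, 169, 171, 161, 180, 170]


def fit(xs, ys):
    """Return (slope, corrected sum of squares of x, deviation SS) for one group."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxx, syy - sxy ** 2 / sxx


b1, sxx1, dev1 = fit(group1_x, group1_y)
b2, sxx2, dev2 = fit(group2_x, group2_y)

# Pool the deviation SS; each group loses 2 df (mean and slope).
pooled_df = len(group1_x) + len(group2_x) - 4
pooled_ms = (dev1 + dev2) / pooled_df

t_value = (b1 - b2) / math.sqrt(pooled_ms * (1 / sxx1 + 1 / sxx2))

print("slopes:", round(b1, 3), round(b2, 3))
print("deviation SS:", round(dev1, 2), round(dev2, 2))
print("pooled df:", pooled_df, " t:", round(t_value, 3))
```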
Rerun of the example with a matrix approach, as a computer package would solve a linear regression problem
Y = 1µ + (X - µx)b
114 = 1µ + (35-55)b, with X - µx = -20
124 = 1µ + (45-55)b, with X - µx = -10
143 = 1µ + (55-55)b, with X - µx = 0
158 = 1µ + (65-55)b, with X - µx = 10
166 = 1µ + (75-55)b, with X - µx = 20

X'X =     [ 5     0    ]    elements: [ µµ  µx ]
          [ 0     1000 ]              [ xµ  xx ]

(X'X)⁻¹ = [ 1/5   0      ]
          [ 0     1/1000 ]
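The matrix solution can be sketched in plain Python, without assuming any particular package. The design matrix has a column of ones for µ and a column of centered ages for b; the estimates are (X'X)⁻¹X'Y. The 2 x 2 inverse is written out by hand, and the variable names are illustrative.

```python
age = [35, 45, 55, 65, 75]
pressure = [114, 124, 143, 158, 166]
mean_age = sum(age) / len(age)

# Design matrix: a column of ones and the centered X values.
X = [[1.0, a - mean_age] for a in age]

# X'X (2 x 2) and X'Y (2 x 1).
xtx = [[sum(row[i] * row[j] for row in X) for j in range(2)] for i in range(2)]
xty = [sum(row[i] * y for row, y in zip(X, pressure)) for i in range(2)]

# Inverse of a 2 x 2 matrix via its determinant.
det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
inv = [[xtx[1][1] / det, -xtx[0][1] / det],
       [-xtx[1][0] / det, xtx[0][0] / det]]

# Estimates: first element is µ (the mean of Y), second is the slope b.
estimates = [sum(inv[i][j] * xty[j] for j in range(2)) for i in range(2)]

print("X'X:", xtx)
print("X'Y:", xty)
print("estimates (µ, b):", [round(e, 2) for e in estimates])
```

Because X is centered, the off-diagonal elements of X'X are zero, so the inverse is just the reciprocals 1/5 and 1/1000.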