> out=lm(bwt~age+lwt+factor(race)+smoke+ptl+ht+ui, data=birthwt)
> summary(out)
Call:
lm(formula = bwt ~ age + lwt + factor(race) + smoke + ptl + ht +
ui, data = birthwt)
Residuals:
    Min      1Q  Median      3Q     Max
-1838.7  -454.5    57.6   465.1  1711.0
Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)    2934.132    311.450   9.421  < 2e-16 ***
age              -4.093      9.440  -0.434 0.665091
lwt               4.301      1.722   2.497 0.013416 *
factor(race)2  -488.196    149.604  -3.263 0.001318 **
factor(race)3  -353.334    114.319  -3.091 0.002314 **
smoke          -351.314    106.180  -3.309 0.001132 **
ptl             -47.423    101.663  -0.466 0.641443
ht             -586.836    200.841  -2.922 0.003925 **
ui             -514.937    138.483  -3.718 0.000268 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 648.7 on 180 degrees of freedom
Multiple R-squared: 0.2424, Adjusted R-squared: 0.2087
F-statistic: 7.197 on 8 and 180 DF, p-value: 2.908e-08
> anova(out)
Analysis of Variance Table
Response: bwt
              Df   Sum Sq Mean Sq F value    Pr(>F)
age            1   815483  815483  1.9380 0.1656022
lwt            1  2967339 2967339  7.0519 0.0086284 **
factor(race)   2  4750632 2375316  5.6450 0.0041901 **
smoke          1  6291918 6291918 14.9529 0.0001538 ***
ptl            1   732501  732501  1.7408 0.1887130
ht             1  2852764 2852764  6.7796 0.0099900 **
ui             1  5817995 5817995 13.8266 0.0002676 ***
Residuals    180 75741025  420783
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
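Is it because the anova table has no intercept row...? Granted, age and ptl have high p-values in both outputs,
but the numbers from the two outputs do not match...
In multiple regression, do summary and anova report p-values that mean different things?
Also, why does summary split factor(race) into 2 and 3?
It looks like it was converted into dummy variables, but then why does anova treat it as a single term...
Trying to work it out on my own, I wondered whether the results differ slightly only because summary splits factor(race) into the dummy variables 2 and 3,
so I left out the factor variable and compared the two once more: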
> summary(lm(bwt~lwt+smoke+ht+ui, data=birthwt))
Call:
lm(formula = bwt ~ lwt + smoke + ht + ui, data = birthwt)
Residuals:
     Min       1Q   Median       3Q      Max
-1665.02  -452.62    11.16   473.64  1858.65
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 2577.096    226.794  11.363  < 2e-16 ***
lwt            4.506      1.660   2.714 0.007270 **
smoke       -242.113    100.127  -2.418 0.016579 *
ht          -649.098    206.327  -3.146 0.001931 **
ui          -549.878    139.424  -3.944 0.000114 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 669.9 on 184 degrees of freedom
Multiple R-squared: 0.1741, Adjusted R-squared: 0.1561
F-statistic: 9.695 on 4 and 184 DF, p-value: 3.886e-07
> anova(lm(bwt~lwt+smoke+ht+ui, data=birthwt))
Analysis of Variance Table
Response: bwt
           Df   Sum Sq Mean Sq F value    Pr(>F)
lwt         1  3448639 3448639  7.6852 0.0061402 **
smoke       1  3326720 3326720  7.4135 0.0070964 **
ht          1  3646795 3646795  8.1268 0.0048597 **
ui          1  6979875 6979875 15.5545 0.0001139 ***
Residuals 184 82567628  448737
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Even so, there still seems to be a small difference...
I'm not sure what is going on.
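
One thing in the output above may help narrow this down: in both models the p-value for ui, the last term in the formula, is the same in summary and anova (0.000268 vs 0.0002676, and 0.000114 vs 0.0001139). That would fit anova() on an lm fit testing terms sequentially, in formula order, while summary() tests each coefficient given all the others. A minimal sketch for checking this follows. It assumes birthwt is the dataset from the MASS package, which is not stated above, and the names p_summary, p_anova, marginal and dummies are only illustrative.

```r
library(MASS)  # assumption: birthwt is the MASS dataset

out <- lm(bwt ~ age + lwt + factor(race) + smoke + ptl + ht + ui,
          data = birthwt)

# ui is the last term in the formula, so its sequential F-test in anova()
# is already adjusted for every other term, just like the t-test in summary().
p_summary <- coef(summary(out))["ui", "Pr(>|t|)"]
p_anova   <- anova(out)["ui", "Pr(>F)"]
print(all.equal(p_summary, p_anova))

# Marginal F-tests: each term is dropped from the full model in turn.
# factor(race) is tested as one 2-df term here instead of two dummy columns.
marginal <- drop1(out, test = "F")
print(marginal)

# The dummy columns that lm() builds for factor(race); the first level
# is the reference and is absorbed into the intercept.
dummies <- model.matrix(~ factor(race), data = birthwt)
print(colnames(dummies))
```

If the sequential reading is right, putting a different variable last in the formula should make that variable's anova p-value line up with summary instead, and the 1-df rows of drop1 should match the summary p-values.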