Given a linear model, mod <- lm(mpg ~ wt, data = mtcars), how would you extract the estimate of wt?
Source: Advanced R by Hadley Wickham
summary(mod)$coefficients['wt', 'Estimate']
summary(mod)$coefficients[2,1]
If you are unsure how to extract values from a model object (not just a linear model), try this approach:
Let’s first create the model and look at what the summary of mod looks like.
mod = lm(mpg ~ wt, data = mtcars)
results = summary(mod)
results
##
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.5432 -2.3647 -0.1252 1.4096 6.8727
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.2851 1.8776 19.858 < 2e-16 ***
## wt -5.3445 0.5591 -9.559 1.29e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared: 0.7528, Adjusted R-squared: 0.7446
## F-statistic: 91.38 on 1 and 30 DF, p-value: 1.294e-10
Using str(), I can see the structure of results and also its type.
str(results)
## List of 11
## $ call : language lm(formula = mpg ~ wt, data = mtcars)
## $ terms :Classes 'terms', 'formula' language mpg ~ wt
## .. ..- attr(*, "variables")= language list(mpg, wt)
## .. ..- attr(*, "factors")= int [1:2, 1] 0 1
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:2] "mpg" "wt"
## .. .. .. ..$ : chr "wt"
## .. ..- attr(*, "term.labels")= chr "wt"
## .. ..- attr(*, "order")= int 1
## .. ..- attr(*, "intercept")= int 1
## .. ..- attr(*, "response")= int 1
## .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
## .. ..- attr(*, "predvars")= language list(mpg, wt)
## .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
## .. .. ..- attr(*, "names")= chr [1:2] "mpg" "wt"
## $ residuals : Named num [1:32] -2.28 -0.92 -2.09 1.3 -0.2 ...
## ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
## $ coefficients : num [1:2, 1:4] 37.285 -5.344 1.878 0.559 19.858 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:2] "(Intercept)" "wt"
## .. ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
## $ aliased : Named logi [1:2] FALSE FALSE
## ..- attr(*, "names")= chr [1:2] "(Intercept)" "wt"
## $ sigma : num 3.05
## $ df : int [1:3] 2 30 2
## $ r.squared : num 0.753
## $ adj.r.squared: num 0.745
## $ fstatistic : Named num [1:3] 91.4 1 30
## ..- attr(*, "names")= chr [1:3] "value" "numdf" "dendf"
## $ cov.unscaled : num [1:2, 1:2] 0.38 -0.1084 -0.1084 0.0337
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:2] "(Intercept)" "wt"
## .. ..$ : chr [1:2] "(Intercept)" "wt"
## - attr(*, "class")= chr "summary.lm"
As I can see, results is a list (in other words, summary() returns a list, here with class "summary.lm"). You can also use typeof() and names() to check the type of results and what it contains.
typeof(results)
## [1] "list"
names(results)
## [1] "call" "terms" "residuals" "coefficients"
## [5] "aliased" "sigma" "df" "r.squared"
## [9] "adj.r.squared" "fstatistic" "cov.unscaled"
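The same inspection works for any other element listed by names(). As a small sketch (the variable names r2 and sigma_hat are only illustrative, and the max.level argument of str() is used here to keep the printout to the top level):

```r
# Fit the model and store its summary, as in the post
mod <- lm(mpg ~ wt, data = mtcars)
results <- summary(mod)

# Show only the top level of the list instead of the full nested structure
str(results, max.level = 1)

# Any named element can be pulled out with $
r2 <- results$r.squared      # multiple R-squared
sigma_hat <- results$sigma   # residual standard error
r2
sigma_hat
```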
So results is a list, and we know that the coefficient of wt is stored in its coefficients element. Now it is just a question of indexing a list, and there are multiple ways to do that.
coeff_wt = results[['coefficients']]
coeff_wt
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.285126 1.877627 19.857575 8.241799e-19
## wt -5.344472 0.559101 -9.559044 1.293959e-10
coeff_wt = results$coefficients
coeff_wt
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.285126 1.877627 19.857575 8.241799e-19
## wt -5.344472 0.559101 -9.559044 1.293959e-10
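One detail worth checking for yourself: [[ and $ return the element itself, while single brackets return a list that still wraps it. The following is a minimal sketch; the names via_double, via_dollar and via_single are made up for illustration.

```r
mod <- lm(mpg ~ wt, data = mtcars)
results <- summary(mod)

# [[ and $ both return the coefficient matrix itself
via_double <- results[["coefficients"]]
via_dollar <- results$coefficients
identical(via_double, via_dollar)

# [ returns a list of length one that still contains the matrix
via_single <- results["coefficients"]
is.list(via_single)
is.matrix(via_single[[1]])
```

This is why results['coefficients']['wt', 'Estimate'] would fail, while results[['coefficients']]['wt', 'Estimate'] works.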
Now I need to extract the estimate of wt. I will start by checking the object class of coeff_wt.
class(coeff_wt)
## [1] "matrix"
It’s a matrix (R 4.0.0 and later report "matrix" "array" here), and there are multiple ways to index a matrix.
1. Using integer indexing:
estimate = coeff_wt[2,1]
estimate
## [1] -5.344472
2. Using row and column names:
estimate = coeff_wt['wt', 'Estimate']
estimate
## [1] -5.344472
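Name-based indexing is the more robust of the two, since it does not depend on the order of terms in the formula. The same pattern reaches the other cells of the matrix; this is only a sketch, and se_wt, p_wt and all_estimates are illustrative names.

```r
mod <- lm(mpg ~ wt, data = mtcars)
coeff_wt <- summary(mod)$coefficients

# Other cells of the wt row, addressed by column name
se_wt <- coeff_wt["wt", "Std. Error"]
p_wt <- coeff_wt["wt", "Pr(>|t|)"]

# Leaving the row index empty returns the whole Estimate column
# as a named numeric vector
all_estimates <- coeff_wt[, "Estimate"]

se_wt
p_wt
all_estimates
```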
Summarizing the whole operation in a single command:
mod = lm(mpg ~ wt, data = mtcars)
summary(mod)$coefficients['wt', 'Estimate']
## [1] -5.344472
As it happens, there is a better way to index, thanks to people on Twitter. You can skip summary() altogether. mod is itself a list, so you can index it with $. Its coefficients element is a named numeric vector with an element named wt, which you can use to get the wt coefficient.
mod = lm(mpg ~ wt, data = mtcars)
mod$coefficients[['wt']]
## [1] -5.344472
# coef() is a generic from the stats package (base R), not the tidyverse
coef(mod)[['wt']]
## [1] -5.344472
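To illustrate the "not just linear" point, here is a hedged sketch that applies the same steps to a logistic regression. The model am ~ wt is an arbitrary example chosen for this illustration, not something from the original question, and fit and glm_coefs are made-up names.

```r
# A logistic regression on the same data set (illustrative model choice)
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Same inspection steps as before
names(summary(fit))

# coef() also works on the summary object and returns the coefficient matrix
glm_coefs <- coef(summary(fit))
colnames(glm_coefs)

# Both routes give the same point estimate for wt
from_summary <- glm_coefs["wt", "Estimate"]
from_model <- coef(fit)[["wt"]]
all.equal(from_summary, from_model)
```

Note that the test-statistic columns are named differently for this model ("z value" and "Pr(>|z|)" instead of "t value" and "Pr(>|t|)"), so it pays to check colnames() before indexing by name.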
Thanks for reading.