Question

Given a linear model, mod <- lm(mpg ~ wt, data = mtcars), how would you extract the estimate of wt?

Source: Advanced R by Hadley Wickham

  1. summary(mod)$coefficients['wt', 'Estimate']
  2. summary(mod)$coefficients[2,1]
  3. Both are correct

Answer

If you are ever unsure how to extract values from a model (not just a linear one), try this approach:

  1. Use str() to see everything the model returns and where the parameter you want is stored.
  2. Check the type of the element that holds the parameter: list, data frame, matrix, etc.
  3. Once you know that, it is just a matter of accessing elements of a list, data frame, etc.
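
Since the approach is meant to work beyond linear models, here is a minimal sketch of the same three steps on a logistic regression. The model (am ~ wt) and the names fit, s and wt_estimate are only illustrative assumptions, not part of the original question.

```r
# Illustrative model (an assumption): logistic regression on the built-in mtcars data
fit <- glm(am ~ wt, data = mtcars, family = binomial)
s <- summary(fit)

# Step 1: list the top-level components without expanding every nested attribute
str(s, max.level = 1)

# Step 2: check what kind of object holds the coefficients
is.matrix(s$coefficients)
dimnames(s$coefficients)

# Step 3: index it like any other matrix
wt_estimate <- s$coefficients["wt", "Estimate"]
wt_estimate
```

The max.level argument of str() keeps the listing short, which helps when the object is deeply nested.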

Let’s first create the model and look at what the summary of mod looks like.

mod = lm(mpg ~ wt, data = mtcars)
results = summary(mod)
results
## 
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5432 -2.3647 -0.1252  1.4096  6.8727 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446 
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10

Using str() I can see the structure of results and also its type.

str(results)
## List of 11
##  $ call         : language lm(formula = mpg ~ wt, data = mtcars)
##  $ terms        :Classes 'terms', 'formula'  language mpg ~ wt
##   .. ..- attr(*, "variables")= language list(mpg, wt)
##   .. ..- attr(*, "factors")= int [1:2, 1] 0 1
##   .. .. ..- attr(*, "dimnames")=List of 2
##   .. .. .. ..$ : chr [1:2] "mpg" "wt"
##   .. .. .. ..$ : chr "wt"
##   .. ..- attr(*, "term.labels")= chr "wt"
##   .. ..- attr(*, "order")= int 1
##   .. ..- attr(*, "intercept")= int 1
##   .. ..- attr(*, "response")= int 1
##   .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
##   .. ..- attr(*, "predvars")= language list(mpg, wt)
##   .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
##   .. .. ..- attr(*, "names")= chr [1:2] "mpg" "wt"
##  $ residuals    : Named num [1:32] -2.28 -0.92 -2.09 1.3 -0.2 ...
##   ..- attr(*, "names")= chr [1:32] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" ...
##  $ coefficients : num [1:2, 1:4] 37.285 -5.344 1.878 0.559 19.858 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:2] "(Intercept)" "wt"
##   .. ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
##  $ aliased      : Named logi [1:2] FALSE FALSE
##   ..- attr(*, "names")= chr [1:2] "(Intercept)" "wt"
##  $ sigma        : num 3.05
##  $ df           : int [1:3] 2 30 2
##  $ r.squared    : num 0.753
##  $ adj.r.squared: num 0.745
##  $ fstatistic   : Named num [1:3] 91.4 1 30
##   ..- attr(*, "names")= chr [1:3] "value" "numdf" "dendf"
##  $ cov.unscaled : num [1:2, 1:2] 0.38 -0.1084 -0.1084 0.0337
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:2] "(Intercept)" "wt"
##   .. ..$ : chr [1:2] "(Intercept)" "wt"
##  - attr(*, "class")= chr "summary.lm"

As you can see, results is a list (in other words, summary() returns a list). You can also use typeof() and names() to check the type of results and what it contains.

typeof(results)
## [1] "list"
names(results)
##  [1] "call"          "terms"         "residuals"     "coefficients" 
##  [5] "aliased"       "sigma"         "df"            "r.squared"    
##  [9] "adj.r.squared" "fstatistic"    "cov.unscaled"
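
As a small aside, the following sketch shows why results prints as a formatted report even though it is stored as a list: typeof() reports the storage type, while class() reports the S3 class that selects the print method. The give.attr = FALSE argument is just one way to shorten the str() listing.

```r
mod <- lm(mpg ~ wt, data = mtcars)
results <- summary(mod)

typeof(results)    # storage type of the object
class(results)     # S3 class, used to pick the print method
is.list(results)   # TRUE for any object stored as a list

# A compact view: top-level components only, attributes hidden
str(results, max.level = 1, give.attr = FALSE)
```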

So results is a list, and we know that the coefficient of wt is stored in its coefficients element. Now it is just a question of indexing a list, and there are multiple ways to do it.

coeff_wt = results[['coefficients']]
coeff_wt
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 37.285126   1.877627 19.857575 8.241799e-19
## wt          -5.344472   0.559101 -9.559044 1.293959e-10
coeff_wt = results$coefficients
coeff_wt
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 37.285126   1.877627 19.857575 8.241799e-19
## wt          -5.344472   0.559101 -9.559044 1.293959e-10
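
The list-indexing operators do not all behave the same way, so a short comparison may help. This is a sketch with illustrative variable names (by_double, by_dollar, by_single); it assumes nothing beyond base R.

```r
mod <- lm(mpg ~ wt, data = mtcars)
results <- summary(mod)

by_double <- results[["coefficients"]]  # the matrix itself
by_dollar <- results$coefficients       # the same matrix
by_single <- results["coefficients"]    # a list of length 1 that contains the matrix

identical(by_double, by_dollar)
is.matrix(by_double)
is.list(by_single)

# $ allows partial matching of names; [[ ]] matches exactly by default
is.matrix(results$coef)
is.null(results[["coef"]])
```

Because [ ] returns a list rather than the element, [[ ]] or $ is what you want before indexing into the matrix.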

Now I need to extract the estimate of wt. I will start by checking the object class of coeff_wt.

class(coeff_wt)
## [1] "matrix"

It’s a matrix (R 4.0.0 and later report the class as "matrix" "array"), and there are multiple ways to index a matrix.

  1. Using integer indexing:

estimate = coeff_wt[2,1]
estimate
## [1] -5.344472

  2. Using row names and column names:

estimate = coeff_wt['wt', 'Estimate']
estimate
## [1] -5.344472
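
The two styles stop being interchangeable once the model changes. As a sketch, suppose a second predictor is added (the model mpg ~ hp + wt and the names mod2 and coefs2 are assumptions for illustration): the position of wt shifts, while its name does not.

```r
# Hypothetical variation on the model: hp is added before wt
mod2 <- lm(mpg ~ hp + wt, data = mtcars)
coefs2 <- summary(mod2)$coefficients

rownames(coefs2)           # "(Intercept)" "hp" "wt"

coefs2[2, 1]               # row 2 is now hp, not wt
coefs2["wt", "Estimate"]   # name-based indexing still returns the wt estimate
```

For that reason name-based indexing is usually the safer choice in code that may be reused with other formulas.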

Both approaches return the same value, so option 3 is the answer. The whole operation in a single command:

mod = lm(mpg ~ wt, data = mtcars)
summary(mod)$coefficients['wt', 'Estimate']
## [1] -5.344472

Edit

As it happens, there is a simpler way, thanks to people on Twitter: you can skip summary() altogether. mod is itself a list, so it can be indexed with $. Its coefficients element is a named numeric vector, and the name wt retrieves the wt coefficient.

mod = lm(mpg ~ wt, data = mtcars)
mod$coefficients[['wt']]
## [1] -5.344472
# or with the coef() accessor from the stats package (base R)
coef(mod)[['wt']]
## [1] -5.344472
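
The shortcut above only gives the point estimates. If the standard error or p-value of wt is needed, the summary is still required; a minimal sketch, using coef() on the summary object to get the same matrix as summary(mod)$coefficients (coef_table is an illustrative name):

```r
mod <- lm(mpg ~ wt, data = mtcars)

# coef() applied to the summary returns the full coefficient matrix
coef_table <- coef(summary(mod))

coef_table["wt", "Std. Error"]   # standard error of the wt estimate
coef_table["wt", "Pr(>|t|)"]     # p-value of the wt estimate
coef_table["wt", ]               # the whole wt row as a named vector
```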
