Title: | Fast Backward Elimination Based on Information Criterion |
---|---|
Description: | Performs backward elimination with similar syntax to the stepAIC() function from the 'MASS' package. A bounding algorithm is used to avoid fitting unnecessary models, making it much faster. |
Authors: | Jacob Seedorff [aut, cre] |
Maintainer: | Jacob Seedorff <[email protected]> |
License: | Apache License (>= 2) |
Version: | 1.0.1 |
Built: | 2025-03-25 03:33:50 UTC |
Source: | https://github.com/jacobseedorff21/fastbackward |
Performs backward elimination by AIC, using a bounding algorithm to avoid fitting unnecessary models and so run faster.
fastbackward( object, scope, scale = 0, trace = 1, keep = NULL, steps = 1000, k = 2, ... )
object |
an object representing a model of an appropriate class. This is used as the initial model in the stepwise search. |
scope |
defines the range of models examined in the stepwise search. This should be missing or a single formula. If a formula is included, all of the components on the right-hand-side of the formula are always included in the model. If missing, then only the intercept (if included) is always included in the model. |
scale |
used in the definition of the AIC statistic for selecting the models, currently only for lm and aov models (see extractAIC for details). |
trace |
if positive, information is printed during the running of |
keep |
a filter function whose input is a fitted model object and the associated |
steps |
the maximum number of steps to be considered. The default is 1000 (essentially as many as required). It is typically used to stop the process early. |
k |
the multiple of the number of degrees of freedom used for the penalty.
Only |
... |
any additional arguments to extractAIC. |
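As a hedged illustration of the scope and k arguments described above, the sketch below keeps one predictor in every candidate model and uses a log(n) penalty (the multiplier commonly associated with BIC). The model, the mtcars data, and the choice of wt as the protected term are assumptions made for this example only; the call is guarded so the snippet still runs when the package is not installed.

```r
# Initial model: all candidate predictors (mtcars ships with base R)
fit <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
n <- nrow(mtcars)

if (requireNamespace("fastbackward", quietly = TRUE)) {
  # scope = ~ wt: terms on the right-hand side are always kept in the model
  # k = log(n):   heavier penalty per degree of freedom than the default k = 2
  # trace = 0:    suppress progress output
  sel <- fastbackward::fastbackward(fit, scope = ~ wt, k = log(n), trace = 0)
  print(formula(sel))
  print(sel$anova)
}
```

Because wt appears in scope, it should remain in the selected model whatever its contribution to the criterion.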
The bounding algorithm avoids fitting models that cannot possibly improve the AIC. At a high level, it identifies important predictors whose removal from the current model cannot improve upon the current AIC, so the corresponding reduced models never need to be fitted.
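One way such a bound can arise is sketched below. This is an illustration under stated assumptions, not necessarily the package's actual implementation. It relies on the fact that a nested model can never fit better than the larger model containing it, so the goodness-of-fit part of the AIC from an earlier, larger model gives a lower bound for a later, smaller one. The sketch uses only base R (lm, update, extractAIC) and the built-in mtcars data; all variable names are hypothetical.

```r
k <- 2  # penalty multiplier (k = 2 gives AIC)
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
terms1 <- attr(terms(full), "term.labels")

# Step 1: AIC of every single-term deletion from the full model
aic1 <- sapply(terms1, function(v) {
  extractAIC(update(full, as.formula(paste(". ~ . -", v))), k = k)[2]
})

# For illustration, drop the term whose deletion gives the lowest AIC
best <- names(which.min(aic1))
current <- update(full, as.formula(paste(". ~ . -", best)))
current_aic <- extractAIC(current, k = k)[2]

# Step 2: the model without {best, j} is nested in the model without {j},
# so its fit term cannot be smaller. Each term here has one degree of
# freedom, which gives the lower bound AIC(without j) - k * 1.
others <- setdiff(terms1, best)
lower <- aic1[others] - k * 1

# Deletions whose lower bound already reaches the current AIC cannot
# improve on it, so those models would not need to be fitted.
can_skip <- lower >= current_aic

# Check the bound by fitting every candidate anyway
actual <- sapply(others, function(v) {
  extractAIC(update(current, as.formula(paste(". ~ . -", v))), k = k)[2]
})
print(cbind(lower, actual, can_skip))
```

In a real search only the candidates with can_skip equal to FALSE would be refitted; here all are fitted purely to confirm that the bound holds.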
Test statistics, p-values, and confidence intervals from the final selected model are not reliable because they ignore the selection process, so using them is not recommended.
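A small simulation can make this warning concrete. The sketch below is an assumption-laden illustration: it uses stats::step() from base R as a stand-in for backward elimination by AIC (not fastbackward itself), and it generates a response that is unrelated to every predictor, so any "significant" coefficient is a false positive. The sample size, number of predictors, and number of replications are arbitrary choices.

```r
set.seed(1)
n <- 100      # observations per simulated data set
p <- 10       # pure-noise predictors
n_sim <- 200  # number of simulated data sets

sig_full <- 0  # coefficients with p < 0.05 in the full models
sig_sel <- 0   # coefficients with p < 0.05 in the selected models
kept <- 0      # predictors retained by backward elimination

for (i in seq_len(n_sim)) {
  d <- as.data.frame(matrix(rnorm(n * p), n, p))  # columns V1..V10
  d$y <- rnorm(n)                                 # response unrelated to predictors
  full <- lm(y ~ ., data = d)
  sel <- step(full, direction = "backward", trace = 0)

  # p-values of all non-intercept coefficients
  p_full <- coef(summary(full))[-1, "Pr(>|t|)"]
  p_sel <- coef(summary(sel))[-1, "Pr(>|t|)"]

  sig_full <- sig_full + sum(p_full < 0.05)
  sig_sel <- sig_sel + sum(p_sel < 0.05)
  kept <- kept + length(p_sel)
}

rate_full <- sig_full / (n_sim * p)  # should sit near the nominal 0.05
rate_sel <- sig_sel / kept           # rate among predictors that survived selection
print(c(full = rate_full, selected = rate_sel))
```

Among the predictors that survive selection, the share with p < 0.05 is far above the nominal 5% level, even though none of them has any real effect.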
See more details at MASS::stepAIC.
The stepwise-selected model is returned, with up to two additional components.
There is an "anova" component corresponding to the steps taken in the search, as well as a "keep" component if the keep= argument was supplied in the call. The "Resid. Dev" column of the analysis of deviance table refers to a constant minus twice the maximized log likelihood: it will be a deviance only in cases where a saturated model is well-defined (thus excluding lm, aov and survreg fits, for example).
MASS::stepAIC, MASS::dropterm, and extractAIC
# Loading fastbackward
library(fastbackward)

# Using examples provided in MASS::stepAIC, but with fastbackward instead
## aov with quine dataset
quine.hi <- aov(log(Days + 2.5) ~ .^4, MASS::quine)
quine.nxt <- update(quine.hi, . ~ . - Eth:Sex:Age:Lrn)
quine.stp <- fastbackward(quine.nxt, trace = FALSE)
quine.stp$anova

## lm with cpus dataset
cpus1 <- MASS::cpus
for(v in names(MASS::cpus)[2:7])
  cpus1[[v]] <- cut(MASS::cpus[[v]], unique(quantile(MASS::cpus[[v]])),
                    include.lowest = TRUE)
cpus0 <- cpus1[, 2:8]  # excludes names, authors' predictions
cpus.samp <- sample(1:209, 100)
cpus.lm <- lm(log10(perf) ~ ., data = cpus1[cpus.samp,2:8])
cpus.lm2 <- fastbackward(cpus.lm, trace = FALSE)
cpus.lm2$anova

## glm with bwt dataset
example(birthwt, package = "MASS")
birthwt.glm <- glm(low ~ ., family = binomial, data = bwt)
birthwt.step <- fastbackward(birthwt.glm, trace = FALSE)
birthwt.step$anova