Feature selection via iterative rounds of permutation-based feature importance

Usage

feature_selection(
  fit_function = NULL,
  data = NULL,
  test = data,
  response = NULL,
  loss_function = NULL,
  stat = stats::median,
  iterations = 1,
  sample_size = NULL,
  sample_frac = NULL,
  predict_function = NULL,
  parallel = FALSE,
  ...
)

Arguments

fit_function

A function with formula and data arguments to fit the desired model.
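As a minimal sketch of a function with this shape, the wrapper below passes formula and data to stats::lm. The name fit_lm and the choice of a linear model are illustrative assumptions, not something this page prescribes.

```r
# Hypothetical fit function: accepts `formula` and `data`, as the argument
# description requires, and forwards any extra arguments to stats::lm.
fit_lm <- function(formula, data, ...) {
  stats::lm(formula, data = data, ...)
}

# Quick check on a built-in data set.
model <- fit_lm(mpg ~ wt + hp, data = mtcars)
coef(model)
```

Any modelling function can be wrapped the same way, as long as the wrapper exposes formula and data as arguments.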

data

A data frame used to calculate the loss function.

test

A testing data frame used to evaluate the loss function. Defaults to the data argument.

response

Name of the response variable.

loss_function

The loss function to evaluate. Must be a function with two arguments: actual and predicted values. It should return a smaller value when the model performs better.
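For illustration, a root mean squared error function satisfies this contract: it takes the actual and predicted values and returns a smaller number for better predictions. The name rmse is an assumption made for this sketch.

```r
# Hypothetical loss function: root mean squared error.
# Smaller values mean better predictions.
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

perfect_loss <- rmse(c(1, 2, 3), c(1, 2, 3))  # predictions match exactly
shifted_loss <- rmse(c(1, 2, 3), c(2, 3, 4))  # every prediction is off by 1
perfect_loss < shifted_loss
```

A metric where larger is better, such as accuracy, would need to be inverted first (for example, 1 minus accuracy) to fit this convention.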

stat

A summary function used to compare the loss values obtained for a variable against the loss of the full model. If the summarised value for a variable is smaller than the loss of the full model, the variable is removed in that round. Defaults to stats::median.
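The removal rule described here can be sketched with made-up numbers. The loss values below are invented for illustration and are not produced by the function. The sketch also assumes that each variable yields one loss value per iteration, which this page does not state explicitly.

```r
# Hypothetical values, for illustration only.
full_model_loss <- 2.5
variable_losses <- list(
  wt    = c(3.9, 4.2, 4.0),  # loss clearly above the full model
  noise = c(2.4, 2.6, 2.3)   # loss slightly below the full model
)

# Summarise each variable's losses with the chosen stat (median by default).
stat <- stats::median
summarised <- vapply(variable_losses, stat, numeric(1))

# Variables whose summarised loss is smaller than the full-model loss
# are the ones removed in this round.
removed <- names(summarised)[summarised < full_model_loss]
removed
```

Swapping stat for another summary, such as mean or min, changes how strict the rule is.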

iterations

Number of iterations.

sample_size

Sample size.

sample_frac

Proportion to sample in each iteration.

predict_function

Prediction function, usually a function(model, newdata) that returns a vector (not a data frame).
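A minimal sketch of such a function, assuming a model fitted with stats::lm, is shown below. The name predict_vec is hypothetical.

```r
# Hypothetical predict function: takes a model and new data and returns
# a plain numeric vector rather than a data frame.
predict_vec <- function(model, newdata) {
  unname(stats::predict(model, newdata = newdata))
}

model <- stats::lm(mpg ~ wt + hp, data = mtcars)
preds <- predict_vec(model, mtcars)
length(preds) == nrow(mtcars)
```

For models whose predict method returns a data frame or matrix, the wrapper would need to extract the relevant column before returning.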

parallel

A logical value indicating whether the process should use furrr::future_pmap_dbl (TRUE) or purrr::pmap_dbl (FALSE).

...

Additional arguments passed to fit_function.
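To give a feel for what permutation-based importance measures, the base R sketch below shuffles one variable and compares the resulting loss with the loss of the full model. It is a simplified illustration under stated assumptions (a linear model, RMSE as the loss, five repetitions) and is not the implementation behind feature_selection. The helper names rmse and permuted_loss are hypothetical.

```r
set.seed(123)

# Fit a full model and compute its loss on the same data.
fit <- stats::lm(mpg ~ wt + hp, data = mtcars)
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
full_loss <- rmse(mtcars$mpg, stats::predict(fit, mtcars))

# Shuffle one variable, predict again, and return the new loss.
permuted_loss <- function(variable) {
  shuffled <- mtcars
  shuffled[[variable]] <- sample(shuffled[[variable]])
  rmse(mtcars$mpg, stats::predict(fit, shuffled))
}

# Repeat the permutation a few times and summarise with the median.
losses_wt <- replicate(5, permuted_loss("wt"))
c(full = full_loss, permuted_wt = stats::median(losses_wt))
```

A variable whose permuted loss stays at or below the full-model loss contributes little to the predictions, which is the situation the stat argument is used to detect.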