This function detects outliers in a numeric dataset using the Median Absolute Deviation (MAD) method.

outliers_mad(data, threshold = 2.5, maxRemoval = 0, returnVal = "indices")

Arguments

data

A numeric vector or data frame containing the data from which outliers should be detected.

threshold

A numeric value representing the threshold for identifying outliers. Default is 2.5.

maxRemoval

An integer specifying the maximum number of the worst outliers to remove. Default is 0 (no removal).

returnVal

A character string specifying the type of result to return. Options are "indices" (default), "outliers", "keeps", or "replaceNA".

Value

Depending on the 'returnVal' parameter, the function returns:

- "indices": A numeric vector of the indices of the detected outliers.
- "outliers": A numeric vector of the values of the detected outliers.
- "keeps": A numeric vector of the values of the data points that are not identified as outliers.
- "replaceNA": A numeric vector of the values of the data points with outliers replaced with NA.

Details

The function calculates the MAD, identifies outliers based on the threshold, and can optionally remove a specified number of the worst outliers.
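The steps described above can be sketched roughly as follows. This is a minimal illustration, not the actual source of 'outliers_mad': the name 'mad_outlier_sketch', its 'max_removal' argument with a default of Inf, and the rule "flag a point when its absolute deviation from the median exceeds threshold times the scaled MAD" are all assumptions made for the example.

```r
# Hypothetical sketch of MAD-based outlier detection; not the package's code.
mad_outlier_sketch <- function(x, threshold = 2.5, max_removal = Inf) {
  center <- stats::median(x, na.rm = TRUE)
  # stats::mad() scales by 1.4826 by default so that it estimates the
  # standard deviation for normally distributed data.
  spread <- stats::mad(x, na.rm = TRUE)
  # Robust z-score: distance from the median in MAD units.
  # (If spread is 0, the scores are Inf or NaN; a real implementation
  # would need to handle that case explicitly.)
  score <- abs(x - center) / spread
  flagged <- which(score > threshold)
  # Rank flagged points from most to least extreme, then cap the count.
  flagged <- flagged[order(score[flagged], decreasing = TRUE)]
  n_keep <- min(length(flagged), max_removal)
  sort(flagged[seq_len(n_keep)])
}

x <- c(10, 12, 13, 15, 9, 11, 100, 14, 16, 8, 7)
mad_outlier_sketch(x, threshold = 2)                   # index of the value 100
mad_outlier_sketch(x, threshold = 1, max_removal = 1)  # only the worst point
```

Ranking by score before capping is one way to interpret "the worst outliers"; the real function may order or cap them differently.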

See also

'mad' function from the 'stats' package for calculating the MAD.

Examples

# Example usage
your_data <- c(10, 12, 13, 15, 9, 11, 100, 14, 16, 8, 7)
outlier_indices <- outliers_mad(your_data, threshold = 2, maxRemoval = 2, returnVal = "indices")
cat("Outlier indices: ", outlier_indices, "\n")
#> Outlier indices:  7 
cleaned_data <- outliers_mad(your_data, threshold = 2, maxRemoval = 2, returnVal = "keeps")
cat("Data after removing outliers: ", cleaned_data, "\n")
#> Data after removing outliers:  10 12 13 15 9 11 14 16 8 7
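To see why only index 7 is reported, the arithmetic can be checked by hand with base R. This assumes the cutoff is 'threshold' times the MAD as returned by 'stats::mad' with its default scaling; the internals of 'outliers_mad' may differ.

```r
x <- c(10, 12, 13, 15, 9, 11, 100, 14, 16, 8, 7)

med <- median(x)           # 12
mad_scaled <- mad(x)       # 1.4826 * median(abs(x - 12)) = 1.4826 * 3
cutoff <- 2 * mad_scaled   # about 8.9

# Absolute deviations from the median: only the value 100 (deviation 88)
# exceeds the cutoff; the next largest deviation is 5.
which(abs(x - med) > cutoff)
#> [1] 7
```

With the unscaled MAD (3) the cutoff would be 6, which still flags only the value 100, so the result for this dataset does not depend on the scaling constant.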