This function detects outliers in a numeric dataset using the Median Absolute Deviation (MAD) method.
outliers_mad(data, threshold = 2.5, maxRemoval = 0, returnVal = "indices")
data: A numeric vector or data frame containing the data in which outliers should be detected.
threshold: A numeric value giving the threshold for identifying outliers. Default is 2.5.
maxRemoval: An integer specifying the maximum number of the worst outliers to remove. Default is 0 (no removal).
returnVal: A character string specifying the type of result to return. Options are "indices" (default), "outliers", "keeps", or "replaceNA".
Depending on the 'returnVal' parameter, the function returns:
- "indices": A numeric vector of the indices of the detected outliers.
- "outliers": A numeric vector of the values of the detected outliers.
- "keeps": A numeric vector of the values of the data points that are not identified as outliers.
- "replaceNA": A numeric vector of the data with each outlier replaced by NA.
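The four return modes can be illustrated with a small sketch. The helper name `shape_result_sketch` and its `idx` argument are hypothetical and are not part of the package; the code only assumes that a vector of outlier indices has already been computed, and it may differ from how the function builds its result internally.

```r
# Hypothetical helper (not the package's code): shape a result from
# a numeric vector `x` and a vector `idx` of outlier positions.
shape_result_sketch <- function(x, idx, returnVal = "indices") {
  switch(returnVal,
    indices   = idx,                               # positions of the outliers
    outliers  = x[idx],                            # values of the outliers
    keeps     = if (length(idx) > 0) x[-idx] else x, # values that remain
    replaceNA = replace(x, idx, NA),               # same length, outliers set to NA
    stop("Unknown returnVal: ", returnVal)         # fallback for any other string
  )
}

shape_result_sketch(c(10, 12, 100), 3L, "replaceNA")
#> [1] 10 12 NA
```

The guard in the "keeps" branch matters because `x[-integer(0)]` returns an empty vector in R rather than all of `x`.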
The function calculates the MAD, identifies outliers based on the threshold, and can optionally remove a specified number of the worst outliers.
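The steps above can be sketched as follows. This is a minimal illustration and not the package's implementation: the helper name `mad_outlier_sketch` and its `n_worst` argument are hypothetical, and the rule used (flag a point when its absolute deviation from the median exceeds `threshold` times the MAD) is the common MAD criterion, assumed here rather than confirmed by the documentation.

```r
# Hypothetical helper (not the package's code): return the indices of
# MAD-based outliers, optionally limited to the `n_worst` most extreme.
mad_outlier_sketch <- function(x, threshold = 2.5, n_worst = Inf) {
  center <- stats::median(x, na.rm = TRUE)
  # stats::mad() applies the 1.4826 consistency constant by default
  spread <- stats::mad(x, na.rm = TRUE)
  # Robust score: distance from the median, measured in MAD units
  score <- abs(x - center) / spread
  flagged <- which(score > threshold)
  # Order flagged points from most to least extreme, then cap the count
  flagged <- flagged[order(score[flagged], decreasing = TRUE)]
  flagged <- flagged[seq_len(min(length(flagged), n_worst))]
  sort(flagged)
}

x <- c(10, 12, 13, 15, 9, 11, 100, 14, 16, 8, 7)
mad_outlier_sketch(x, threshold = 2)
#> [1] 7
```

For this vector the median is 12 and the scaled MAD is about 4.45, so with a threshold of 2 only the value 100 (index 7) lies more than roughly 8.9 away from the median.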
See also the 'mad' function from the 'stats' package, which calculates the MAD.
# Example usage
your_data <- c(10, 12, 13, 15, 9, 11, 100, 14, 16, 8, 7)
outlier_indices <- outliers_mad(your_data, threshold = 2, maxRemoval = 2, returnVal = "indices")
cat("Outlier indices: ", outlier_indices, "\n")
#> Outlier indices: 7
kept_values <- outliers_mad(your_data, threshold = 2, maxRemoval = 2, returnVal = "keeps")
cat("Data after removing outliers: ", kept_values, "\n")
#> Data after removing outliers: 10 12 13 15 9 11 14 16 8 7