This function loads multiple NULISAseq data files, processes them, and optionally merges them into a single dataset. It provides extensive control over quality control parameters and data processing options.
importNULISAseq(
  files,
  plateName = NULL,
  return_type = c("all", "run", "merged"),
  IC = NULL,
  IPC = NULL,
  SC = NULL,
  NC = NULL,
  Bridge = NULL,
  Calibrator = NULL,
  sample_group_covar = "SAMPLE_MATRIX",
  excludeSamples = NULL,
  excludeTargets = NULL,
  include_qc = TRUE,
  verbose = TRUE,
  ...
)
files: Character vector of paths and file names of the NULISAseq data XML files.
plateName: Optional character vector of names for each file/run/plate. If `NULL`, the AUTO_PLATE variable within the XML files is used, or default names (Plate_01, Plate_02, etc.) are generated if AUTO_PLATE names are duplicated or missing. Length must match `files`.
return_type: Character specifying output format: "all" (list with both individual runs and merged data), "run" (individual runs only), or "merged" (merged data only) (Default: "all").
IC: Default is `NULL`. Optional character string giving the internal control target name that will override the pre-defined target type for all plates.
IPC: Default is `NULL`. Override default Inter-plate Control sample specification. Optional vector (applied to all plates) or named list with `plateName` names and vectors of character string(s) that match the IPC sample names or substrings of sample names. These will override the pre-defined sample type and use these samples for inter-plate control normalization (e.g., `list("Plate_01" = c("Plate01_IPC_sampleName_1", "Plate01_IPC_sampleName_2", "Plate01_IPC_sampleName_3"), "Plate_02" = c("Plate_02_IPC_sampleName_1", "Plate_02_IPC_sampleName_2", "Plate_02_IPC_sampleName_3"))`).
SC: Default is `NULL`. Override default Sample Control sample specification. Same format as `IPC`.
NC: Default is `NULL`. Override default Negative Control sample specification. Same format as `IPC`.
Bridge: Default is `NULL`. Override default Bridge control sample specification. Same format as `IPC`.
Calibrator: Default is `NULL`. Override default calibrator sample specification. Same format as `IPC`.
sample_group_covar: Sample group covariate. Optional column name in the samples data matrix (from Barcode B file) for each plate that represents subgroups for which detectability and quantifiability (if AQ available) will be calculated separately, in addition to overall detectability and overall quantifiability. Default is `"SAMPLE_MATRIX"`. The function first checks that the variable is present in the column names of the samples matrix. Set to `NULL` to disable this feature.
excludeSamples: Samples to exclude from analysis. Can be: `NULL` (no exclusions), a vector of sample names (applied to all plates), or a named list (e.g., `list("Plate_01" = c("Sample1", "Sample2"), "Plate_02" = c("Sample3", "Sample4"))`).
excludeTargets: Targets to exclude from analysis. Can be: `NULL` (no exclusions), a vector of target names (applied to all plates), or a named list (e.g., `list("Plate_01" = c("Target1"), "Plate_02" = c("Target2", "Target3"))`).
include_qc: Logical indicating whether to include QC (Quality Control) columns in the long format output (`Data_NPQ_long` and `Data_AQ_long`). When `TRUE` (default), Sample QC and Target QC metrics are added to the long format data. Target QC columns are automatically filtered based on the data mode: RQ data (`Data_NPQ_long`) excludes AQ-specific metrics like concentration accuracy and CV calculated on AQ values, while AQ data (`Data_AQ_long`) includes all Target QC metrics. Set to `FALSE` to exclude all QC information and reduce output size.
verbose: Logical indicating whether to display progress messages (Default: TRUE).
...: Additional arguments passed to loadNULISAseq.
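The following is a minimal sketch of how the arguments above might be combined, not a definitive recipe. The file names (`plate1.xml`, `plate2.xml`) and the sample names are hypothetical placeholders, and the call is guarded so the snippet still runs when `importNULISAseq` or the files are not available.

```r
# Hypothetical input files and plate names (placeholders, not real data).
files <- c("plate1.xml", "plate2.xml")
plate_names <- c("Plate_01", "Plate_02")

# Per-plate overrides are named lists keyed by plate name;
# a plain vector would instead be applied to every plate.
ipc_override <- list(
  Plate_01 = c("IPC_1", "IPC_2", "IPC_3"),
  Plate_02 = c("IPC_1", "IPC_2", "IPC_3")
)
exclude_samples <- list(Plate_01 = c("Sample1", "Sample2"))

# plateName must supply one name per input file.
stopifnot(length(plate_names) == length(files))

# Only call the importer if it is loaded and the placeholder files exist.
if (exists("importNULISAseq") && all(file.exists(files))) {
  res <- importNULISAseq(
    files          = files,
    plateName      = plate_names,
    return_type    = "merged",
    IPC            = ipc_override,
    excludeSamples = exclude_samples,
    include_qc     = TRUE,
    verbose        = FALSE
  )
}
```

Using a named list for `IPC` and `excludeSamples` keeps plate-specific choices explicit; passing a plain character vector is the shorter form when the same names apply to all plates.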
Depending on `return_type`:
"all": List with two components:
runs: List of individual runs, each containing an object with the following structure:
plateID: Character vector of plate identifier
ExecutionDetails: Run metadata including command line, execution time, instrument details, lot information
RunSummary: Data frame of read statistics for the run (total reads, parseable reads, matches, etc.)
targets: Data frame of target metadata (target names, UniProt IDs, QC flags, etc.)
samples: Data frame of sample metadata (sample names, matrix type, sample plate well positions, etc.)
Data_raw: Raw count data (matrix, targets × samples)
attributes: List of quality attributes including QCS and SN matrices
IC, IPC, SC, NC, Bridge: Character vector of control target or sample names
IC_normed: Internal control normalized data (matrix, targets × samples)
normed_untransformedReverse, normed: Normalized data in multiple formats (matrix, targets × samples), before (`normed_untransformedReverse`) and after (`normed`) reverse-curve transformation is applied to any relevant targets
AQ: Absolute quantification data if available, including:
Data_AQ_aM: Absolute quantification in aM units (matrix, targets × samples)
Data_AQ: Absolute quantification in pg/mL units (matrix, targets × samples)
targetAQ_param: Data frame of curve parameters per target
withinDR: Logical matrix indicating targets within dynamic range (targets × samples)
NPQ: NULISA Protein Quantification (NPQ) data (log2 scale) (matrix, targets × samples)
lod: Limit of detection information including:
LOD: LOD values per target, on the unlogged normalized count scale
aboveLOD: Logical matrix indicating detection above LOD (targets × samples)
blank_outlier_table: NC outlier analysis results
LODNPQ: LOD in NPQ units
LOD_pgmL, LOD_aM: LOD in different units if AQ data available
detectability: List of group-specific and overall detectability
qcTarget, qcSample, qcPlate: Quality control flags for targets, samples, and plates
quantifiability: List of quantifiability metrics by sample group and overall, if AQ data available
qcSamplebyTarget: Sample-by-target QC matrices (read threshold, LOD, dynamic range)
AbsAssay, advancedQC: Assay type and advanced target QC flags
merged: Merged dataset from all runs containing:
plateID: Character vector of plate identifiers
fileNames: Character vector of input file names
covariateNames: Character vector of sample covariate names
ExecutionDetails: Per-plate metadata (command line, run time, instrument, assay, lot info)
RunSummary: Data frame of read statistics (total reads, parseable, matches, etc.)
IC: Identifier of internal control
targets: Data frame of target metadata including UniProt IDs, QC flags, LOD/ULOQ/LLOQ, detectability
samples: Data frame of sample metadata including names, well positions, sample matrices, other covariates
qcSample: Data frame of sample-level QC flags (e.g., IC_Median) with values and status
qcTarget: Data frame of target-level QC flags (e.g., concentration accuracy, thresholds)
qcPlate: Data frame of plate-level QC flags (e.g., SC CV, WARN targets, thresholds)
aqParams: Data frame of curve parameters and quantification metrics (LLOQ, ULOQ, LOD), if AQ data available
inconsistent_targets: Placeholder for targets inconsistent across plates/runs (NULL if none)
detectability: Data frame of detectability by target and sample matrix
quantifiability: Data frame of quantifiability by target and sample matrix
Data_raw, Data_rawlog2: Raw counts and log2-transformed counts (matrix, targets × samples)
Data_IC, Data_IClog2: Internal control–normalized data (linear and log2)
Data_Reverse, Data_Reverselog2: Reverse-transformed IC-IPC normalized data (linear and log2)
Data_AQ_aM, Data_AQlog2_aM: Absolute quantitation in attomolar units (linear and log2), if AQ data available
Data_AQ_pgmL, Data_AQlog2_pgmL: Absolute quantitation in pg/mL units (linear and log2), if AQ data available
aboveLOD: Logical matrix indicating values above LOD (targets × samples)
unit: Character string of concentration units (e.g., "pg/mL")
Data_NPQ: Matrix of NULISA Protein Quantification (NPQ) values (log2)
Data_NPQ_long: Data frame (long format) of NPQ data with sample and target annotations
Data_AQ_long: Data frame (long format) of absolute quantification data with concentrations and LOD/LLOQ/ULOQ, if AQ data available
"run": List of individual runs (as described in the 'runs' component above)
"merged": Merged dataset from all runs (as described in the 'merged' component above)
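To illustrate how the returned structure might be navigated, the sketch below builds a small toy stand-in for the `return_type = "all"` value. Only the `runs` and `merged` component names and the `Data_NPQ`, `aboveLOD`, and `plateID` fields come from the description above; the numbers, target names, sample names, and the assumption that `runs` is named by plate are made up for illustration.

```r
# Toy stand-in for the value returned with return_type = "all".
# Only a few documented components are mocked; all values are invented.
targets <- c("TargetA", "TargetB")
samples <- c("S1", "S2", "S3")
npq <- matrix(c(10.2, 3.1, 11.0, 2.9, 9.8, 3.3), nrow = 2,
              dimnames = list(targets, samples))
above <- matrix(c(TRUE, FALSE, TRUE, FALSE, TRUE, TRUE), nrow = 2,
                dimnames = list(targets, samples))
res <- list(
  runs   = list(Plate_01 = list(plateID = "Plate_01")),  # naming by plate is an assumption
  merged = list(Data_NPQ = npq, aboveLOD = above)
)

# Mask NPQ values that are not above the LOD (targets x samples).
npq_detected <- res$merged$Data_NPQ
npq_detected[!res$merged$aboveLOD] <- NA

# Fraction of samples above LOD for each target. This is an illustrative
# calculation only, not the package's own detectability computation.
detect_frac <- rowMeans(res$merged$aboveLOD)
```

With `return_type = "merged"` the same fields would be accessed directly on the returned object (for example `res$Data_NPQ`) rather than through `res$merged`.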
loadNULISAseq for loading individual NULISAseq runs