Import and Process Multiple NULISAseq Runs

This function loads multiple NULISAseq data files, processes each run, and optionally merges the runs into a single dataset. Its arguments control quality control parameters and data processing options.

importNULISAseq(
  files,
  plateName = NULL,
  return_type = c("all", "run", "merged"),
  IC = NULL,
  IPC = NULL,
  SC = NULL,
  NC = NULL,
  Bridge = NULL,
  Calibrator = NULL,
  sample_group_covar = "SAMPLE_MATRIX",
  excludeSamples = NULL,
  excludeTargets = NULL,
  include_qc = TRUE,
  verbose = TRUE,
  ...
)

Arguments

files: Character vector of paths to the NULISAseq data XML files.
plateName: Optional character vector of names for each file/run/plate. Length must match `files`. If NULL, the AUTO_PLATE variable within the XML files is used, or default names (Plate_01, Plate_02, etc.) are generated if AUTO_PLATE names are duplicated or missing.
return_type: Character specifying output format: "all" (list with both individual runs and merged data), "run" (individual runs only), or "merged" (merged data only) (Default: "all").
IC: Default is `NULL`. Optional character string giving internal control target name that will override pre-defined target type for all plates.
IPC: Default is `NULL`. Override default Inter-plate Control sample specification. Optional vector (applied to all plates) or named list with `plateName` names and vectors of character string(s) that match the IPC sample names or substrings of sample names. These will override pre-defined sample type and use these samples for inter-plate control normalization. (e.g., `list("Plate_01" = c("Plate01_IPC_sampleName_1", "Plate01_IPC_sampleName_2", "Plate01_IPC_sampleName_3"), "Plate_02" = c("Plate_02_IPC_sampleName_1", "Plate_02_IPC_sampleName_2", "Plate_02_IPC_sampleName_3"))`).
SC: Default is `NULL`. Override default Sample Control sample specification. Same format as `IPC`.
NC: Default is `NULL`. Override default Negative Control sample specification. Same format as `IPC`.
Bridge: Default is `NULL`. Override default Bridge control sample specification. Same format as `IPC`.
Calibrator: Default is `NULL`. Override default calibrator sample specification. Same format as `IPC`.
sample_group_covar: Sample group covariate. Optional column name in the samples data matrix (from Barcode B file) for each plate that represents subgroups for which detectability and quantifiability (if AQ available) will be calculated separately in addition to overall detectability and overall quantifiability. Default is `SAMPLE_MATRIX`. Function will first check to be sure that the variable is present in the column names of the samples matrix. Can be set to NULL to not use this feature.
excludeSamples: Samples to exclude from analysis. Can be: NULL (no exclusions), a vector of sample names (applied to all plates), or a named list (e.g., `list("Plate_01" = c("Sample1", "Sample2"), "Plate_02" = c("Sample3", "Sample4"))`).
excludeTargets: Targets to exclude from analysis. Can be: NULL (no exclusions), a vector of target names (applied to all plates), or a named list (e.g., `list("Plate_01" = c("Target1"), "Plate_02" = c("Target2", "Target3"))`).
include_qc: Logical indicating whether to include QC (Quality Control) columns in the long format output (`Data_NPQ_long` and `Data_AQ_long`). When `TRUE` (default), Sample QC and Target QC metrics are added to the long format data. Target QC columns are automatically filtered based on the data mode: RQ data (`Data_NPQ_long`) excludes AQ-specific metrics like concentration accuracy and CV calculated on AQ values, while AQ data (`Data_AQ_long`) includes all Target QC metrics. Set to `FALSE` to exclude all QC information and reduce output size (Default: TRUE).
verbose: Logical indicating whether to display progress messages (Default: TRUE).
...: Additional arguments passed to loadNULISAseq.
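
The per-plate arguments (`IPC`, `SC`, `NC`, `Bridge`, `Calibrator`, `excludeSamples`, `excludeTargets`) accept either a vector applied to all plates or a list named by plate. The following is a minimal sketch of building both forms. The file paths, plate names, and sample and target names are placeholders, and `check_plate_list()` is a hypothetical helper written for this illustration, not part of the package.

```r
# Placeholder file paths and plate names (not real data).
files <- c("run1.xml", "run2.xml")
plate_names <- c("Plate_01", "Plate_02")

# Per-plate IPC override: list names are the plate names.
ipc_override <- list(
  Plate_01 = c("Plate01_IPC_1", "Plate01_IPC_2"),
  Plate_02 = c("Plate02_IPC_1", "Plate02_IPC_2")
)

# A plain vector is applied to every plate.
exclude_targets <- c("Target1")

# Hypothetical helper (not part of the package): returns TRUE when x is
# NULL or a plain vector, or when x is a list named only with known plates.
check_plate_list <- function(x, plate_names) {
  if (is.null(x) || !is.list(x)) return(TRUE)
  !is.null(names(x)) && all(names(x) %in% plate_names)
}

stopifnot(
  length(files) == length(plate_names),
  check_plate_list(ipc_override, plate_names),
  check_plate_list(exclude_targets, plate_names)
)

# The call itself needs the package and real XML files, so it is not run here.
if (FALSE) {
  result <- importNULISAseq(
    files = files,
    plateName = plate_names,
    IPC = ipc_override,
    excludeTargets = exclude_targets,
    return_type = "merged"
  )
}
```

Checking the list names before the call is a convenience for catching typos in plate names early; how the function itself reacts to unknown names is not stated here.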

Value

Depending on `return_type`:

"all"List with two components:
runs: List of individual runs, each containing an object with the following structure:
- plateID: Character vector of plate identifier
- ExecutionDetails: Run metadata including command line, execution time, instrument details, lot information
- RunSummary: Data frame of read statistics for the run (total reads, parseable reads, matches, etc.)
- targets: Data frame of target metadata (target names, UniProt IDs, QC flags, etc.)
- samples: Data frame of sample metadata (sample names, matrix type, sample plate well positions, etc.)
- Data_raw: Raw count data (matrix, targets × samples)
- attributes: List of quality attributes including QCS and SN matrices
- IC, IPC, SC, NC, Bridge: Character vector of control target or sample names
- IC_normed: Internal control normalized data (matrix, targets × samples)
- normed_untransformedReverse, normed: Normalized data in multiple formats (matrix, targets × samples), before (`normed_untransformedReverse`) and after (`normed`) reverse-curve transformation is applied to any relevant targets
- AQ: Absolute quantification data if available, including:
  - Data_AQ_aM: Absolute quantification in aM units (matrix, targets × samples)
  - Data_AQ: Absolute quantification in pg/mL units (matrix, targets × samples)
  - targetAQ_param: Data frame of curve parameters per target
  - withinDR: Logical matrix indicating targets within dynamic range (targets × samples)
- NPQ: NULISA Protein Quantification (NPQ) data (log2 scale) (matrix, targets × samples)
- lod: Limit of detection information including:
  - LOD: LOD values per target, on the unlogged normalized count scale
  - aboveLOD: Logical matrix indicating detection above LOD (targets × samples)
  - blank_outlier_table: NC outlier analysis results
  - LODNPQ: LOD in NPQ units
  - LOD_pgmL, LOD_aM: LOD in different units if AQ data available
- detectability: List of group-specific and overall detectability
- qcTarget, qcSample, qcPlate: Quality control flags for targets, samples, and plates
- quantifiability: List of quantifiability metrics by sample group and overall, if AQ data available
- qcSamplebyTarget: Sample-by-target QC matrices (read threshold, LOD, dynamic range)
- AbsAssay, advancedQC: Assay type and advanced target QC flags
merged: Merged dataset from all runs containing:
- plateID: Character vector of plate identifiers
- fileNames: Character vector of input file names
- covariateNames: Character vector of sample covariate names
- ExecutionDetails: Per-plate metadata (command line, run time, instrument, assay, lot info)
- RunSummary: Data frame of read statistics (total reads, parseable, matches, etc.)
- IC: Identifier of internal control
- targets: Data frame of target metadata including UniProt IDs, QC flags, LOD/ULOQ/LLOQ, detectability
- samples: Data frame of sample metadata including names, well positions, sample matrices, other covariates
- qcSample: Data frame of sample-level QC flags (e.g., IC_Median) with values and status
- qcTarget: Data frame of target-level QC flags (e.g., concentration accuracy, thresholds)
- qcPlate: Data frame of plate-level QC flags (e.g., SC CV, WARN targets, thresholds)
- aqParams: Data frame of curve parameters and quantification metrics (LLOQ, ULOQ, LOD) (If AQ data available)
- inconsistent_targets: Placeholder for targets inconsistent across plates/runs (NULL if none)
- detectability: Data frame of detectability by target and sample matrix
- quantifiability: Data frame of quantifiability by target and sample matrix
- Data_raw: Raw counts (matrix, targets × samples)
- Data_AQ_aM, Data_AQlog2_aM: Absolute quantitation in attomolar units (linear and log2), if AQ data available
- Data_AQ_pgmL, Data_AQlog2_pgmL: Absolute quantitation in pg/mL units (linear and log2), if AQ data available
- aboveLOD: Logical matrix indicating values above LOD (targets × samples)
- unit: Character string of concentration units (e.g., "pg/mL")
- Data_NPQ: Matrix of NULISA Protein Quantification (NPQ) values (log2)
- Data_NPQ_long: Data frame (long format) of NPQ data with sample and target annotations
- Data_AQ_long: Data frame (long format) of absolute quantification data with concentrations and LOD/LLOQ/ULOQ, if AQ data available
"run"List of individual runs (as described in the 'runs' component above)
"merged"Merged dataset from all runs (as described in the 'merged' component above)

Examples

if (FALSE) { # \dontrun{
result <- importNULISAseq(
  files = c("run1.xml", "run2.xml"),
  plateName = c("Experiment1", "Experiment2"),
  excludeSamples = list("Experiment1" = c("SampleA"), "Experiment2" = c("SampleB")),
  return_type = "all"
)
} # }
