Generate Heatmap for NULISAseq Data with ComplexHeatmap

Draws a heatmap of NULISAseq targets across samples, annotated with sample and target metadata. Plots are generated with ComplexHeatmap. Annotation colors for covariates are generated automatically from RColorBrewer palettes or can be user-specified. Supports standard and transposed orientations, row and column annotations, and several clustering options.

generate_heatmap(
  data,
  sampleInfo,
  targetInfo = NULL,
  sampleName_var,
  targetName_var = "Target",
  sample_subset = NULL,
  target_subset = NULL,
  annotate_sample_by = NULL,
  annotate_target_by = NULL,
  column_split_by = NULL,
  row_split_by = NULL,
  row_fontsize = 4,
  col_fontsize = 4,
  column_split_num = NULL,
  name = "Z-Score",
  row_title_rot = 0,
  clustering_method_columns = "ward.D2",
  row_split = NULL,
  cluster_rows = TRUE,
  cluster_column_slices = FALSE,
  cluster_row_slices = TRUE,
  transpose = FALSE,
  numeric_color_palette = c("blue", "red", "green", "purple", "orange", "brown"),
  sample_colors = NULL,
  output_dir = NULL,
  plot_name = NULL,
  plot_title = NULL,
  plot_width = 14,
  plot_height = 7,
  ...
)

Arguments

data: A matrix with targets in rows and samples in columns. Row names should be the target names and column names the sample names. The values are assumed to be NPQ, i.e. already transformed as log2(x + 1) for each NULISAseq normalized count x.
sampleInfo: A data frame with sample metadata. Rows are samples, columns are sample metadata variables. If sample_subset is given, only those samples are shown in the heatmap.
targetInfo: A data frame with target metadata. Rows are targets, columns are target metadata variables. Required if annotate_target_by or row_split_by are specified; defaults to NULL.
sampleName_var: Character string specifying the name of the column in sampleInfo that matches the column names of data. This variable will be used for subsetting samples from sampleInfo.
targetName_var: Character string specifying the name of the column in targetInfo that matches the row names of data; defaults to "Target".
sample_subset: Vector of sample names for selected samples to show in the heatmap, should match the existing column names of data; defaults to NULL (all samples).
target_subset: Vector of target names for selected targets to show in the heatmap, should match the existing row names of data; defaults to NULL (all targets).
annotate_sample_by: Character vector of column names from sampleInfo that will be used for sample annotations (shown as colored bars); defaults to NULL.
annotate_target_by: Character vector of column names from targetInfo that will be used for target annotations (shown as colored bars on the left or top depending on orientation); defaults to NULL.
column_split_by: Character string specifying the name of the column from sampleInfo that will be used for supervised clustering of columns (samples). This creates separate column slices in the heatmap; defaults to NULL.
row_split_by: Character string specifying the name of the column from targetInfo that will be used for supervised clustering of rows (targets). This creates separate row slices in the heatmap; defaults to NULL.
row_fontsize: Numeric value for the text size of the row labels in heatmap; defaults to 4.
col_fontsize: Numeric value for the text size of the column labels in heatmap; defaults to 4.
column_split_num: Integer specifying the number of slices that the columns are split into via unsupervised clustering; defaults to NULL. Ignored if column_split_by is specified.
name: Character string used as the title of the heatmap legend; defaults to "Z-Score".
row_title_rot: Numeric value for rotation of row titles in degrees. Only 0, 90, 270 are allowed; defaults to 0.
clustering_method_columns: Character string specifying the hierarchical clustering method used for columns, passed to hclust; defaults to "ward.D2".
row_split: Integer specifying the number of slices that the rows are split into via unsupervised clustering; defaults to NULL. Ignored if row_split_by is specified. When NULL and targetInfo is not provided, no row splitting is performed.
cluster_rows: Logical indicating whether to cluster rows (targets); defaults to TRUE. If NULL, clustering is enabled when targetInfo is provided and disabled when targetInfo is NULL.
cluster_column_slices: Logical indicating whether to perform clustering on column slices if columns are split; defaults to FALSE.
cluster_row_slices: Logical indicating whether to perform clustering on row slices if rows are split; defaults to TRUE.
transpose: Logical indicating whether to transpose the heatmap (samples in rows, targets in columns); defaults to FALSE.
numeric_color_palette: Character vector of colors to use for numeric annotations. Each color will be used with a white-to-color gradient for continuous variables; defaults to c("blue", "red", "green", "purple", "orange", "brown").
sample_colors: Named list of custom colors for categorical sample annotations. List names should match column names in annotate_sample_by. Each element should be a named vector where names are category levels and values are color codes; defaults to NULL.
output_dir: Character string specifying the directory path to save the plot. If NULL, the plot is not saved; defaults to NULL.
plot_name: Character string specifying the filename for the saved plot, including file extension (.pdf, .png, .jpg, or .svg). Required if output_dir is specified; defaults to NULL.
plot_title: Character string for the title of the heatmap; defaults to NULL.
plot_width: Numeric value for the width of the saved plot in inches; defaults to 14.
plot_height: Numeric value for the height of the saved plot in inches; defaults to 7.
...: Additional arguments passed to ComplexHeatmap::Heatmap function.
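
The white-to-color gradient described for numeric_color_palette can be pictured with a short base R sketch. This only illustrates the idea: the variable names Age and BMI are hypothetical, the assumption that palette colors are assigned to numeric annotations in order is ours, and generate_heatmap may build its gradients differently.

```r
# Default palette from the usage section above.
numeric_color_palette <- c("blue", "red", "green", "purple", "orange", "brown")

# Hypothetical numeric annotation variables (not part of the package).
numeric_vars <- c("Age", "BMI")

# Assumption: the i-th numeric variable gets the i-th palette color,
# ramped from white to that color in 5 steps.
gradients <- lapply(seq_along(numeric_vars), function(i) {
  grDevices::colorRampPalette(c("white", numeric_color_palette[i]))(5)
})
names(gradients) <- numeric_vars

print(gradients)
```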

Value

A list containing:

targets_used: Character vector of target names used in the heatmap after filtering.
heatmap: The ComplexHeatmap object.

Details

The function performs the following steps:

1. Filters data to the specified samples and targets.
2. Removes targets with all zero values.
3. Scales data by row (Z-score transformation).
4. Removes rows with NA, NaN, or Inf values after scaling.
5. Optionally transposes the matrix.
6. Generates or uses custom colors for annotations.
7. Creates the heatmap with the specified annotations and clustering.
8. Optionally saves the plot to a file.
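
The data preparation steps listed above (filtering, zero removal, row scaling, removal of non-finite rows, optional transpose) can be approximated in base R. This is a minimal sketch on a hypothetical toy matrix, not the function's actual code; it is meant to show why some targets can be missing from targets_used.

```r
# Toy matrix: targets in rows, samples in columns (hypothetical values).
set.seed(1)
mat <- matrix(rnorm(20, mean = 10), nrow = 4,
              dimnames = list(paste0("T", 1:4), paste0("S", 1:5)))
mat["T2", ] <- 0   # all-zero target, dropped by the zero filter
mat["T3", ] <- 7   # constant target, scales to NaN and is dropped afterwards

sample_subset <- c("S1", "S2", "S3", "S4")
target_subset <- NULL  # NULL keeps all targets

# Filter to the requested samples and targets.
if (!is.null(sample_subset)) {
  mat <- mat[, colnames(mat) %in% sample_subset, drop = FALSE]
}
if (!is.null(target_subset)) {
  mat <- mat[rownames(mat) %in% target_subset, , drop = FALSE]
}

# Drop targets whose values are all zero.
mat <- mat[rowSums(mat != 0, na.rm = TRUE) > 0, , drop = FALSE]

# Row-wise Z-score: scale() works on columns, so transpose around it.
z <- t(scale(t(mat)))

# Drop rows containing NA, NaN, or Inf after scaling.
z <- z[apply(z, 1, function(r) all(is.finite(r))), , drop = FALSE]

# Record the surviving targets before any transpose.
targets_used <- rownames(z)

# Optional transpose (samples in rows, targets in columns).
transpose <- FALSE
if (transpose) z <- t(z)

print(targets_used)
```

A constant target has zero standard deviation, so its Z-scores are NaN and the row is removed even though its values are not zero.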

Custom Colors

To specify custom colors for categorical sample annotations, use the sample_colors parameter:


my_colors <- list(
  Group = c("Control" = "#FF0000", "Treatment" = "#0000FF"),
  Batch = c("Batch1" = "#FFA500", "Batch2" = "#800080")
)
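
Before passing such a list as sample_colors, it can help to confirm that every category level in the metadata has a color. The check below is a sketch in base R with a hypothetical sample_metadata data frame; how generate_heatmap itself treats levels without an assigned color is not stated here, so this is only a precaution.

```r
# Hypothetical sample metadata; column names and levels are illustrative only.
sample_metadata <- data.frame(
  SampleName = paste0("S", 1:4),
  Group = c("Control", "Treatment", "Control", "Treatment"),
  Batch = c("Batch1", "Batch1", "Batch2", "Batch3")
)

my_colors <- list(
  Group = c("Control" = "#FF0000", "Treatment" = "#0000FF"),
  Batch = c("Batch1" = "#FFA500", "Batch2" = "#800080")
)

# For each annotated variable, collect the levels that have no color assigned.
missing_levels <- lapply(names(my_colors), function(v) {
  setdiff(unique(as.character(sample_metadata[[v]])), names(my_colors[[v]]))
})
names(missing_levels) <- names(my_colors)

print(missing_levels)
```

In this toy data, Batch3 appears in the metadata but has no entry in my_colors, so it is reported under Batch.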

Examples

if (FALSE) { # \dontrun{
# Basic heatmap with sample annotations
result <- generate_heatmap(
  data = Data_NPQ,
  sampleInfo = sample_metadata,
  sampleName_var = "SampleName",
  annotate_sample_by = c("Group", "Batch")
)

# Heatmap with rows clustered and split into 3 slices
result <- generate_heatmap(
  data = Data_NPQ,
  sampleInfo = sample_metadata,
  sampleName_var = "SampleName",
  annotate_sample_by = c("Group", "Batch"),
  cluster_rows = TRUE,
  row_split = 3
)

# Heatmap with target annotations and custom colors
custom_colors <- list(
  Group = c("Control" = "blue", "Treatment" = "red")
)

result <- generate_heatmap(
  data = Data_NPQ,
  sampleInfo = sample_metadata,
  targetInfo = target_metadata,
  sampleName_var = "SampleName",
  targetName_var = "Target",
  annotate_sample_by = c("Group", "Batch"),
  annotate_target_by = c("Pathway", "Domain"),
  sample_colors = custom_colors,
  row_split_by = "Pathway",
  column_split_by = "Group"
)

# Save heatmap to file (supports PDF, PNG, JPG, SVG)
result <- generate_heatmap(
  data = expr_matrix,
  sampleInfo = sample_metadata,
  sampleName_var = "Sample_ID",
  annotate_sample_by = c("Group"),
  output_dir = "output/figures",
  plot_name = "expression_heatmap.svg",
  plot_title = "Gene Expression Heatmap",
  plot_width = 12,
  plot_height = 8
)
} # }