Draws a heatmap for a set of samples and targets, optionally annotated with sample and target metadata. Plots are generated with ComplexHeatmap. Covariate colors are generated automatically from RColorBrewer palettes or can be user-specified. Supports both standard and transposed orientations, row and column annotations, and various clustering options.

generate_heatmap(
  data,
  sampleInfo,
  targetInfo = NULL,
  sampleName_var,
  targetName_var = "Target",
  sample_subset = NULL,
  target_subset = NULL,
  annotate_sample_by = NULL,
  annotate_target_by = NULL,
  column_split_by = NULL,
  row_split_by = NULL,
  row_fontsize = 4,
  col_fontsize = 4,
  column_split_num = NULL,
  name = "Z-Score",
  row_title_rot = 0,
  clustering_method_columns = "ward.D2",
  row_split = NULL,
  cluster_rows = TRUE,
  cluster_column_slices = FALSE,
  cluster_row_slices = TRUE,
  transpose = FALSE,
  numeric_color_palette = c("blue", "red", "green", "purple", "orange", "brown"),
  sample_colors = NULL,
  output_dir = NULL,
  plot_name = NULL,
  plot_title = NULL,
  plot_width = 14,
  plot_height = 7,
  ...
)

Arguments

data

A matrix with targets in rows and samples in columns. Row names should be the target names and column names should be the sample names. The values are assumed to be NPQ, i.e. already transformed as log2(x + 1) for each NULISAseq normalized count value x.

sampleInfo

A data frame with sample metadata. Rows are samples, columns are sample metadata variables. If sample_subset is specified, only those samples from sampleInfo are shown in the heatmap.

targetInfo

A data frame with target metadata. Rows are targets, columns are target metadata variables. Required if annotate_target_by or row_split_by are specified; defaults to NULL.

sampleName_var

Character string specifying the name of the column in sampleInfo that matches the column names of data. This variable will be used for subsetting samples from sampleInfo.

targetName_var

Character string specifying the name of the column in targetInfo that matches the row names of data; defaults to "Target".

sample_subset

Vector of sample names for selected samples to show in the heatmap, should match the existing column names of data; defaults to NULL (all samples).

target_subset

Vector of target names for selected targets to show in the heatmap, should match the existing row names of data; defaults to NULL (all targets).

annotate_sample_by

Character vector of column names from sampleInfo that will be used for sample annotations (shown as colored bars); defaults to NULL.

annotate_target_by

Character vector of column names from targetInfo that will be used for target annotations (shown as colored bars on the left or top depending on orientation); defaults to NULL.

column_split_by

Character string specifying the name of the column from sampleInfo that will be used for supervised clustering of columns (samples). This creates separate column slices in the heatmap; defaults to NULL.

row_split_by

Character string specifying the name of the column from targetInfo that will be used for supervised clustering of rows (targets). This creates separate row slices in the heatmap; defaults to NULL.

row_fontsize

Numeric value for the text size of the row labels in heatmap; defaults to 4.

col_fontsize

Numeric value for the text size of the column labels in heatmap; defaults to 4.

column_split_num

Integer specifying the number of slices that the columns are split into via unsupervised clustering; defaults to NULL. Ignored if column_split_by is specified.

name

Character string used as the title of the heatmap legend; defaults to "Z-Score".

row_title_rot

Numeric value for the rotation of row titles in degrees. Only 0, 90, and 270 are allowed; defaults to 0.

clustering_method_columns

Character string specifying the method used for hierarchical clustering of columns, passed to hclust; defaults to "ward.D2".

row_split

Integer specifying the number of slices that the rows are split into via unsupervised clustering; defaults to NULL. Ignored if row_split_by is specified. When NULL and targetInfo is not provided, no row splitting is performed.

cluster_rows

Logical indicating whether to cluster rows (targets); the usage above shows a default of TRUE. If NULL is passed, clustering is enabled when targetInfo is provided and disabled when targetInfo is NULL.

cluster_column_slices

Logical indicating whether to perform clustering on column slices if columns are split; defaults to FALSE.

cluster_row_slices

Logical indicating whether to perform clustering on row slices if rows are split; defaults to TRUE.

transpose

Logical indicating whether to transpose the heatmap (samples in rows, targets in columns); defaults to FALSE.

numeric_color_palette

Character vector of colors to use for numeric annotations. Each color will be used with a white-to-color gradient for continuous variables; defaults to c("blue", "red", "green", "purple", "orange", "brown").

sample_colors

Named list of custom colors for categorical sample annotations. List names should match column names in annotate_sample_by. Each element should be a named vector where names are category levels and values are color codes; defaults to NULL.

output_dir

Character string specifying the directory path to save the plot. If NULL, the plot is not saved; defaults to NULL.

plot_name

Character string specifying the filename for the saved plot, including file extension (.pdf, .png, .jpg, or .svg). Required if output_dir is specified; defaults to NULL.

plot_title

Character string for the title of the heatmap; defaults to NULL.

plot_width

Numeric value for the width of the saved plot in inches; defaults to 14.

plot_height

Numeric value for the height of the saved plot in inches; defaults to 7.

...

Additional arguments passed to ComplexHeatmap::Heatmap().
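
Because sampleName_var must match the column names of data, it can help to check the inputs before calling the function. The following is a minimal sketch in base R, using hypothetical toy inputs; whether generate_heatmap reorders sampleInfo internally is not stated here, so this only illustrates one way to verify and align the inputs yourself.

```r
# Toy inputs (hypothetical): 3 targets x 4 samples.
data <- matrix(
  c(5, 6, 7, 8,
    2, 2, 3, 3,
    9, 8, 7, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("IL6", "TNF", "CRP"), c("S1", "S2", "S3", "S4"))
)
sampleInfo <- data.frame(
  SampleName = c("S3", "S1", "S4", "S2", "S5"),
  Group = c("Treatment", "Control", "Treatment", "Control", "Control"),
  stringsAsFactors = FALSE
)
sampleName_var <- "SampleName"

# Samples present in the matrix but absent from the metadata (should be none).
missing_meta <- setdiff(colnames(data), sampleInfo[[sampleName_var]])
stopifnot(length(missing_meta) == 0)

# Reorder the metadata rows (dropping extras such as S5) to match the columns of data.
idx <- match(colnames(data), sampleInfo[[sampleName_var]])
sampleInfo_aligned <- sampleInfo[idx, , drop = FALSE]
```

After this step, sampleInfo_aligned[[sampleName_var]] is in the same order as colnames(data).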

Value

A list containing:

targets_used

Character vector of target names used in the heatmap after filtering.

heatmap

The ComplexHeatmap object.

Details

The function performs the following steps:

  1. Filters data to specified samples and targets

  2. Removes targets with all zero values

  3. Scales data by row (Z-score transformation)

  4. Removes rows with NA, NaN, or Inf values after scaling

  5. Optionally transposes the matrix

  6. Generates or uses custom colors for annotations

  7. Creates the heatmap with specified annotations and clustering

  8. Optionally saves to file
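
Steps 1 to 5 can be approximated in base R as follows. This is a sketch under stated assumptions, not the package's actual implementation: the toy matrix, the variable names, and the exact filtering expressions are illustrative only.

```r
# Toy matrix (hypothetical values): targets in rows, samples in columns.
mat <- matrix(
  c(1, 2, 3, 4,    # varies: kept
    0, 0, 0, 0,    # all zeros: removed in step 2
    5, 5, 5, 5,    # constant: sd = 0, so scaling yields NaN; removed in step 4
    4, 1, 3, 2),   # varies: kept
  nrow = 4, byrow = TRUE,
  dimnames = list(c("T1", "T2", "T3", "T4"), c("S1", "S2", "S3", "S4"))
)
sample_subset <- c("S1", "S2", "S3", "S4")
target_subset <- NULL  # NULL means keep all targets

# Step 1: filter to the requested samples and targets.
if (!is.null(sample_subset)) mat <- mat[, sample_subset, drop = FALSE]
if (!is.null(target_subset)) mat <- mat[target_subset, , drop = FALSE]

# Step 2: drop targets whose values are all zero.
mat <- mat[rowSums(mat != 0, na.rm = TRUE) > 0, , drop = FALSE]

# Step 3: row-wise Z-score. scale() operates on columns, so transpose twice.
z <- t(scale(t(mat)))

# Step 4: drop rows containing NA, NaN, or Inf after scaling.
z <- z[apply(z, 1, function(r) all(is.finite(r))), , drop = FALSE]

# Step 5: optional transpose (samples in rows, targets in columns).
transpose <- FALSE
if (transpose) z <- t(z)

# Targets that survived filtering (taken here before any transpose is applied).
targets_used <- rownames(z)
```

The constant row T3 shows why step 4 is needed: a target with zero variance across the selected samples cannot be Z-scored and would otherwise break clustering.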

Custom Colors

To specify custom colors for categorical sample annotations, use the sample_colors parameter:


my_colors <- list(
  Group = c("Control" = "#FF0000", "Treatment" = "#0000FF"),
  Batch = c("Batch1" = "#FFA500", "Batch2" = "#800080")
)
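
When there are many annotation columns or levels, a list in this shape can also be built programmatically. The sketch below assumes a hypothetical sample_metadata data frame and uses grDevices::hcl.colors() from base R as a stand-in palette; it is not the RColorBrewer-based logic the function uses for automatic colors.

```r
# Hypothetical sample metadata.
sample_metadata <- data.frame(
  SampleName = c("S1", "S2", "S3", "S4"),
  Group = c("Control", "Treatment", "Control", "Treatment"),
  Batch = c("Batch1", "Batch1", "Batch2", "Batch2"),
  stringsAsFactors = FALSE
)
annotate_sample_by <- c("Group", "Batch")

# One named color vector per annotation column; names are the category levels.
sample_colors <- lapply(annotate_sample_by, function(v) {
  lv <- sort(unique(as.character(sample_metadata[[v]])))
  stats::setNames(grDevices::hcl.colors(length(lv), palette = "Dark 3"), lv)
})
names(sample_colors) <- annotate_sample_by

# Sanity check: every observed level has a color assigned.
covered <- vapply(annotate_sample_by, function(v) {
  all(unique(sample_metadata[[v]]) %in% names(sample_colors[[v]]))
}, logical(1))
stopifnot(all(covered))
```

The resulting sample_colors list has the same structure as my_colors above and could be passed as the sample_colors argument.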

Examples

if (FALSE) { # \dontrun{
# Basic heatmap with sample annotations
result <- generate_heatmap(
  data = Data_NPQ,
  sampleInfo = sample_metadata,
  sampleName_var = "SampleName",
  annotate_sample_by = c("Group", "Batch")
)

# Heatmap with row clustering enabled
result <- generate_heatmap(
  data = Data_NPQ,
  sampleInfo = sample_metadata,
  sampleName_var = "SampleName",
  annotate_sample_by = c("Group", "Batch"),
  cluster_rows = TRUE,
  row_split = 3
)

# Heatmap with target annotations and custom colors
custom_colors <- list(
  Group = c("Control" = "blue", "Treatment" = "red")
)

result <- generate_heatmap(
  data = Data_NPQ,
  sampleInfo = sample_metadata,
  targetInfo = target_metadata,
  sampleName_var = "SampleName",
  targetName_var = "Target",
  annotate_sample_by = c("Group", "Batch"),
  annotate_target_by = c("Pathway", "Domain"),
  sample_colors = custom_colors,
  row_split_by = "Pathway",
  column_split_by = "Group"
)

# Save heatmap to file (supports PDF, PNG, JPG, SVG)
result <- generate_heatmap(
  data = expr_matrix,
  sampleInfo = sample_metadata,
  sampleName_var = "Sample_ID",
  annotate_sample_by = c("Group"),
  output_dir = "output/figures",
  plot_name = "expression_heatmap.svg",
  plot_title = "Gene Expression Heatmap",
  plot_width = 12,
  plot_height = 8
)
} # }