scvi_criticism.PPC

class scvi_criticism.PPC(adata, models_dict, count_layer_key=None, n_samples=10)

Posterior predictive checks for comparing single-cell generative models.

Parameters:
  • adata (AnnData) – AnnData object with raw counts.

  • models_dict (Dict[str, BaseModelClass]) – Dictionary of models to compare.

  • count_layer_key (Optional[str] (default: None)) – Key in adata.layers to use as raw counts. If None, adata.X is used.

  • n_samples (int (default: 10)) – Number of posterior predictive samples to generate.

Methods table

calibration_error([confidence_intervals])

Calibration error for each observed count.

coefficient_of_variation([dim])

Calculate the coefficient of variation (CV) for each model and the raw counts.

differential_expression(de_groupby[, ...])

Compute differential expression (DE) metrics.

run(metrics_to_run)

Run the metrics.

zero_fraction()

Fraction of zeros in raw counts for a specific gene.

Methods

PPC.calibration_error(confidence_intervals=None)

Calibration error for each observed count.

For a series of credible intervals of the samples, the fraction of observed counts that fall within the credible interval is computed. The calibration error is then the squared difference between the observed fraction and the true interval width.

For this metric, lower is better.

Parameters:

confidence_intervals (Optional[List[float]] (default: None)) – List of confidence intervals to compute the calibration error for, e.g., [0.01, 0.02, 0.98, 0.99].

Return type:

None

Notes

This does not work on sparse data and can cause large memory usage.
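Examples

The calculation described above can be sketched in plain NumPy. This is an illustrative sketch, not the library's implementation: the function name, the dense (n_cells, n_genes, n_samples) array layout, and the outside-in pairing of quantile levels (0.01 with 0.99, 0.02 with 0.98) are all assumptions made for this example.

```python
import numpy as np


def calibration_error_sketch(samples, observed, quantile_levels):
    """Sum of squared gaps between empirical coverage and nominal width.

    samples:  dense array, shape (n_cells, n_genes, n_samples) (assumed layout)
    observed: dense array, shape (n_cells, n_genes)
    quantile_levels: even-length list of levels in [0, 1]; levels are paired
        from the outside in (an assumption of this sketch).
    """
    if len(quantile_levels) % 2 != 0:
        raise ValueError("quantile_levels must contain an even number of values")
    levels = sorted(quantile_levels)
    # bounds[k] holds the levels[k] quantile for every cell/gene entry.
    bounds = np.quantile(samples, levels, axis=2)
    error = 0.0
    for i in range(len(levels) // 2):
        lower, upper = bounds[i], bounds[-(i + 1)]
        nominal_width = levels[-(i + 1)] - levels[i]
        # Fraction of observed counts that fall inside the interval.
        coverage = np.mean((observed >= lower) & (observed <= upper))
        error += (coverage - nominal_width) ** 2
    return float(error)


# Toy data: observed counts drawn from the same distribution as the samples.
rng = np.random.default_rng(0)
toy_samples = rng.poisson(5.0, size=(50, 20, 200))
toy_observed = rng.poisson(5.0, size=(50, 20))
print(calibration_error_sketch(toy_samples, toy_observed, [0.01, 0.02, 0.98, 0.99]))
```

Because the samples are held as a dense three-dimensional array, memory grows with n_cells × n_genes × n_samples, which is consistent with the note above about memory usage.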

PPC.coefficient_of_variation(dim='cells')

Calculate the coefficient of variation (CV) for each model and the raw counts.

The CV is computed over the cells or features dimension per sample. The mean CV is then computed over all samples.

Parameters:

dim (Literal['cells', 'features'] (default: 'cells')) – Dimension to compute CV over.

Return type:

None
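Examples

The per-sample CV followed by an average over samples can be sketched as follows. The function name, the dense (n_cells, n_genes, n_samples) layout, the use of the population standard deviation, and the choice to report a CV of 0 where the mean is 0 are assumptions of this sketch, not documented behaviour of the library.

```python
import numpy as np


def mean_cv_sketch(samples, dim="cells"):
    """Mean coefficient of variation across posterior predictive samples.

    samples: dense array, shape (n_cells, n_genes, n_samples) (assumed layout)
    dim: "cells" gives one CV per gene, "features" gives one CV per cell.
    """
    if dim not in ("cells", "features"):
        raise ValueError("dim must be 'cells' or 'features'")
    axis = 0 if dim == "cells" else 1
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=axis)
    std = samples.std(axis=axis)  # population standard deviation (ddof=0)
    # CV = std / mean; entries with zero mean are set to 0 (sketch convention).
    cv = np.divide(std, mean, out=np.zeros_like(std), where=mean != 0)
    # Average the per-sample CVs over the sample axis.
    return cv.mean(axis=-1)


rng = np.random.default_rng(0)
toy_samples = rng.poisson(3.0, size=(100, 8, 10))
print(mean_cv_sketch(toy_samples, dim="cells").shape)     # one value per gene
print(mean_cv_sketch(toy_samples, dim="features").shape)  # one value per cell
```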

PPC.differential_expression(de_groupby, de_method='t-test', n_samples=1, cell_scale_factor=10000.0, p_val_thresh=0.001, n_top_genes_fallback=100)

Compute differential expression (DE) metrics.

If n_samples > 1, all metrics are averaged over the posterior predictive samples.

Parameters:
  • de_groupby (str) – The column name in adata_obs_raw that contains the groupby information.

  • de_method (str (default: 't-test')) – The DE method to use. See rank_genes_groups() for more details.

  • n_samples (int (default: 1)) – The number of posterior predictive samples to use for the DE analysis.

  • cell_scale_factor (float (default: 10000.0)) – The cell scale factor to use for normalization before DE.

  • p_val_thresh (float (default: 0.001)) – The p-value threshold to use for the DE analysis.

  • n_top_genes_fallback (int (default: 100)) – The number of top genes to use for the DE analysis if no gene has a p-value below p_val_thresh.
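Examples

The roles of cell_scale_factor, p_val_thresh, and n_top_genes_fallback can be illustrated with a small NumPy sketch. Both helper functions are hypothetical and are not part of scvi_criticism. The sketch assumes that normalization rescales each cell's counts to sum to cell_scale_factor, and that the fallback ranks genes by ascending p-value. The library may normalize or rank differently.

```python
import numpy as np


def scale_cells_sketch(counts, cell_scale_factor=10000.0):
    """Rescale each cell (row) so its counts sum to cell_scale_factor."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    # Leave all-zero cells untouched rather than dividing by zero.
    totals = np.where(totals == 0, 1.0, totals)
    return counts / totals * cell_scale_factor


def select_de_genes_sketch(pvals, p_val_thresh=0.001, n_top_genes_fallback=100):
    """Indices of genes below the threshold, or the top-ranked genes if none pass."""
    pvals = np.asarray(pvals, dtype=float)
    significant = np.flatnonzero(pvals < p_val_thresh)
    if significant.size > 0:
        return significant
    # Fallback (assumed ranking): genes with the smallest p-values.
    return np.argsort(pvals)[:n_top_genes_fallback]


toy_counts = np.array([[1, 3], [0, 0], [5, 5]])
print(scale_cells_sketch(toy_counts, cell_scale_factor=100.0))
print(select_de_genes_sketch([0.5, 0.0001, 0.2]))
print(select_de_genes_sketch([0.5, 0.01, 0.2], n_top_genes_fallback=2))
```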

PPC.run(metrics_to_run)

Run the metrics.

PPC.zero_fraction()

Fraction of zeros in raw counts for a specific gene.

Return type:

None
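Examples

A per-gene zero fraction can be sketched in NumPy as below. The function name and the dense (n_cells, n_genes) layout are assumptions of this sketch. The same calculation could be applied to each posterior predictive sample to compare models against the raw counts.

```python
import numpy as np


def zero_fraction_sketch(counts):
    """Fraction of cells with a zero count, computed separately for each gene.

    counts: dense array, shape (n_cells, n_genes) (assumed layout)
    """
    counts = np.asarray(counts)
    # Boolean mask of zeros, averaged over the cell axis.
    return np.mean(counts == 0, axis=0)


toy_counts = np.array([
    [0, 1],
    [0, 0],
    [2, 3],
    [0, 5],
])
print(zero_fraction_sketch(toy_counts))
```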