monkeybread.stat.shortest_distances

monkeybread.stat.shortest_distances(adata, groupby, group1, group2, n_perms=100, actual=None, threshold=None, basis='spatial')

Calculates an expected distribution of shortest distances via permutation of labels.

Distances are calculated as in monkeybread.calc.shortest_distances(). Cells in group1 are excluded from the permutation, so their labels stay fixed while the remaining labels are shuffled.

If actual and threshold are both provided, a p-value is also produced. It compares the proportion of distances below threshold in the actual data with the corresponding proportion in the expected (permuted) data.
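The permutation scheme described above can be sketched with plain NumPy. This is an illustrative sketch only, not monkeybread's implementation: the helper names (nearest_distances, permuted_distances, empirical_p_value), the toy coordinates, and in particular the p-value formula (a standard empirical permutation p-value with a +1 correction) are assumptions made for the example.

```python
import numpy as np


def nearest_distances(coords, labels, group1, group2):
    """Distance from each group1 cell to its nearest group2 cell."""
    src = coords[np.isin(labels, group1)]
    dst = coords[np.isin(labels, group2)]
    # Pairwise Euclidean distances, shape (n_src, n_dst).
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def permuted_distances(coords, labels, group1, group2, n_perms=100, seed=0):
    """Shuffle every label except group1's, recomputing distances each time."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    fixed = np.isin(labels, group1)  # group1 cells keep their labels
    rows = []
    for _ in range(n_perms):
        shuffled = labels.copy()
        shuffled[~fixed] = rng.permutation(labels[~fixed])
        rows.append(nearest_distances(coords, shuffled, group1, group2))
    return np.vstack(rows)  # shape (n_perms, n_group1_cells)


def empirical_p_value(actual, expected, threshold):
    """Assumed formula: share of permutations at least as extreme as observed."""
    observed = np.mean(actual < threshold)
    permuted = np.mean(expected < threshold, axis=1)
    return (1 + np.sum(permuted >= observed)) / (1 + len(permuted))


# Toy data: each "A" cell sits 0.1 units from a "B" cell; "C" cells are far away.
coords = np.array(
    [[i, 0.0] for i in range(5)]
    + [[i, 0.1] for i in range(5)]
    + [[100.0 + j, 100.0] for j in range(50)]
)
labels = np.array(["A"] * 5 + ["B"] * 5 + ["C"] * 50)

actual = nearest_distances(coords, labels, "A", "B")
expected = permuted_distances(coords, labels, "A", "B", n_perms=100)
p_value = empirical_p_value(actual, expected, threshold=0.5)
```

In this toy layout the observed A-to-B distances are far shorter than those obtained after shuffling, so the resulting p-value is small.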

Parameters
  • adata (AnnData) – Annotated data matrix.

  • groupby (str) – A categorical column in adata.obs to use for grouping.

  • group1 (Union[str, List[str]]) – One or more levels from adata.obs[groupby] to use as sources for shortest distance.

  • group2 (Union[str, List[str]]) – One or more levels from adata.obs[groupby] to use as destinations for shortest distance.

  • n_perms (Optional[int] (default: 100)) – The number of permutations to run.

  • actual (Optional[DataFrame] (default: None)) – The actual distribution of shortest distances, as calculated by monkeybread.calc.shortest_distances().

  • threshold (Optional[float] (default: None)) – A distance threshold to use for significance calculation, in coordinate units.

  • basis (Optional[str] (default: 'spatial')) – Coordinates in adata.obsm['X_{basis}'] to use.

Return type

Union[ndarray, Tuple[ndarray, float, float]]

Returns

If threshold is not provided, an array containing the expected distribution described above is returned. If threshold is provided, a length-3 tuple is returned: the first element is the array containing the expected distribution, the second is the threshold, and the third is the calculated p-value.
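Because the return value is either a bare array or a length-3 tuple, calling code may want to normalize it. The helper below is a hypothetical convenience (unpack_shortest_distances is not part of monkeybread), and the import alias mb in the commented call is likewise an assumption; the call itself only mirrors the signature documented above.

```python
import numpy as np


def unpack_shortest_distances(result):
    """Return (expected, threshold, p_value); the last two are None if absent."""
    if isinstance(result, tuple):
        expected, threshold, p_value = result
        return np.asarray(expected), threshold, p_value
    return np.asarray(result), None, None


# Hypothetical usage, assuming `import monkeybread as mb` and an AnnData `adata`:
# result = mb.stat.shortest_distances(adata, "cell_type", "A", "B",
#                                     actual=actual, threshold=10.0)
# expected, threshold, p_value = unpack_shortest_distances(result)
```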