monkeybread.stat.cell_contact

monkeybread.stat.cell_contact(adata, groupby, group1, group2, actual_contact, contact_radius=None, perm_radius=100, n_perms=1000, split_groups=False, basis='spatial')

Calculates expected cell contact and p-value using a permutation test.

The test is described in [F+22] and consists of randomizing cell positions within a radius. Instead of a z-test, the p-value is derived from the number of permutations whose contact count is higher than the count observed in the data.
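As a rough illustration of how such a p-value can be derived, the sketch below takes the fraction of permutations whose contact count is strictly higher than the observed count. The helper name `permutation_p_value` and the example counts are hypothetical, and the library's exact handling of ties or small-sample corrections is not stated here, so treat the strict comparison as an assumption.

```python
import numpy as np


def permutation_p_value(perm_counts, observed_count):
    """Fraction of permutations with a strictly higher count than observed.

    Hypothetical helper for illustration; not part of monkeybread.
    """
    perm_counts = np.asarray(perm_counts)
    # Count permutations whose contact count exceeds the observed count.
    n_higher = int(np.sum(perm_counts > observed_count))
    return n_higher / perm_counts.size


# Made-up contact counts from 10 permutations and a made-up observed count.
perm_counts = np.array([3, 5, 8, 2, 9, 4, 7, 6, 1, 10])
p_value = permutation_p_value(perm_counts, observed_count=7)
print(p_value)
```

A small p-value here means that few randomized layouts produced more contacts than the real tissue, which suggests the observed contact is higher than expected by chance.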

Parameters

adata (AnnData) – Annotated data matrix.
groupby (str) – A categorical column in adata.obs to classify groups.
group1 (Union[str, List[str]]) – Either one group or a list of groups from adata.obs[groupby].
group2 (Union[str, List[str]]) – Either one group or a list of groups from adata.obs[groupby].
actual_contact (Dict[str, Set[str]]) – The actual cell contacts, as calculated by monkeybread.calc.cell_contact().
contact_radius (Optional[float] (default: None)) – The radius within which cells are considered touching. If not provided, it is calculated as half of the average radius of group1 plus half of the average radius of group2, which requires width and height columns to be present in adata.obs. Should be the same value as used in monkeybread.calc.cell_contact().
perm_radius (Optional[float] (default: 100)) – The radius within which to randomize location, in coordinate units.
n_perms (Optional[int] (default: 1000)) – The number of permutations to run.
split_groups (Optional[bool] (default: False)) – Perform calculations using each possible pair from group1 and group2 instead of considering the groups as a whole.
basis (Optional[str] (default: 'spatial')) – Coordinates in adata.obsm[X_{basis}] to use. Defaults to spatial.
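To make the role of perm_radius more concrete, the following sketch moves each coordinate to a random location within a disk of the given radius around its original position. This is only one plausible reading of "randomize location within a radius": the function name `jitter_within_radius` is hypothetical, and uniform sampling over the disk is an assumption rather than the library's documented behavior.

```python
import numpy as np


def jitter_within_radius(coords, radius, rng):
    """Shift each 2D point by a random offset of length at most `radius`.

    Hypothetical helper for illustration; uniform sampling over the disk
    is an assumption, not necessarily what monkeybread does.
    """
    coords = np.asarray(coords, dtype=float)
    n_points = coords.shape[0]
    angles = rng.uniform(0.0, 2.0 * np.pi, size=n_points)
    # The square root makes the offsets uniform over the disk's area
    # instead of clustering near the center.
    dists = radius * np.sqrt(rng.uniform(0.0, 1.0, size=n_points))
    offsets = np.column_stack((dists * np.cos(angles), dists * np.sin(angles)))
    return coords + offsets


rng = np.random.default_rng(0)
# Made-up coordinates standing in for the array stored in adata.obsm.
coords = np.array([[0.0, 0.0], [50.0, 20.0], [200.0, 300.0]])
shuffled = jitter_within_radius(coords, radius=100.0, rng=rng)
displacement = np.linalg.norm(shuffled - coords, axis=1)
```

Repeating such a randomization n_perms times and recounting contacts each time would yield one contact count per permutation.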

Return type

Union[Tuple[ndarray, float], DataFrame]

Returns

If split_groups = False, a length-two tuple is returned. The first element is an array containing the number of contacts observed for each permutation. The second element is a p-value comparing the expected contact to the actual contact. If split_groups = True, a DataFrame is returned in which each entry contains the p-value for that combination of group1 (columns) and group2 (rows).
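The two return shapes can be handled as in the sketch below. It uses stand-in values rather than a real call, so the group names, counts, and p-values are made up, and taking the mean of the permutation counts as the "expected contact" is an assumption for illustration.

```python
import numpy as np
import pandas as pd

# split_groups=False: a (permutation counts, p-value) tuple.
# Stand-in values; a real call would produce these.
result = (np.array([3, 5, 8, 2]), 0.25)
perm_counts, p_value = result
# Assumption: summarize the expected contact as the mean over permutations.
expected_contact = perm_counts.mean()

# split_groups=True: a DataFrame with group1 as columns and group2 as rows.
# Hypothetical group names and p-values.
p_vals = pd.DataFrame(
    [[0.01, 0.40], [0.20, 0.03]],
    index=["B cell", "Macrophage"],    # group2 (rows)
    columns=["T cell", "NK cell"],     # group1 (columns)
)

# Collect (group1, group2) pairs whose p-value falls below a chosen threshold.
alpha = 0.05
significant = [
    (g1, g2)
    for g1 in p_vals.columns
    for g2 in p_vals.index
    if p_vals.loc[g2, g1] < alpha
]
print(expected_contact, significant)
```

Because the DataFrame is indexed by group2 and has group1 as columns, a single entry is looked up with `p_vals.loc[group2_name, group1_name]`.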