Majority Voting#
Majority voting#
This module allows majority-voting per cell for multiple cell-assignments of labels. The main computation is the mode per cell across multiple columns in the AnnData object.
Environments#
The following environments are needed for the different metrics. Depending on which metrics you want to run, you do not need to install all environments.
Configuration#
DATASETS:
test:
input:
majority_voting:
file_1: test/input/pbmc68k.h5ad
file_2: test/input/pbmc68k.h5ad
majority_voting:
columns:
- bulk_labels
- na_*
threshold: 0.6
columns: columns or patterns (parsable byre) of columns in.obsto consider for majority voting.threshold: threshold to determine whether a label has low consensus agreement. If the majority agreement (#major_label/#columns) is lower thanthreshold, itโs consider to have low consensus agreement.
Note: The main assumption of the
columnsis that the labels across columns are from the same (or mostly same) set of unique labels/
Output#
Additionally to the predicted labels, the module evaluates how strongly cells agree on their consensus label, and produces summary files and plots describing agreement levels.
<out_dir>/majority_voting/dataset~<datasets>/file_id~<file_id>.zarr: Updated AnnData object with consensus label (majority_consensus) based on the most common value across selected columns, alongside a measure of how many columns agree (majority_consensus_agreement), and a Boolean flag for low agreement (majority_consensus_low_agreement). This table is written to a TSV file named majority_consensus_agreement.tsv in the output plots folder.<image_dir>/majority_voting/dataset~<datasets>/file_id~<file_id>/majority_consensus_agreement.tsv: Summary statistic of majority voting containing fractions of cells for every consensus label and agreement level.<image_dir>/majority_voting/dataset~<datasets>/file_id~<file_id>/majority_consensus_frac.png: Agreement bar plot showing the fraction of cells that have low agreement for each consensus label.