alifestd_mark_sample_tips_clade_polars

alifestd_mark_sample_tips_clade_polars(phylogeny_df: DataFrame, n_sample: int, seed: int | None = None, *, mark_as: str = 'alifestd_mark_sample_tips_clade_polars') -> DataFrame

Mark tips belonging to a randomly sampled clade of at most n_sample tips.

Adds a boolean column mark_as indicating retained tips. Candidate clades are sampled proportionally to their size.

If n_sample is greater than the number of tips in the phylogeny, all tips are marked.

Only supports asexual phylogenies.
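The sampling behavior described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation. It assumes that ids are contiguous row indices, that ancestors precede descendants, that a root is its own ancestor, and that the candidate clades are all clades with at most n_sample tips (the documentation does not spell out the candidate set). The helper name mark_sample_tips_clade is hypothetical.

```python
import random


def mark_sample_tips_clade(ancestor_ids, n_sample, seed=None):
    """Return one bool per node: True for tips of the sampled clade.

    Hypothetical helper; assumes ids are 0..n-1, ancestors come before
    descendants, and roots have ancestor_id equal to their own id.
    """
    n = len(ancestor_ids)

    # A node with no children is a tip.
    num_children = [0] * n
    for node, anc in enumerate(ancestor_ids):
        if anc != node:
            num_children[anc] += 1
    is_tip = [count == 0 for count in num_children]

    # Documented rule: if n_sample exceeds the tip count, mark all tips.
    if n_sample > sum(is_tip):
        return is_tip

    # Tips per clade: a reverse pass pushes counts up to ancestors.
    tip_counts = [int(tip) for tip in is_tip]
    for node in reversed(range(n)):
        anc = ancestor_ids[node]
        if anc != node:
            tip_counts[anc] += tip_counts[node]

    # ASSUMPTION: candidates are all clades with at most n_sample tips.
    candidates = [i for i in range(n) if tip_counts[i] <= n_sample]
    if not candidates:
        return [False] * n

    # Sample one candidate with probability proportional to its size.
    rng = random.Random(seed)
    weights = [tip_counts[i] for i in candidates]
    chosen = rng.choices(candidates, weights=weights, k=1)[0]

    # Forward pass: a node is in the clade if its ancestor is.
    in_clade = [False] * n
    in_clade[chosen] = True
    for node in range(n):
        anc = ancestor_ids[node]
        if anc != node and in_clade[anc]:
            in_clade[node] = True

    return [in_clade[i] and is_tip[i] for i in range(n)]


# Root 0 with inner nodes 1 and 2; tips are 3, 4 (under 1) and 5, 6 (under 2).
example_ancestor_ids = [0, 0, 0, 1, 1, 2, 2]
example_marks = mark_sample_tips_clade(example_ancestor_ids, 2, seed=1)
```

With n_sample=2 on this tree, the marked tips are a single tip or one sibling pair, never tips from both inner clades. Passing the same seed reproduces the same marks.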

Parameters

phylogeny_df : polars.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_sample : int

Maximum number of tips to mark.

seed : int, optional

Integer seed for deterministic behavior.

mark_as : str, default “alifestd_mark_sample_tips_clade_polars”

Column name for the boolean mark.

Raises

NotImplementedError

If phylogeny_df has no “ancestor_id” column or if ids are non-contiguous or not topologically sorted.
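The layout requirements behind this error can be checked before calling the function. The sketch below is an illustration under stated assumptions, not the library's own validation: it reads "contiguous" as ids equal to row positions 0..n-1, reads "topologically sorted" as every ancestor appearing in an earlier row than its descendants, and treats a row whose ancestor id equals its own id as a root. Both helper names are hypothetical.

```python
def has_contiguous_ids(ids):
    """True if ids are exactly 0, 1, ..., n-1 in row order (assumed meaning)."""
    return list(ids) == list(range(len(ids)))


def is_topologically_sorted(ids, ancestor_ids):
    """True if every non-root row appears after its ancestor's row."""
    seen = set()
    for node, anc in zip(ids, ancestor_ids):
        # ASSUMPTION: a root is recorded as its own ancestor.
        if anc != node and anc not in seen:
            return False
        seen.add(node)
    return True


# A three-node chain 0 <- 1 <- 2 satisfies both requirements.
layout_ok = has_contiguous_ids([0, 1, 2]) and is_topologically_sorted(
    [0, 1, 2], [0, 0, 1]
)
```

The id and ancestor id sequences can be taken from the corresponding dataframe columns as Python lists before running these checks.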

Returns

polars.DataFrame

The phylogeny with an added boolean mark column.

See Also

alifestd_mark_sample_tips_clade_asexual :

Pandas-based implementation.