alifestd_downsample_tips_lineage_polars
- alifestd_downsample_tips_lineage_polars(phylogeny_df: polars.DataFrame, n_downsample: int, seed: int | None = None, *, criterion_delta: str | polars.Expr = 'origin_time', criterion_target: str | polars.Expr = 'origin_time', progress_wrap: typing.Callable = <function <lambda>>) -> polars.DataFrame
Retain the n_downsample leaves closest to the lineage of a target leaf.
Selects a target leaf as the leaf with the largest criterion_target value (ties broken randomly). For each leaf, the most recent common ancestor (MRCA) with the target leaf is identified and the “off-lineage delta” is computed as the absolute difference between the leaf’s criterion_delta value and its MRCA’s criterion_delta value. The n_downsample leaves with the smallest off-lineage deltas are retained.
If n_downsample is greater than or equal to the number of leaves in the phylogeny, the whole phylogeny is returned. Ties in off-lineage delta are broken arbitrarily.
Only supports asexual phylogenies.
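The selection rule described above can be sketched in plain Python. This is an illustrative reimplementation, not the library's polars code: the helper name downsample_tips_lineage and its list-based inputs are hypothetical, and it assumes contiguous ids 0..n-1, a single root that lists itself as its own ancestor, and ancestors stored before their descendants.

```python
import random


def downsample_tips_lineage(ancestor_ids, values, n_downsample, seed=None):
    """Return sorted ids of the leaves to retain (illustrative sketch only).

    Assumes ids are 0..n-1, there is a single root that is its own
    ancestor, and every ancestor precedes its descendants.
    """
    n = len(ancestor_ids)
    # A leaf is any node that never appears as another node's ancestor.
    has_child = [False] * n
    for node, anc in enumerate(ancestor_ids):
        if anc != node:
            has_child[anc] = True
    leaves = [node for node in range(n) if not has_child[node]]
    if n_downsample >= len(leaves):
        return leaves  # nothing to drop

    # Target leaf: largest value, ties broken by a seeded random choice.
    best = max(values[leaf] for leaf in leaves)
    tied = [leaf for leaf in leaves if values[leaf] == best]
    target = random.Random(seed).choice(tied)

    # Mark the target's lineage by walking up to the root.
    on_lineage = [False] * n
    node = target
    while True:
        on_lineage[node] = True
        if ancestor_ids[node] == node:
            break
        node = ancestor_ids[node]

    # MRCA with the target: the node itself if it is on the lineage,
    # otherwise inherited from its ancestor (ancestors are visited first).
    mrca = [0] * n
    for node in range(n):
        mrca[node] = node if on_lineage[node] else mrca[ancestor_ids[node]]

    # Off-lineage delta: |leaf value - MRCA value|; keep the smallest.
    deltas = {leaf: abs(values[leaf] - values[mrca[leaf]]) for leaf in leaves}
    kept = sorted(leaves, key=lambda leaf: deltas[leaf])[:n_downsample]
    return sorted(kept)


# Root 0 has children 1 and 2; leaves 3, 4 hang off 1; leaves 5, 6 off 2.
ancestor_ids = [0, 0, 0, 1, 1, 2, 2]
origin_times = [0, 1, 1, 2, 3, 2, 5]
print(downsample_tips_lineage(ancestor_ids, origin_times, 2))  # [5, 6]
```

Here leaf 6 is the target (largest origin time), so its lineage is 6, 2, 0. Leaf 5 branches off at node 2 with delta 0, while leaves 3 and 4 branch off at the root with deltas 2 and 3, so they are dropped first.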
Parameters
- phylogeny_df : polars.DataFrame or polars.LazyFrame
The phylogeny as a dataframe in alife standard format.
Must represent an asexual phylogeny.
- n_downsample : int
Number of tips to retain.
- seed : int, optional
Random seed for reproducible target-leaf selection when there are ties in criterion_target.
- criterion_delta : str or polars.Expr, default “origin_time”
Column name or polars expression used to compute the off-lineage delta for each leaf. The delta is the absolute difference between a leaf’s value and its MRCA’s value.
- criterion_target : str or polars.Expr, default “origin_time”
Column name or polars expression used to select the target leaf. The leaf with the largest value is chosen as the target. Ties are broken by random sampling; pass seed to make the choice reproducible.
- progress_wrap : Callable, optional
Pass tqdm or equivalent to display a progress bar.
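As a rough illustration of what "tqdm or equivalent" means for progress_wrap: any callable that accepts an iterable and returns an iterable over the same items should fit. The sketch below assumes the default is an identity function (consistent with the lambda default in the signature). Both counting_wrap and double_all are hypothetical names, and where the library applies the wrapper internally is not specified here.

```python
def counting_wrap(iterable):
    """Hypothetical tqdm stand-in: yields items unchanged, then reports a count."""
    count = 0
    for item in iterable:
        count += 1
        yield item
    print(f"processed {count} items")


def double_all(items, progress_wrap=lambda x: x):
    """Hypothetical consumer that loops through a progress_wrap-style callable."""
    # The wrapper only observes iteration; it must not alter the items.
    return [item * 2 for item in progress_wrap(items)]


print(double_all([1, 2, 3]))                               # identity default
print(double_all([1, 2, 3], progress_wrap=counting_wrap))  # custom wrapper
```

Both calls return the same list; the wrapper changes only what is reported during iteration.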
Raises
- NotImplementedError
If phylogeny_df has no “ancestor_id” column or if ids are non-contiguous or not topologically sorted.
- ValueError
If criterion_delta or criterion_target is not a column in phylogeny_df.
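The two id preconditions behind NotImplementedError can be checked ahead of time. The following is a minimal sketch over plain lists rather than a dataframe. The helper names are hypothetical, and it assumes that a root records its own id as its ancestor_id.

```python
def has_contiguous_ids(ids):
    """True if each id equals its row position (0, 1, 2, ...)."""
    return all(id_ == row for row, id_ in enumerate(ids))


def is_topologically_sorted(ids, ancestor_ids):
    """True if every non-root row appears after the row of its ancestor."""
    seen = set()
    for id_, anc in zip(ids, ancestor_ids):
        # Assumption: a root lists itself as its own ancestor.
        if anc != id_ and anc not in seen:
            return False
        seen.add(id_)
    return True


print(has_contiguous_ids([0, 1, 2]))                  # True
print(has_contiguous_ids([0, 2, 3]))                  # False: id 1 is missing
print(is_topologically_sorted([0, 1, 2], [0, 0, 1]))  # True
print(is_topologically_sorted([0, 1, 2], [0, 2, 0]))  # False: 1 precedes its ancestor 2
```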
Returns
- polars.DataFrame
The pruned phylogeny in alife standard format.
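One way to picture the pruned result is that each retained leaf is kept along with every ancestor on its path to the root. The sketch below illustrates that idea on a list of ancestor ids. The helper name prune_to_leaves is hypothetical, roots are assumed to be their own ancestors, and whether the library also reassigns ids or collapses single-child nodes is not covered here.

```python
def prune_to_leaves(ancestor_ids, kept_leaves):
    """Return sorted ids of the kept leaves plus all of their ancestors."""
    retained = set()
    for leaf in kept_leaves:
        node = leaf
        # Climb toward the root, stopping early once the path is already kept.
        while node not in retained:
            retained.add(node)
            node = ancestor_ids[node]  # a root maps to itself, ending the loop
    return sorted(retained)


# Root 0 has children 1 and 2; leaves 3, 4 hang off 1; leaves 5, 6 off 2.
print(prune_to_leaves([0, 0, 0, 1, 1, 2, 2], [5, 6]))  # [0, 2, 5, 6]
```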
See Also
- alifestd_downsample_tips_lineage_asexual :
Pandas-based implementation.