legacy

Functions

`alifestd_add_global_root`(phylogeny_df[, ...])	Add a new global root node that all existing roots point to.
`alifestd_add_global_root_polars`(phylogeny_df)	Add a new global root node that all existing roots point to.
`alifestd_add_inner_knuckles_asexual`(phylogeny_df)	For all inner nodes, add a subtending unifurcation ("knuckle").
`alifestd_add_inner_knuckles_polars`(phylogeny_df)	For all inner nodes, add a subtending unifurcation ("knuckle").
`alifestd_add_inner_leaves`(phylogeny_df[, mutate])	Create a zero-length branch with leaf node for each inner node.
`alifestd_add_inner_niblings_asexual`(phylogeny_df)	For all inner nodes, add a subtending unifurcation, adding a "nibling" leaf as the child of the knuckle.
`alifestd_add_inner_niblings_polars`(phylogeny_df)	For all inner nodes, add a subtending unifurcation, adding a "nibling" leaf as the child of the knuckle.
`alifestd_aggregate_phylogenies`(phylogeny_dfs)	Concatenate independent phylogenies, reassigning organism ids to prevent collisions.
`alifestd_aggregate_phylogenies_polars`(...)	Concatenate independent phylogenies, reassigning organism ids to prevent collisions.
`alifestd_as_newick_asexual`(phylogeny_df[, ...])	Convert phylogeny dataframe to Newick format.
`alifestd_as_newick_polars`(phylogeny_df, *[, ...])	Convert phylogeny dataframe to Newick format.
`alifestd_assign_contiguous_ids`(phylogeny_df)	Reassign so each organism's id corresponds to its row number.
`alifestd_assign_contiguous_ids_polars`(...)	Reassign so each organism's id corresponds to its row number.
`alifestd_assign_root_ancestor_token`(...[, ...])	Set root_ancestor_token for "ancestor_list" column.
`alifestd_calc_clade_lookback_n_asexual`(...)	Find ancestor ids of nodes that are lookback_n nodes away in the phylogeny.
`alifestd_calc_clade_lookback_origin_time_delta_asexual`(...)	Find ancestor ids of nodes that precede each phylogeny node by at least lookback_origin_time_delta branch distance.
`alifestd_calc_clade_trait_count_asexual`(...)	Count how many nodes within each clade have a given trait.
`alifestd_calc_clade_trait_frequency_asexual`(...)	Calculate what fraction of nodes within each clade have a given trait.
`alifestd_calc_distance_matrix_asexual`(...[, ...])	Calculate pairwise distances between all taxa via their MRCAs.
`alifestd_calc_distance_matrix_polars`(...[, ...])	Calculate pairwise distances between all taxa via their MRCAs.
`alifestd_calc_mrca_id_matrix_asexual`(...[, ...])	Calculate the Most Recent Common Ancestor (MRCA) taxon id for each pair of taxa.
`alifestd_calc_mrca_id_matrix_asexual_polars`(...)	Calculate the Most Recent Common Ancestor (MRCA) taxon id for each pair of taxa.
`alifestd_calc_mrca_id_vector_asexual`(...[, ...])	Calculate the Most Recent Common Ancestor (MRCA) taxon id for target_id and each other taxon.
`alifestd_calc_mrca_id_vector_asexual_polars`(...)	Calculate the Most Recent Common Ancestor (MRCA) taxon id for target_id and each other taxon.
`alifestd_calc_polytomic_index`(phylogeny_df)	Count how many fewer inner nodes are contained in phylogeny than expected if strictly bifurcating.
`alifestd_calc_polytomic_index_polars`(...)	Count how many fewer inner nodes are contained in phylogeny than expected if strictly bifurcating.
`alifestd_categorize_triplet_asexual`(...[, ...])	Assess the topological configuration of three id's in phylogeny_df.
`alifestd_check_topological_sensitivity`(...)	Return names of columns present in phylogeny_df that may be invalidated by topological operations such as collapsing unifurcations.
`alifestd_check_topological_sensitivity_polars`(...)	Return names of columns present in phylogeny_df that may be invalidated by topological operations such as collapsing unifurcations.
`alifestd_chronological_sort`(phylogeny_df[, ...])	Sort rows so all organisms appear in chronological order, default origin_time.
`alifestd_chronological_sort_polars`(phylogeny_df)	Sort rows so all organisms appear in chronological order, default origin_time.
`alifestd_coarsen_dilate_asexual`(phylogeny_df, *)	Coarsen a phylogeny by collapsing inner nodes within dilation windows.
`alifestd_coarsen_dilate_polars`(phylogeny_df, *)	Coarsen a phylogeny by collapsing inner nodes within dilation windows.
`alifestd_coarsen_mask`(phylogeny_df, mask[, ...])	Pare record to bypass organisms outside mask.
`alifestd_coarsen_taxa_asexual`(phylogeny_df)	Condense consecutive phylogeny nodes sharing identical trait values, according to values in by column(s).
`alifestd_coarsen_taxa_asexual_make_agg`(...)	Build per-column aggregation rules for asexual taxa coarsening.
`alifestd_coerce_chronological_consistency`(...)	For any taxa with origin time preceding its parent's, set origin time to parent's origin time.
`alifestd_collapse_trunk_asexual`(phylogeny_df)	Collapse entries masked by is_trunk column, keeping only the oldest root.
`alifestd_collapse_trunk_polars`(phylogeny_df)	Collapse entries masked by is_trunk column, keeping only the oldest root.
`alifestd_collapse_unifurcations`(phylogeny_df)	Pare record to bypass organisms with one ancestor and one descendant.
`alifestd_collapse_unifurcations_polars`(...)	Pare record to bypass organisms with one ancestor and one descendant.
`alifestd_convert_root_ancestor_token`(...[, ...])	Set root_ancestor_token for ancestor_list series.
`alifestd_count_children_of_asexual`(...[, mutate])	How many taxa are direct descendants of the given parent?
`alifestd_count_children_of_polars`(...)	How many taxa are direct descendants of the given parent?
`alifestd_count_inner_nodes`(phylogeny_df[, ...])	Count how many non-leaf nodes are contained in phylogeny.
`alifestd_count_inner_nodes_polars`(phylogeny_df)	Count how many non-leaf nodes are contained in phylogeny.
`alifestd_count_leaf_nodes`(phylogeny_df)	How many leaf nodes are contained in phylogeny?
`alifestd_count_leaf_nodes_polars`(phylogeny_df)	How many leaf nodes are contained in phylogeny?
`alifestd_count_polytomies`(phylogeny_df)	Count how many inner nodes have more than two descendant nodes.
`alifestd_count_polytomies_polars`(phylogeny_df)	Count how many inner nodes have more than two descendant nodes.
`alifestd_count_root_nodes`(phylogeny_df)	How many root nodes are contained in phylogeny?
`alifestd_count_root_nodes_polars`(phylogeny_df)	How many root nodes are contained in phylogeny?
`alifestd_count_unifurcating_roots_asexual`(...)	How many root nodes with one child are contained in phylogeny?
`alifestd_count_unifurcating_roots_polars`(...)	How many root nodes with one child are contained in phylogeny?
`alifestd_count_unifurcations`(phylogeny_df)	Count how many inner nodes have exactly one descendant node.
`alifestd_count_unifurcations_polars`(phylogeny_df)	Count how many inner nodes have exactly one descendant node.
`alifestd_delete_trunk_asexual`(phylogeny_df)	Delete entries masked by is_trunk column.
`alifestd_delete_trunk_asexual_polars`(...)	Delete entries masked by is_trunk column.
`alifestd_delete_unifurcating_roots_asexual`(...)	Pare record to bypass root nodes with only one descendant.
`alifestd_delete_unifurcating_roots_polars`(...)	Pare record to bypass root nodes with only one descendant.
`alifestd_downsample_tips_asexual`(...[, ...])	Create a subsample phylogeny containing n_downsample tips.
`alifestd_downsample_tips_canopy_asexual`(...)	Retain the n_downsample leaves with the largest criterion values and prune extinct lineages.
`alifestd_downsample_tips_canopy_polars`(...)	Retain the n_downsample leaves with the largest criterion values and prune extinct lineages.
`alifestd_downsample_tips_clade_asexual`(...)	Create a subsample phylogeny containing at most n_downsample tips, comprising a single clade within the original phylogeny.
`alifestd_downsample_tips_clade_polars`(...[, ...])	Create a subsample phylogeny containing at most n_downsample tips, comprising a single clade within the original phylogeny.
`alifestd_downsample_tips_lineage_asexual`(...)	Retain the n_downsample leaves closest to the lineage of a target leaf.
`alifestd_downsample_tips_lineage_polars`(...)	Retain the n_downsample leaves closest to the lineage of a target leaf.
`alifestd_downsample_tips_lineage_stratified_asexual`(...)	Retain leaves per stratified group, chosen by proximity to the lineage of a target leaf.
`alifestd_downsample_tips_lineage_stratified_polars`(...)	Retain leaves per stratified group, chosen by proximity to the lineage of a target leaf.
`alifestd_downsample_tips_polars`(...[, seed])	Create a subsample phylogeny containing n_downsample tips.
`alifestd_downsample_tips_uniform_asexual`(...)	Create a subsample phylogeny containing n_downsample tips.
`alifestd_downsample_tips_uniform_polars`(...)	Create a subsample phylogeny containing n_downsample tips.
`alifestd_drop_topological_sensitivity`(...[, ...])	Drop columns from phylogeny_df that may be invalidated by topological operations such as collapsing unifurcations.
`alifestd_drop_topological_sensitivity_polars`(...)	Drop columns from phylogeny_df that may be invalidated by topological operations such as collapsing unifurcations.
`alifestd_estimate_triplet_distance_asexual`(...)	Estimate the triplet distance between two asexual phylogenetic trees by sampling sets of three leaf taxa and counting the fraction whose phylogenetic connectivity mismatches between trees.
`alifestd_find_chronological_inconsistency`(...)	Return the id of a taxon with origin time preceding its parent's, if any are present.
`alifestd_find_chronological_inconsistency_polars`(...)	Return the id of a taxon with origin time preceding its parent's, if any are present.
`alifestd_find_leaf_ids`(phylogeny_df)	What ids are not listed in any ancestor_list?
`alifestd_find_leaf_ids_polars`(phylogeny_df)	What ids are ancestor to no other ids?
`alifestd_find_mrca_id_asexual`(phylogeny_df, ...)	Find most recent common ancestor of leaf_ids.
`alifestd_find_pair_distance_asexual`(...[, ...])	Find the pairwise distance between two taxa via their MRCA.
`alifestd_find_pair_distance_polars`(...[, ...])	Find the pairwise distance between two taxa via their MRCA.
`alifestd_find_pair_mrca_id_asexual`(...[, ...])	Find the Most Recent Common Ancestor of two taxa.
`alifestd_find_pair_mrca_id_polars`(...[, ...])	Find the Most Recent Common Ancestor of two taxa.
`alifestd_find_root_ids`(phylogeny_df)	What ids have an empty ancestor_list?
`alifestd_find_root_ids_polars`(phylogeny_df)	What ids have an empty ancestor_list?
`alifestd_from_avida_spop`(spop_text, *[, ...])	Convert Avida `.spop` population snapshot text to a phylogeny dataframe.
`alifestd_from_avida_spop_polars`(spop_text, *)	Convert Avida `.spop` population snapshot text to a phylogeny dataframe.
`alifestd_from_newick`(newick, *[, ...])	Convert a Newick format string to a phylogeny dataframe.
`alifestd_from_newick_polars`(newick, *[, ...])	Convert a Newick format string to a phylogeny dataframe.
`alifestd_has_compact_ids`(phylogeny_df)	Are id values between 0 and len(phylogeny_df), in any order?
`alifestd_has_compact_ids_polars`(phylogeny_df)	Are id values between 0 and len(phylogeny_df), in any order?
`alifestd_has_contiguous_ids`(phylogeny_df)	Do organisms' ids correspond to their row numbers?
`alifestd_has_contiguous_ids_polars`(phylogeny_df)	Do organisms' ids correspond to their row numbers?
`alifestd_has_increasing_ids`(phylogeny_df)	Do offspring have larger id values than ancestors?
`alifestd_has_increasing_ids_polars`(phylogeny_df)	Do offspring have larger id values than ancestors?
`alifestd_has_multiple_roots`(phylogeny_df)	Does the phylogeny have two or more root organisms?
`alifestd_has_multiple_roots_polars`(phylogeny_df)	Does the phylogeny have two or more root organisms?
`alifestd_is_asexual`(phylogeny_df)	Do all organisms in the phylogeny have one or no immediate ancestor?
`alifestd_is_asexual_polars`(phylogeny_df)	Do all organisms in the phylogeny have one or no immediate ancestor?
`alifestd_is_chronologically_ordered`(phylogeny_df)	Do all organisms have origin_time values at or after those of members of their ancestor_list?
`alifestd_is_chronologically_ordered_polars`(...)	Check if all taxa have origin times at or after their ancestor's origin time.
`alifestd_is_chronologically_sorted`(phylogeny_df)	Do rows appear in chronological order?
`alifestd_is_chronologically_sorted_polars`(...)	Do rows appear in chronological order?
`alifestd_is_sexual`(phylogeny_df)	Do any organisms in the phylogeny have more than one immediate ancestor?
`alifestd_is_sexual_polars`(phylogeny_df)	Do any organisms in the phylogeny have more than one immediate ancestor?
`alifestd_is_strictly_bifurcating_asexual`(...)	Are all internal nodes strictly bifurcating (exactly 2 children)?
`alifestd_is_strictly_bifurcating_polars`(...)	Are all internal nodes strictly bifurcating (exactly 2 children)?
`alifestd_is_topologically_sorted`(phylogeny_df)	Are all organisms listed after members of their ancestor_list?
`alifestd_is_topologically_sorted_polars`(...)	Are all organisms listed after members of their ancestor_list?
`alifestd_is_ultrametric`(phylogeny_df[, ...])	Do all tips share the same origin_time (within `atol`)?
`alifestd_is_ultrametric_polars`(phylogeny_df, *)	Do all tips share the same origin_time (within `atol`)?
`alifestd_is_working_format_asexual`(phylogeny_df)	Test if phylogeny_df is an asexual phylogeny in working format.
`alifestd_is_working_format_polars`(phylogeny_df)	Test if phylogeny_df is an asexual phylogeny in working format.
`alifestd_join_roots`(phylogeny_df[, mutate])	Point all other roots to oldest root, measured by lowest origin_time (if available) or otherwise lowest id.
`alifestd_join_roots_polars`(phylogeny_df)	Point all other roots to oldest root, measured by lowest origin_time (if available) or otherwise lowest id.
`alifestd_ladderize_asexual`(phylogeny_df[, ...])	Reorder rows so children are sorted by number of descendant leaves, gathering children into contiguous rows.
`alifestd_ladderize_polars`(phylogeny_df[, ...])	Reorder rows so children are sorted by number of descendant leaves, gathering children into contiguous rows.
`alifestd_make_ancestor_id_col`(ids, ...)	Translate ancestor ids from a column of singleton `ancestor_list`s into a pure-integer series representation.
`alifestd_make_ancestor_id_col_polars`(ids, ...)	Translate ancestor ids from a column of singleton `ancestor_list`s into a pure-integer series representation.
`alifestd_make_ancestor_list_col`(ids, ...[, ...])	Translate a column of integer ancestor id values into alife standard ancestor_list representation.
`alifestd_make_ancestor_list_col_polars`(ids, ...)	Translate a column of integer ancestor id values into alife standard ancestor_list representation.
`alifestd_make_balanced_bifurcating`(depth)	Build a perfectly balanced bifurcating tree of given depth.
`alifestd_make_balanced_bifurcating_polars`(depth)	Build a perfectly balanced bifurcating tree of given depth.
`alifestd_make_comb`(n_leaves)	Build a comb/caterpillar tree with n_leaves leaves.
`alifestd_make_comb_polars`(n_leaves)	Build a comb/caterpillar tree with n_leaves leaves.
`alifestd_make_edge_split`(n_leaves[, seed])	Build a random bifurcating tree via edge-split (PDA) sampling.
`alifestd_make_edge_split_polars`(n_leaves[, seed])	Build a random bifurcating tree via edge-split (PDA) sampling.
`alifestd_make_empty`([ancestor_id])	Create an alife standard phylogeny dataframe with zero rows.
`alifestd_make_empty_polars`([ancestor_id])	Create an alife standard phylogeny dataframe with zero rows.
`alifestd_make_leaf_split`(n_leaves[, seed])	Build a random bifurcating tree via leaf-split (Yule) sampling.
`alifestd_make_leaf_split_polars`(n_leaves[, seed])	Build a random bifurcating tree via leaf-split (Yule) sampling.
`alifestd_make_star`(n_leaves)	Build a star tree with n_leaves leaves.
`alifestd_make_star_polars`(n_leaves)	Build a star tree with n_leaves leaves.
`alifestd_mark_ancestor_origin_time_asexual`(...)	Add column ancestor_origin_time.
`alifestd_mark_ancestor_origin_time_polars`(...)	Add column ancestor_origin_time.
`alifestd_mark_clade_duration_asexual`(...[, ...])	Add column clade_duration, containing the difference between each node's origin_time and the maximum origin_time of its descendants.
`alifestd_mark_clade_duration_polars`(...[, ...])	Add column clade_duration, containing the difference between each node's origin_time and the maximum origin_time of its descendants.
`alifestd_mark_clade_duration_ratio_sister_asexual`(...)	Add column clade_duration_ratio_sister, containing the ratio of each clade's duration to that of its sister.
`alifestd_mark_clade_duration_ratio_sister_polars`(...)	Add column clade_duration_ratio_sister, containing the ratio of each clade's duration to that of its sister.
`alifestd_mark_clade_faithpd_asexual`(phylogeny_df)	Add column clade_faithpd, containing sum branch length among descendant nodes.
`alifestd_mark_clade_faithpd_polars`(...[, ...])	Add column clade_faithpd, containing sum branch length among descendant nodes.
`alifestd_mark_clade_fblr_growth_children_asexual`(...)	Add column clade_fblr_growth_children, containing the coefficient of a fblr regression fit to origin times of the leaf descendants of each node.
`alifestd_mark_clade_fblr_growth_sister_asexual`(...)	Add column clade_fblr_growth_sister, containing the coefficient of a fblr regression fit to origin times of this clade's descendant leaves versus those of its sister clade.
`alifestd_mark_clade_leafcount_ratio_sister_asexual`(...)	Add column clade_leafcount_ratio_sister, containing the ratio of each clade's leaf count to that of its sister.
`alifestd_mark_clade_leafcount_ratio_sister_polars`(...)	Add column clade_leafcount_ratio_sister, containing the ratio of each clade's leaf count to that of its sister.
`alifestd_mark_clade_logistic_growth_children_asexual`(...)	Add column clade_logistic_growth_children, containing the coefficient of a logistic regression fit to origin times of the leaf descendants of each node.
`alifestd_mark_clade_logistic_growth_sister_asexual`(...)	Add column clade_logistic_growth_sister, containing the coefficient of a logistic regression fit to origin times of this clade's descendant leaves versus those of its sister clade.
`alifestd_mark_clade_nodecount_ratio_sister_asexual`(...)	Add column clade_nodecount_ratio_sister, containing the ratio of each clade size to that of its sister.
`alifestd_mark_clade_nodecount_ratio_sister_polars`(...)	Add column clade_nodecount_ratio_sister, containing the ratio of each clade size to that of its sister.
`alifestd_mark_clade_subtended_duration_asexual`(...)	Add column clade_subtended_duration, containing the difference between each node's ancestor's origin_time and the maximum origin_time of its descendants.
`alifestd_mark_clade_subtended_duration_polars`(...)	Add column clade_subtended_duration, containing the difference between each node's ancestor's origin_time and the maximum origin_time of its descendants.
`alifestd_mark_clade_subtended_duration_ratio_sister_asexual`(...)	Add column clade_subtended_duration_ratio_sister, containing the ratio of each clade's subtended duration to that of its sister.
`alifestd_mark_clade_subtended_duration_ratio_sister_polars`(...)	Add column clade_subtended_duration_ratio_sister, containing the ratio of each clade's subtended duration to that of its sister.
`alifestd_mark_colless_index_asexual`(phylogeny_df)	Add column colless_index with Colless imbalance index for each subtree.
`alifestd_mark_colless_index_corrected_asexual`(...)	Add column colless_index_corrected with the corrected Colless index for each subtree.
`alifestd_mark_colless_index_corrected_polars`(...)	Add column colless_index_corrected with the corrected Colless index for each subtree.
`alifestd_mark_colless_index_polars`(...[, ...])	Add column colless_index with Colless imbalance index for each subtree.
`alifestd_mark_colless_like_index_mdm_asexual`(...)	Add column colless_like_index_mdm with Colless-like index using mean deviation from the median (MDM) as dissimilarity.
`alifestd_mark_colless_like_index_mdm_polars`(...)	Add column colless_like_index_mdm with Colless-like index using mean deviation from the median (MDM) as dissimilarity.
`alifestd_mark_colless_like_index_sd_asexual`(...)	Add column colless_like_index_sd with Colless-like index using sample standard deviation as dissimilarity.
`alifestd_mark_colless_like_index_sd_polars`(...)	Add column colless_like_index_sd with Colless-like index using sample standard deviation as dissimilarity.
`alifestd_mark_colless_like_index_var_asexual`(...)	Add column colless_like_index_var with Colless-like index using sample variance as dissimilarity.
`alifestd_mark_colless_like_index_var_polars`(...)	Add column colless_like_index_var with Colless-like index using sample variance as dissimilarity.
`alifestd_mark_csr_children_asexual`(phylogeny_df)	Add column csr_children, a flat array of child ids grouped by parent according to CSR offsets from the csr_offsets column.
`alifestd_mark_csr_children_polars`(...[, mark_as])	Add column csr_children, a flat array of child ids grouped by parent according to CSR offsets from the csr_offsets column.
`alifestd_mark_csr_offsets_asexual`(phylogeny_df)	Add column csr_offsets, the CSR offset where each node's children begin in the corresponding csr_children array.
`alifestd_mark_csr_offsets_polars`(phylogeny_df, *)	Add column csr_offsets, the CSR offset where each node's children begin in the corresponding csr_children array.
`alifestd_mark_first_child_id_asexual`(...[, ...])	Add column first_child_id, the smallest-id child of each node.
`alifestd_mark_first_child_id_polars`(...[, ...])	Add column first_child_id, the smallest-id child of each node.
`alifestd_mark_is_left_child_asexual`(phylogeny_df)	Add column is_left_child, containing for each node whether it is the smaller-id child.
`alifestd_mark_is_left_child_polars`(...[, ...])	Add column is_left_child, containing for each node whether it is the smaller-id child.
`alifestd_mark_is_right_child_asexual`(...[, ...])	Add column is_right_child, containing for each node whether it is the larger-id child.
`alifestd_mark_is_right_child_polars`(...[, ...])	Add column is_right_child, containing for each node whether it is the larger-id child.
`alifestd_mark_leaves`(phylogeny_df[, mutate, ...])	What rows are ancestor to no other row?
`alifestd_mark_leaves_polars`(phylogeny_df, *)	Add column is_leaf marking rows that are ancestor to no other row.
`alifestd_mark_left_child_asexual`(phylogeny_df)	Add column left_child, containing for each node its smallest-id child.
`alifestd_mark_left_child_polars`(phylogeny_df, *)	Add column left_child_id, containing for each node its smallest-id child.
`alifestd_mark_lineage_cummax_asexual`(...[, ...])	Add column with maximum of `values` along each lineage.
`alifestd_mark_lineage_cummax_polars`(...[, ...])	Add column with maximum of `values` along each lineage.
`alifestd_mark_lineage_cummin_asexual`(...[, ...])	Add column with minimum of `values` along each lineage.
`alifestd_mark_lineage_cummin_polars`(...[, ...])	Add column with minimum of `values` along each lineage.
`alifestd_mark_lineage_cumprod_asexual`(...[, ...])	Add column with cumulative product of `values` along each lineage.
`alifestd_mark_lineage_cumprod_polars`(...[, ...])	Add column with cumulative product of `values` along each lineage.
`alifestd_mark_lineage_cumsum_asexual`(...[, ...])	Add column with cumulative sum of `values` along each lineage.
`alifestd_mark_lineage_cumsum_polars`(...[, ...])	Add column with cumulative sum of `values` along each lineage.
`alifestd_mark_max_descendant_origin_time_asexual`(...)	Add column max_descendant_origin_time, excluding self.
`alifestd_mark_max_descendant_origin_time_polars`(...)	Add column max_descendant_origin_time, excluding self.
`alifestd_mark_next_sibling_id_asexual`(...[, ...])	Add column next_sibling_id, the next-highest id sharing the same parent.
`alifestd_mark_next_sibling_id_polars`(...[, ...])	Add column next_sibling_id, the next-highest id sharing the same parent.
`alifestd_mark_node_depth_asexual`(phylogeny_df)	Add column node_depth, counting the number of nodes between a node and the root.
`alifestd_mark_node_depth_polars`(phylogeny_df, *)	Add column node_depth, counting the number of nodes between a node and the root.
`alifestd_mark_num_children_asexual`(phylogeny_df)	Add column num_children, counting for each node the number of nodes it is parent to.
`alifestd_mark_num_children_polars`(...[, mark_as])	Add column num_children, counting for each node the number of nodes it is parent to.
`alifestd_mark_num_descendants_asexual`(...[, ...])	Add column num_descendants, excluding self.
`alifestd_mark_num_descendants_polars`(...[, ...])	Add column num_descendants, excluding self.
`alifestd_mark_num_leaves_asexual`(phylogeny_df)	Add column num_leaves with count of all descendant leaves, including self if a leaf.
`alifestd_mark_num_leaves_polars`(phylogeny_df, *)	Add column num_leaves with count of all descendant leaves, including self if a leaf.
`alifestd_mark_num_leaves_sibling_asexual`(...)	Mark the number of leaves descendant from each node's siblings.
`alifestd_mark_num_leaves_sibling_polars`(...)	Mark the number of leaves descendant from each node's siblings.
`alifestd_mark_num_preceding_leaves_asexual`(...)	Add column num_preceding_leaves with count of all leaves occurring before the present node in an inorder traversal.
`alifestd_mark_num_preceding_leaves_polars`(...)	Add column num_preceding_leaves with count of all leaves occurring before the present node in an inorder traversal.
`alifestd_mark_oldest_root`(phylogeny_df[, ...])	Mark the oldest root, measured by lowest origin_time (if available) or otherwise lowest id.
`alifestd_mark_oldest_root_polars`(phylogeny_df, *)	Mark the oldest root, measured by lowest origin_time (if available) or otherwise lowest id.
`alifestd_mark_origin_time_delta_asexual`(...)	Add columns origin_time_delta and ancestor_origin_time.
`alifestd_mark_origin_time_delta_polars`(...)	Add columns origin_time_delta and ancestor_origin_time.
`alifestd_mark_ot_mrca_asexual`(phylogeny_df)	Appends columns characterizing the Most Recent Common Ancestor (MRCA) of the entire extant population at each taxon's origin_time.
`alifestd_mark_ot_mrca_polars`(phylogeny_df, *)	Appends columns characterizing the Most Recent Common Ancestor (MRCA) of the entire extant population at each taxon's origin_time.
`alifestd_mark_prev_sibling_id_asexual`(...[, ...])	Add column prev_sibling_id, the next-lowest id sharing the same parent.
`alifestd_mark_prev_sibling_id_polars`(...[, ...])	Add column prev_sibling_id, the next-lowest id sharing the same parent.
`alifestd_mark_right_child_asexual`(phylogeny_df)	Add column right_child, containing for each node its largest-id child.
`alifestd_mark_right_child_polars`(phylogeny_df, *)	Add column right_child_id, containing for each node its largest-id child.
`alifestd_mark_root_id`(phylogeny_df[, ...])	Add column root_id, containing the id of entries' ultimate ancestor.
`alifestd_mark_root_id_polars`(phylogeny_df, *)	Add column root_id, containing the id of entries' ultimate ancestor.
`alifestd_mark_roots`(phylogeny_df[, mutate, ...])	Create column is_root to mark rows with no ancestor.
`alifestd_mark_roots_polars`(phylogeny_df, *)	Create column is_root to mark rows with no ancestor.
`alifestd_mark_sackin_index_asexual`(phylogeny_df)	Add column sackin_index with Sackin index for each subtree.
`alifestd_mark_sackin_index_polars`(...[, mark_as])	Add column sackin_index with Sackin index for each subtree.
`alifestd_mark_sample_tips_asexual`(...[, ...])	Mark a random subsample of n_sample tips.
`alifestd_mark_sample_tips_canopy_asexual`(...)	Mark the n_sample leaves with the largest criterion values.
`alifestd_mark_sample_tips_canopy_polars`(...)	Mark the n_sample leaves with the largest criterion values.
`alifestd_mark_sample_tips_clade_asexual`(...)	Mark tips belonging to a randomly sampled clade of at most n_sample tips.
`alifestd_mark_sample_tips_clade_polars`(...)	Mark tips belonging to a randomly sampled clade of at most n_sample tips.
`alifestd_mark_sample_tips_lineage_asexual`(...)	Mark the n_sample leaves closest to the lineage of a target leaf.
`alifestd_mark_sample_tips_lineage_polars`(...)	Mark the n_sample leaves closest to the lineage of a target leaf.
`alifestd_mark_sample_tips_lineage_stratified_asexual`(...)	Mark leaves per stratified group, chosen by proximity to the lineage of a target leaf.
`alifestd_mark_sample_tips_lineage_stratified_polars`(...)	Mark leaves per stratified group, chosen by proximity to the lineage of a target leaf.
`alifestd_mark_sample_tips_polars`(...[, ...])	Mark a random subsample of n_sample tips.
`alifestd_mark_sample_tips_uniform_asexual`(...)	Mark a random subsample of n_sample tips.
`alifestd_mark_sample_tips_uniform_polars`(...)	Mark a random subsample of n_sample tips.
`alifestd_mark_sister_asexual`(phylogeny_df[, ...])	Add column sister, containing the id of each node's sibling.
`alifestd_mark_sister_polars`(phylogeny_df, *)	Add column sister_id, containing the id of each node's sibling.
`alifestd_mask_descendants_asexual`(phylogeny_df)	For given ancestor nodes, create a mask identifying those nodes and all descendants.
`alifestd_mask_descendants_polars`(...)	For given ancestor nodes, create a mask identifying those nodes and all descendants.
`alifestd_mask_monomorphic_clades_asexual`(...)	Compute a mask marking "monomorphic" clades, where all members with a defined trait value share the same trait value.
`alifestd_parse_ancestor_id`(ancestor_list_str)	Parse at most a single ancestor id from an ancestor_list field.
`alifestd_parse_ancestor_ids`(ancestor_list_str)	Parse ancestor ids from an ancestor_list field.
`alifestd_pipe_unary_ops`(phylogeny_df, *unary_ops)	Pipe a phylogeny DataFrame through a sequence of unary operations.
`alifestd_pipe_unary_ops_polars`(phylogeny_df, ...)	Pipe a phylogeny DataFrame through a sequence of unary operations.
`alifestd_prefix_roots`(phylogeny_df, *[, ...])	Add new roots to the phylogeny, prefixing existing roots.
`alifestd_prefix_roots_polars`(phylogeny_df, *)	Add new roots to the phylogeny, prefixing existing roots.
`alifestd_prune_extinct_lineages_asexual`(...)	Drop taxa without extant descendants.
`alifestd_prune_extinct_lineages_polars`(...)	Drop taxa without extant descendants.
`alifestd_reroot_at_id_asexual`(phylogeny_df, ...)	Reroot phylogeny, preserving topology.
`alifestd_reroot_at_id_polars`(phylogeny_df, ...)	Reroot phylogeny at specified node id, preserving topology.
`alifestd_sample_triplet_comparisons_asexual`(...)	Sample triplet comparisons between two asexual phylogenetic trees in alife standard form, creating a DataFrame with the triplet categorizations and comparison results as well as corresponding data from MRCA row within the first tree.
`alifestd_screen_trait_defined_clades_fisher_asexual`(...)	Perform a screen for trait-defined clades based on Fisher's exact test.
`alifestd_screen_trait_defined_clades_fitch_asexual`(...)	Perform a maximum parsimony screen for trait-defined clades using Fitch's algorithm.
`alifestd_screen_trait_defined_clades_naive_asexual`(...)	Perform a naive screen for trait-defined clades.
`alifestd_sort_children_asexual`(phylogeny_df, ...)	Reorder rows so children are sorted by the given criterion column, gathering children into contiguous rows.
`alifestd_sort_children_polars`(phylogeny_df, ...)	Reorder rows so children are sorted by the given criterion column, gathering children into contiguous rows.
`alifestd_splay_polytomies`(phylogeny_df[, mutate])	Use a simple splay strategy to resolve polytomies, converting them into bifurcations.
`alifestd_splay_polytomies_polars`(phylogeny_df)	Use a simple splay strategy to resolve polytomies, converting them into bifurcations.
`alifestd_sum_origin_time_deltas_asexual`(...)	Sum differences between taxa origin times and their ancestors' origin time.
`alifestd_sum_origin_time_deltas_polars`(...)	Sum origin_time_delta values.
`alifestd_test_leaves_isomorphic_asexual`(df1, ...)	Test if phylogenetic relationships between leaf nodes are topologically isomorphic between two phylogenies.
`alifestd_test_leaves_isomorphic_polars`(df1, ...)	Test if phylogenetic relationships between leaf nodes are topologically isomorphic between two phylogenies.
`alifestd_to_iplotx_pandas`(phylogeny_df[, mutate])	Wrap a pandas phylogeny DataFrame for use with iplotx.
`alifestd_to_iplotx_polars`(phylogeny_df)	Wrap a polars phylogeny DataFrame for use with iplotx.
`alifestd_to_working_format`(phylogeny_df[, ...])	Re-encode phylogeny_df to facilitate efficient analysis and transformation operations.
`alifestd_to_working_format_polars`(phylogeny_df)	Re-encode phylogeny_df to facilitate efficient analysis and transformation operations.
`alifestd_topological_sensitivity_warned`(*, ...)	Decorator that emits a topological sensitivity warning before the wrapped function executes.
`alifestd_topological_sensitivity_warned_polars`(*, ...)	Decorator that emits a topological sensitivity warning before the wrapped function executes.
`alifestd_topological_sort`(phylogeny_df[, mutate])	Sort rows so all organisms follow members of their ancestor_list.
`alifestd_topological_sort_polars`(phylogeny_df)	Sort rows so all organisms follow members of their ancestor_id.
`alifestd_try_add_ancestor_id_col`(phylogeny_df)	Add an ancestor_id column to the input DataFrame if the phylogeny is asexual and the column does not already exist.
`alifestd_try_add_ancestor_id_col_polars`(...)	Add an ancestor_id column to the input DataFrame if the phylogeny is asexual and the column does not already exist.
`alifestd_try_add_ancestor_list_col`(phylogeny_df)	Add an ancestor_list column to the input DataFrame if the column does not already exist.
`alifestd_try_add_ancestor_list_col_polars`(...)	Add an ancestor_list column to the input DataFrame if the column does not already exist.
`alifestd_ultrametricize`(phylogeny_df[, ...])	Adjust tip origin_time values so all tips share the same time.
`alifestd_ultrametricize_polars`(phylogeny_df, *)	Adjust tip origin_time values so all tips share the same time.
`alifestd_unfurl_lineage_asexual`(...[, mutate])	List leaf_id and its ancestor id sequence through tree root.
`alifestd_unfurl_traversal_inorder_asexual`(...)	List id values in inorder traversal order, with left children visited first.
`alifestd_unfurl_traversal_inorder_polars`(...)	List node indices in inorder traversal order, with left children visited first.
`alifestd_unfurl_traversal_levelorder_asexual`(...)	List id values in levelorder (BFS) traversal order.
`alifestd_unfurl_traversal_levelorder_polars`(...)	List node indices in levelorder (BFS) traversal order.
`alifestd_unfurl_traversal_postorder_asexual`(...)	List id values in postorder traversal order.
`alifestd_unfurl_traversal_postorder_contiguous_asexual`(...)	List node indices in DFS postorder traversal order, with subtree contiguity.
`alifestd_unfurl_traversal_postorder_contiguous_polars`(...)	List node indices in DFS postorder traversal order, with subtree contiguity.
`alifestd_unfurl_traversal_postorder_polars`(...)	List node indices in postorder traversal order.
`alifestd_unfurl_traversal_preorder_asexual`(...)	List id values in DFS preorder traversal order.
`alifestd_unfurl_traversal_preorder_polars`(...)	List node indices in DFS preorder traversal order.
`alifestd_unfurl_traversal_semiorder_asexual`(...)	List id values in semiorder traversal order.
`alifestd_unfurl_traversal_semiorder_polars`(...)	List node indices in semiorder traversal order.
`alifestd_unfurl_traversal_topological_asexual`(...)	List id values in topological traversal order.
`alifestd_unfurl_traversal_topological_polars`(...)	List node indices in topological traversal order.
`alifestd_validate`(phylogeny_df[, mutate, ...])	Is the phylogeny compliant to alife data standards?
`alifestd_warn_topological_sensitivity`(...)	Emit a warning if phylogeny_df contains columns that may be invalidated by topological operations.
`alifestd_warn_topological_sensitivity_polars`(...)	Emit a warning if phylogeny_df contains columns that may be invalidated by topological operations.

Classes

`AlifestdIplotxShimNumpy`	Numpy-backed iplotx `TreeDataProvider` for alife-standard data.
`AlifestdIplotxShimPandas`	Iplotx `TreeDataProvider` for pandas alife-standard dataframes.
`AlifestdIplotxShimPolars`	Iplotx `TreeDataProvider` for polars alife-standard dataframes.

class AlifestdIplotxShimNumpy

Numpy-backed iplotx TreeDataProvider for alife-standard data.

This class assumes contiguous ids (id == row index) and topologically sorted rows (ancestors appear before descendants).

Parameters

ancestor_ids (np.ndarray): Integer array of ancestor ids; roots satisfy ancestor_ids[i] == i.
names (np.ndarray, optional): Per-node name strings.
branch_lengths (np.ndarray, optional): Per-node branch lengths (edge from parent to this node).
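
The layout this class assumes (contiguous ids, topologically sorted rows, roots pointing at themselves) can be checked on an ancestor-id array before constructing the shim. The following is a minimal sketch on plain Python sequences; the helper name `check_working_layout` is hypothetical and not part of the library.

```python
from typing import Sequence


def check_working_layout(ancestor_ids: Sequence[int]) -> bool:
    """Return True if every node's ancestor sits at or before its own row.

    Assumes ids are contiguous, so a node's id equals its row index, and
    that roots are encoded as ancestor_ids[i] == i.
    """
    return all(0 <= anc <= i for i, anc in enumerate(ancestor_ids))


# Node 0 is a root; nodes 1 and 2 are its children; node 3 descends from 1.
print(check_working_layout([0, 0, 0, 1]))  # True
# Node 1 names node 2 as its ancestor, but node 2 appears later.
print(check_working_layout([0, 2, 0]))  # False
```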

__init__(ancestor_ids: ndarray, names: ndarray | None = None, branch_lengths: ndarray | None = None) → None

static check_dependencies() → bool

static get_branch_length(node: _AlifestdNode) → float | None

get_children(node: _AlifestdNode) → Sequence[_AlifestdNode]

get_leaves(node: _AlifestdNode | None = None) → Sequence[_AlifestdNode]

get_root() → _AlifestdNode

get_subtree(node: _AlifestdNode) → AlifestdIplotxShimNumpy

is_rooted() → bool

levelorder() → Iterable[_AlifestdNode]

postorder() → Iterable[_AlifestdNode]

preorder() → Iterable[_AlifestdNode]

static tree_type() → type

class AlifestdIplotxShimPandas

Iplotx TreeDataProvider for pandas alife-standard dataframes.

The dataframe must be asexual with contiguous ids and topologically sorted rows. An ancestor_id column will be derived from ancestor_list if needed.

Parameters

tree (pd.DataFrame): Pandas phylogeny dataframe in alife standard format.
mutate (bool, default False): If True, allow modification of the input dataframe.

__init__(tree: DataFrame, mutate: bool = False) → None

static check_dependencies() → bool

static tree_type() → type

class AlifestdIplotxShimPolars

Iplotx TreeDataProvider for polars alife-standard dataframes.

The dataframe must be asexual with contiguous ids and topologically sorted rows.

Parameters

tree (polars.DataFrame): Polars phylogeny dataframe in alife standard format.

__init__(tree: DataFrame) → None

static check_dependencies() → bool

static tree_type() → type

alifestd_add_global_root(phylogeny_df: DataFrame, mutate: bool = False, root_attrs: Mapping[str, Any] = mappingproxy({})) → DataFrame

Add a new global root node that all existing roots point to.

The new root node will have columns id, ancestor_id (if applicable), ancestor_list (if applicable), and any columns specified in root_attrs. All other columns will be NaN for the new root row.

Parameters

phylogeny_df (pd.DataFrame)

Phylogeny dataframe in alife standard format.

mutate (bool, default False)

If True, allows mutation of the input dataframe.

root_attrs (Mapping[str, Any], default {})

Column values to set on the new global root row, e.g., {"origin_time": 0.0, "taxon_label": "root"}.

Keys "id", "ancestor_id", and "ancestor_list" are reserved and may not be specified; a ValueError is raised if any are present.

Returns

pd.DataFrame: The phylogeny dataframe with a new global root added.

Raises

ValueError: If root_attrs contains reserved keys.

The input dataframe is not mutated by this operation unless mutate is set True. Even when mutate is set True, the operation does not occur in place; use the return value to get the transformed phylogeny dataframe.

alifestd_add_global_root_polars(phylogeny_df: DataFrame) → DataFrame

Add a new global root node that all existing roots point to.

alifestd_add_inner_knuckles_asexual(phylogeny_df: DataFrame, mutate: bool = False) → DataFrame

For all inner nodes, add a subtending unifurcation (“knuckle”).

The input dataframe is not mutated by this operation unless mutate is set True. Even when mutate is set True, the operation does not occur in place; use the return value to get the transformed phylogeny dataframe.

alifestd_add_inner_knuckles_polars(phylogeny_df: DataFrame) → DataFrame

For all inner nodes, add a subtending unifurcation (“knuckle”).

Parameters

phylogeny_df (polars.DataFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids and topological sort order.

Returns

polars.DataFrame: The phylogeny with knuckle nodes added for each inner node.

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids and topologically sorted rows.

Returns

polars.DataFrame: The phylogeny with inner niblings added.

Parameters

phylogeny_df (pd.DataFrame): Phylogeny dataframe in Alife standard format.
mutate (bool, optional): Allow in-place mutations of the input dataframe, by default False.
progress_wrap (typing.Callable, optional): Pass tqdm or equivalent to display a progress bar.
sep_forest (str, default “\n”): Separator placed between the ;-terminated trees of a forest.
taxon_label (str, optional): Column to use for taxon labels, by default None.
unsafe_symbols (str, optional): Characters that force a taxon label to be single-quoted when present. Defaults to the Newick-reserved symbols (and whitespace).

alifestd_as_newick_polars(phylogeny_df: DataFrame, *, taxon_label: str | None = None, progress_wrap: Callable = <function <lambda>>) → str

Convert phylogeny dataframe to Newick format.

Parameters

phylogeny_df (polars.DataFrame): Phylogeny dataframe in Alife standard format.
taxon_label (str, optional): Column to use for taxon labels, by default None.
progress_wrap (typing.Callable, optional): Pass tqdm or equivalent to display a progress bar.
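
To illustrate what a Newick conversion produces, here is a minimal sketch that serializes an ancestor-id array with labels. It is not the library's implementation: the function name `to_newick_sketch` is hypothetical, roots are assumed to be encoded as ancestor_ids[i] == i, branch lengths and quoting of unsafe labels are omitted, and writing one tree per line is a choice made for this sketch.

```python
from typing import Dict, List, Sequence


def to_newick_sketch(ancestor_ids: Sequence[int], labels: Sequence[str]) -> str:
    """Serialize a forest given as ancestor ids into Newick text."""
    children: Dict[int, List[int]] = {}
    roots: List[int] = []
    for node, anc in enumerate(ancestor_ids):
        if anc == node:
            roots.append(node)  # self-ancestry marks a root
        else:
            children.setdefault(anc, []).append(node)

    def render(node: int) -> str:
        # Inner nodes wrap their rendered children in parentheses.
        kids = children.get(node, [])
        inner = "(" + ",".join(render(k) for k in kids) + ")" if kids else ""
        return inner + labels[node]

    return "\n".join(render(root) + ";" for root in roots)


print(to_newick_sketch([0, 0, 0, 1, 1], ["r", "a", "b", "c", "d"]))
# ((c,d)a,b)r;
```

The recursive helper keeps the sketch short; very deep trees would need an iterative traversal to stay within Python's recursion limit.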

Parameters

phylogeny_df (pd.DataFrame): Phylogeny in alife standard format.
mutate (bool, default False): If True, allows in-place modification of phylogeny_df.
criterion (str, default “origin_time”): Column name used to measure distance between taxa and their MRCA.
progress_wrap (callable, optional): Wrapper for progress display (e.g., tqdm).

Returns

np.ndarray: n x n float64 matrix of pairwise distances. Entry [i, j] is NaN when organisms i and j share no common ancestor.

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny in working format (i.e., topologically sorted with contiguous ids and an ancestor_id column, or an ancestor_list column from which ancestor_id can be derived).

criterion (str or polars.Expr, default “origin_time”)

Column name or polars expression used to measure distance between taxa and their MRCA.

progress_wrap (callable, optional)

Wrapper for progress display (e.g., tqdm).

Returns

numpy.ndarray: Array of shape (n, n) with dtype float64, containing pairwise distances. Entries are NaN where organisms share no common ancestor.

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny in working format (i.e., topologically sorted with contiguous ids and an ancestor_id column, or an ancestor_list column from which ancestor_id can be derived).

progress_wrap (callable, optional)

Wrapper for progress display (e.g., tqdm).

Returns

numpy.ndarray: Array of shape (n, n) with dtype int64, containing MRCA ids for each pair of organisms. Entries are -1 where organisms share no common ancestor.

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny in working format (i.e., topologically sorted with contiguous ids and an ancestor_id column, or an ancestor_list column from which ancestor_id can be derived).

target_id (int)

The target organism id to compute MRCA against.

progress_wrap (callable, optional)

Wrapper for progress display (e.g., tqdm).

Returns

numpy.ndarray: Array of shape (n,) with dtype int64, containing MRCA ids for each organism with the target. Entries are -1 where organisms share no common ancestor with the target.

Parameters

phylogeny_df (pandas.DataFrame): The phylogeny as a dataframe in alife standard format.
insert (bool): Whether the operation inserts new nodes.
delete (bool): Whether the operation deletes nodes.
update (bool): Whether the operation updates ancestor relationships.

Input dataframe is not mutated by this operation.

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame): The phylogeny as a dataframe in alife standard format.
insert (bool): Whether the operation inserts new nodes.
delete (bool): Whether the operation deletes nodes.
update (bool): Whether the operation updates ancestor relationships.

Parameters

phylogeny_df (pd.DataFrame): Input phylogeny in alife standard format.
criterion (str, default “origin_time”): Column whose values define the time axis for dilation.
dilation (int): Width of the dilation window. Must be a positive integer.
mutate (bool, default False): If True, allow in-place mutation of the input dataframe.

Returns

pd.DataFrame: Coarsened phylogeny in alife standard format.

Raises

NotImplementedError: If input is not topologically sorted with contiguous ids.
ValueError: If dilation is not a positive integer, if criterion is not present in phylogeny_df, or if criterion is "id" or "ancestor_id".

alifestd_coarsen_dilate_polars(phylogeny_df: DataFrame | LazyFrame, *, criterion: str = 'origin_time', dilation: int = 1) → DataFrame

Coarsen a phylogeny by collapsing inner nodes within dilation windows.

All inner (non-leaf) nodes with criterion values in the half-open interval [n, n + dilation), where n % dilation == 0, are collapsed to a single inner node at n.

Tip nodes are never moved. The MRCA of two tips may only shift backward (never forward), by at most dilation units, and never across an n % dilation == 0 boundary.
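
The window arithmetic described above can be sketched in a few lines. This only illustrates how a criterion value maps to the start n of its window [n, n + dilation); it does not perform the actual collapse of nodes, and the helper name `dilate_floor` is hypothetical.

```python
def dilate_floor(value: int, dilation: int) -> int:
    """Map a criterion value to the start n of its window [n, n + dilation)."""
    if dilation <= 0:
        raise ValueError("dilation must be a positive integer")
    return value - value % dilation


# With dilation=4, inner nodes at times 4, 5, 6 and 7 share the window [4, 8),
# while a node at time 8 starts the next window.
print([dilate_floor(t, 4) for t in (4, 5, 6, 7, 8)])  # [4, 4, 4, 4, 8]
```

Because every value in a window maps to the window's start, a collapsed node never moves forward and never crosses a window boundary.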

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame): Input phylogeny in alife standard format.
criterion (str, default “origin_time”): Column whose values define the time axis for dilation.
dilation (int): Width of the dilation window. Must be a positive integer.

Returns

polars.DataFrame: Coarsened phylogeny in alife standard format.

Raises

NotImplementedError: If input is not topologically sorted with contiguous ids.
ValueError: If dilation is not a positive integer, if criterion is not present in phylogeny_df, or if criterion is "id" or "ancestor_id".

Parameters

phylogeny_df (pd.DataFrame): Input phylogeny table.
default_agg (str, default “first”): Aggregation function to apply to any column not in the hard-coded overrides.

Returns

Dict[str, str]

Mapping of column name to aggregation method. Three columns are overridden as follows:

“destruction_time”: “last”
“is_root”: “first”
“origin_time”: “first”

Columns named

“ancestor_id”
“ancestor_list”
“branch_length”
“edge_length”
“id”
“is_leaf”

will be excluded from the result. All other (non-excluded) columns use default_agg.
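
The mapping rules above can be sketched as a dictionary comprehension. This is an illustration built only from the overrides and exclusions listed in this entry, not the library's code; the names `OVERRIDES`, `EXCLUDED`, and `make_agg_sketch` are hypothetical.

```python
from typing import Dict, Iterable

# Overrides and exclusions as listed in the Returns description above.
OVERRIDES = {
    "destruction_time": "last",
    "is_root": "first",
    "origin_time": "first",
}
EXCLUDED = {
    "ancestor_id",
    "ancestor_list",
    "branch_length",
    "edge_length",
    "id",
    "is_leaf",
}


def make_agg_sketch(columns: Iterable[str], default_agg: str = "first") -> Dict[str, str]:
    """Map each non-excluded column to its override or to default_agg."""
    return {
        col: OVERRIDES.get(col, default_agg)
        for col in columns
        if col not in EXCLUDED
    }


cols = ["id", "ancestor_id", "origin_time", "destruction_time", "fitness"]
print(make_agg_sketch(cols, default_agg="mean"))
# {'origin_time': 'first', 'destruction_time': 'last', 'fitness': 'mean'}
```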

alifestd_coerce_chronological_consistency(phylogeny_df: DataFrame, mutate: bool = False) → DataFrame

For any taxon with an origin time preceding its parent’s, set its origin time to the parent’s origin time.

If an inconsistency is detected, the corrected phylogeny will be returned sorted in topological order.
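
The correction rule can be sketched as a single pass over rows. This is a minimal illustration, not the library's implementation: it assumes the input already has contiguous ids and topologically sorted rows, with roots encoded as ancestor_ids[i] == i, and the function name `coerce_origin_times` is hypothetical.

```python
from typing import List, Sequence


def coerce_origin_times(ancestor_ids: Sequence[int], origin_times: Sequence[float]) -> List[float]:
    """Clamp each taxon's origin time to be no earlier than its parent's."""
    fixed = list(origin_times)
    for node, anc in enumerate(ancestor_ids):
        # Parent rows come first, so fixed[anc] is already final here.
        fixed[node] = max(fixed[node], fixed[anc])
    return fixed


# Node 1 (child of 0) claims origin time 3.0, before its parent's 5.0;
# node 2 (child of 1) then inherits the corrected bound.
print(coerce_origin_times([0, 0, 1], [5.0, 3.0, 4.0]))  # [5.0, 5.0, 5.0]
```

Processing rows in topological order lets a correction at a parent propagate to all of its descendants in one pass.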

alifestd_collapse_trunk_asexual(phylogeny_df: DataFrame, mutate: bool = False) → DataFrame

Collapse entries masked by is_trunk column, keeping only the oldest root.

Masked entries must be contiguous, meaning that no non-trunk entry can be an ancestor of a trunk entry.

The input dataframe is not mutated by this operation unless mutate is set True. Even when mutate is set True, the operation does not occur in place; use the return value to get the transformed phylogeny dataframe.

Parameters

phylogeny_df (pandas.DataFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_downsample (int, optional)

Number of tips to retain. If None, defaults to the count of leaves with the maximum criterion value.

mutate (bool, default False)

Are side effects on the input argument phylogeny_df allowed?

criterion (str, default “origin_time”)

Column name used to rank leaves. The n_downsample leaves with the largest values in this column are retained. Ties are broken arbitrarily.

Raises

ValueError: If criterion is not a column in phylogeny_df.

Returns

pandas.DataFrame: The pruned phylogeny in alife standard format.

alifestd_downsample_tips_canopy_polars(phylogeny_df: DataFrame, n_downsample: int | None = None, criterion: str | Expr = 'origin_time') → DataFrame

Retain the n_downsample leaves with the largest criterion values and prune extinct lineages.

If n_downsample is None, it defaults to the number of leaves that share the maximum value of the criterion column. If n_downsample is greater than or equal to the number of leaves in the phylogeny, the whole phylogeny is returned. Ties are broken arbitrarily.

Only supports asexual phylogenies.
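
The leaf-selection rule described above can be sketched on plain lists. This is an illustration only: it assumes contiguous ids with roots encoded as ancestor_ids[i] == i, it stops at choosing which leaves to keep (pruning the remaining lineages is omitted), and the function name `select_canopy_leaves` is hypothetical.

```python
from typing import List, Optional, Sequence


def select_canopy_leaves(
    ancestor_ids: Sequence[int],
    criterion: Sequence[float],
    n_downsample: Optional[int] = None,
) -> List[int]:
    """Pick the leaves with the largest criterion values."""
    has_child = {anc for node, anc in enumerate(ancestor_ids) if anc != node}
    leaves = [node for node in range(len(ancestor_ids)) if node not in has_child]
    if n_downsample is None:
        # Default: keep every leaf that shares the maximum criterion value.
        top = max(criterion[leaf] for leaf in leaves)
        n_downsample = sum(criterion[leaf] == top for leaf in leaves)
    ranked = sorted(leaves, key=lambda leaf: criterion[leaf], reverse=True)
    return sorted(ranked[:n_downsample])


ancestor_ids = [0, 0, 0, 1, 1]
origin_time = [0.0, 1.0, 3.0, 3.0, 2.0]
print(select_canopy_leaves(ancestor_ids, origin_time))  # [2, 3]
print(select_canopy_leaves(ancestor_ids, origin_time, 10))  # [2, 3, 4]
```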

Parameters

phylogeny_df (polars.DataFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_downsample (int, optional)

Number of tips to retain. If None, defaults to the count of leaves with the maximum criterion value.

criterion (str or polars.Expr, default “origin_time”)

Column name or polars expression used to rank leaves. The n_downsample leaves with the largest values are retained. Ties are broken arbitrarily.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column.
ValueError: If criterion is not a column in phylogeny_df.

Returns

polars.DataFrame: The pruned phylogeny in alife standard format.

Parameters

phylogeny_df (polars.DataFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_downsample (int)

Number of tips to retain.

seed (int, optional)

Integer seed for deterministic behavior.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column or if ids are non-contiguous or not topologically sorted.

Returns

polars.DataFrame: The downsampled phylogeny in alife standard format.

Parameters

phylogeny_df (pandas.DataFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_downsample (int)

Number of tips to retain.

mutate (bool, default False)

Are side effects on the input argument phylogeny_df allowed?

seed (int, optional)

Random seed for reproducible target-leaf selection when there are ties in criterion_target.

criterion_delta (str, default “origin_time”)

Column name used to compute the off-lineage delta for each leaf. The delta is the absolute difference between a leaf’s value and its MRCA’s value in this column.

criterion_target (str, default “origin_time”)

Column name used to select the target leaf. The leaf with the largest value in this column is chosen as the target. Note that ties are broken by random sample, allowing a seed to be provided.

progress_wrap (Callable, optional)

Pass tqdm or equivalent to display a progress bar.

Raises

ValueError: If criterion_delta or criterion_target is not a column in phylogeny_df.

Returns

pandas.DataFrame: The pruned phylogeny in alife standard format.

alifestd_downsample_tips_lineage_polars(phylogeny_df: DataFrame, n_downsample: int, seed: int | None = None, *, criterion_delta: str | Expr = 'origin_time', criterion_target: str | Expr = 'origin_time', progress_wrap: Callable = <function <lambda>>) → DataFrame

Retain the n_downsample leaves closest to the lineage of a target leaf.

Selects a target leaf as the leaf with the largest criterion_target value (ties broken randomly). For each leaf, the most recent common ancestor (MRCA) with the target leaf is identified and the “off-lineage delta” is computed as the absolute difference between the leaf’s criterion_delta value and its MRCA’s criterion_delta value. The n_downsample leaves with the smallest off-lineage deltas are retained.

If n_downsample is greater than or equal to the number of leaves in the phylogeny, the whole phylogeny is returned. Ties in off-lineage delta are broken arbitrarily.

Only supports asexual phylogenies.
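
The off-lineage delta described above can be sketched on plain lists. This is an illustration rather than the library's implementation: it assumes roots are encoded as ancestor_ids[i] == i, takes the target leaf as given instead of selecting it, and uses the hypothetical name `off_lineage_deltas`. Leaves that share no ancestor with the target are simply left out of the result here.

```python
from typing import Dict, Optional, Sequence


def off_lineage_deltas(
    ancestor_ids: Sequence[int],
    criterion: Sequence[float],
    target: int,
) -> Dict[int, float]:
    """Map each leaf to |criterion[leaf] - criterion[MRCA(leaf, target)]|."""
    # Collect the target's lineage from the target up to its root.
    lineage = {target}
    node = target
    while ancestor_ids[node] != node:
        node = ancestor_ids[node]
        lineage.add(node)

    has_child = {anc for n, anc in enumerate(ancestor_ids) if anc != n}
    deltas: Dict[int, float] = {}
    for leaf in range(len(ancestor_ids)):
        if leaf in has_child:
            continue
        # Walk upward; the first node on the target lineage is the MRCA.
        mrca: Optional[int] = leaf
        while mrca is not None and mrca not in lineage:
            mrca = ancestor_ids[mrca] if ancestor_ids[mrca] != mrca else None
        if mrca is not None:
            deltas[leaf] = abs(criterion[leaf] - criterion[mrca])
    return deltas


ancestor_ids = [0, 0, 0, 1, 1, 2]
origin_time = [0.0, 1.0, 1.0, 2.0, 5.0, 3.0]
deltas = off_lineage_deltas(ancestor_ids, origin_time, target=4)
print(deltas)  # {3: 1.0, 4: 0.0, 5: 3.0}
print(sorted(deltas, key=deltas.get)[:2])  # [4, 3]
```

The last line keeps the two leaves with the smallest deltas, which corresponds to n_downsample=2 in this toy tree.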

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_downsample (int)

Number of tips to retain.

seed (int, optional)

Random seed for reproducible target-leaf selection when there are ties in criterion_target.

criterion_delta (str or polars.Expr, default “origin_time”)

Column name or polars expression used to compute the off-lineage delta for each leaf. The delta is the absolute difference between a leaf’s value and its MRCA’s value.

criterion_target (str or polars.Expr, default “origin_time”)

Column name or polars expression used to select the target leaf. The leaf with the largest value is chosen as the target. Note that ties are broken by random sample, allowing a seed to be provided.

progress_wrap (Callable, optional)

Pass tqdm or equivalent to display a progress bar.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column or if ids are non-contiguous or not topologically sorted.
ValueError: If criterion_delta or criterion_target is not a column in phylogeny_df.

Returns

polars.DataFrame: The pruned phylogeny in alife standard format.

Parameters

phylogeny_df (pandas.DataFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_downsample (int, optional)

Desired number of retained tips. If None, every distinct criterion_stratify value forms its own group.

mutate (bool, default False)

Are side effects on the input argument phylogeny_df allowed?

seed (int, optional)

Random seed for reproducible target-leaf selection when there are ties in criterion_target.

criterion_delta (str, default “origin_time”)

Column name used to compute the off-lineage delta for each leaf. The delta is the absolute difference between a leaf’s value and its MRCA’s value in this column.

criterion_stratify (str, default “origin_time”)

Column name used to stratify leaves into groups.

criterion_target (str, default “origin_time”)

Column name used to select the target leaf. The leaf with the largest value in this column is chosen as the target. Note that ties are broken by random sample, allowing a seed to be provided.

n_tips_per_stratum (int, default 1)

Number of tips to retain per stratified group. Must evenly divide n_downsample when n_downsample is not None.

progress_wrap (Callable, optional)

Pass tqdm or equivalent to display a progress bar.

Raises

ValueError: If criterion_delta, criterion_stratify, or criterion_target is not a column in phylogeny_df.
ValueError: If n_downsample is not None and n_tips_per_stratum does not evenly divide n_downsample.

Returns

pandas.DataFrame: The pruned phylogeny in alife standard format.

alifestd_downsample_tips_lineage_stratified_polars(phylogeny_df: DataFrame, n_downsample: int | None = None, seed: int | None = None, *, criterion_delta: str | Expr = 'origin_time', criterion_stratify: str | Expr = 'origin_time', criterion_target: str | Expr = 'origin_time', n_tips_per_stratum: int = 1, progress_wrap: Callable = <function <lambda>>) → DataFrame

Retain leaves per stratified group, chosen by proximity to the lineage of a target leaf.

Selects a target leaf as the leaf with the largest criterion_target value (ties broken randomly). For each non-target leaf, the most recent common ancestor (MRCA) of that leaf and the target leaf is identified, and the “off-lineage delta” is computed as the absolute difference between that leaf’s criterion_delta value and the MRCA’s criterion_delta value.

Leaves are grouped by their criterion_stratify value. When n_downsample is an integer, stratified values are coarsened by ranking and integer-dividing to form exactly n_downsample // n_tips_per_stratum groups. When n_downsample is None, each distinct stratified value forms its own group (without ranking). Within each group, the n_tips_per_stratum leaves with the smallest off-lineage delta are retained.

Only supports asexual phylogenies.
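
The grouping step can be sketched as follows for the case where n_downsample is an integer. The exact ranking scheme is not spelled out above, so this sketch assumes ordinal ranks mapped to groups by `rank * n_groups // n_leaves`; that formula, the function name `stratify_and_pick`, and the precomputed `deltas` input are assumptions for illustration only.

```python
from typing import Dict, List, Sequence


def stratify_and_pick(
    stratify: Sequence[float],
    deltas: Sequence[float],
    n_downsample: int,
    n_tips_per_stratum: int = 1,
) -> List[int]:
    """Group leaves by rank of their stratify value, then keep the
    n_tips_per_stratum leaves with the smallest delta in each group."""
    if n_downsample % n_tips_per_stratum != 0:
        raise ValueError("n_tips_per_stratum must evenly divide n_downsample")
    n_groups = n_downsample // n_tips_per_stratum
    order = sorted(range(len(stratify)), key=lambda leaf: stratify[leaf])
    groups: Dict[int, List[int]] = {}
    for rank, leaf in enumerate(order):
        # Assumed coarsening: equal-width bins over ordinal ranks.
        groups.setdefault(rank * n_groups // len(order), []).append(leaf)
    kept: List[int] = []
    for members in groups.values():
        members.sort(key=lambda leaf: deltas[leaf])
        kept.extend(members[:n_tips_per_stratum])
    return sorted(kept)


stratify = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # one entry per leaf
deltas = [3.0, 1.0, 2.0, 0.0, 5.0, 4.0]
print(stratify_and_pick(stratify, deltas, n_downsample=2))  # [1, 3]
print(stratify_and_pick(stratify, deltas, 4, n_tips_per_stratum=2))  # [1, 2, 3, 5]
```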

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_downsample (int, optional)

Desired number of retained tips. If None, every distinct criterion_stratify value forms its own group.

seed (int, optional)

Random seed for reproducible target-leaf selection when there are ties in criterion_target.

criterion_delta (str or polars.Expr, default “origin_time”)

Column name or polars expression used to compute the off-lineage delta for each leaf. The delta is the absolute difference between a leaf’s value and its MRCA’s value.

criterion_stratify (str or polars.Expr, default “origin_time”)

Column name or polars expression used to stratify leaves into groups.

criterion_target (str or polars.Expr, default “origin_time”)

Column name or polars expression used to select the target leaf. The leaf with the largest value is chosen as the target. Note that ties are broken by random sample, allowing a seed to be provided.

n_tips_per_stratum (int, default 1)

Number of tips to retain per stratified group. Must evenly divide n_downsample when n_downsample is not None.

progress_wrap (Callable, optional)

Pass tqdm or equivalent to display a progress bar.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column or if ids are non-contiguous or not topologically sorted.
ValueError: If criterion_delta, criterion_stratify, or criterion_target is not a column in phylogeny_df.
ValueError: If n_downsample is not None and n_tips_per_stratum does not evenly divide n_downsample.

Returns

polars.DataFrame: The pruned phylogeny in alife standard format.

Parameters

phylogeny_df (polars.DataFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_downsample (int)

Number of tips to retain.

seed (int, optional)

Integer seed for deterministic behavior.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column.

Returns

polars.DataFrame: The downsampled phylogeny in alife standard format.

Parameters

phylogeny_df (polars.DataFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_downsample (int)

Number of tips to retain.

seed (int, optional)

Integer seed for deterministic behavior.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column.

Returns

polars.DataFrame: The downsampled phylogeny in alife standard format.

Parameters

phylogeny_df (pandas.DataFrame): The phylogeny as a dataframe in alife standard format.
mutate (bool, default False): Are side effects on the input argument allowed?
insert (bool, default True): Drop columns sensitive to node insertion.
delete (bool, default True): Drop columns sensitive to node deletion.
update (bool, default True): Drop columns sensitive to ancestor relationship updates.

The input dataframe is not mutated by this operation unless mutate is set True. Even when mutate is set True, the operation does not occur in place; use the return value to get the transformed phylogeny dataframe.

Parameters

phylogeny_df (polars.DataFrame): The phylogeny as a dataframe in alife standard format.
insert (bool, default True): Drop columns sensitive to node insertion.
delete (bool, default True): Drop columns sensitive to node deletion.
update (bool, default True): Drop columns sensitive to ancestor relationship updates.

Parameters

first_df (pd.DataFrame)

The DataFrame representing the first phylogenetic tree.

second_df (pd.DataFrame)

The DataFrame representing the second phylogenetic tree.

taxon_label_key (str)

The key in the DataFrame to identify the taxon labels.

confidence (float, default 0.99)

The confidence level for the estimation.

See estimate_binomial_p for details.

precision (float, default 0.01)

The precision of the estimation.

See estimate_binomial_p for details.

strict (bool or Tuple[bool, bool], default True)

A flag or a tuple of flags indicating how to treat polytomies.

If False, triplets that form a polytomy in either tree are not counted as mismatching. If True, they are counted as mismatching. If a tuple is given, polytomies in the first and second trees are treated according to the first and second elements of the tuple, respectively.

detail (bool, default False)

If True, returns a detailed result including the estimated distance, confidence interval, and sample size.

progress_wrap (typing.Callable, optional)

Pass tqdm or equivalent to display a progress bar.

mutate (bool, default False)

If True, allows mutation of input DataFrames.

Returns

float or Tuple[float, Tuple[float, float], int]

The estimated distance between the two trees.

If detail is True, returns a tuple containing the estimated distance, the confidence interval, and the sample size.

Notes

The core comparison is done by sampling triplets of taxa, categorizing them, and comparing these categorizations across the two trees, taking into account the strict parameter for handling polytomies. See alifestd_categorize_triplet_asexual for details.

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame)

The phylogeny as a dataframe in alife standard format.

Must have contiguous ids and represent an asexual phylogeny.

Returns

numpy.ndarray: Array of leaf node ids.

Parameters

phylogeny_df (pd.DataFrame): Phylogeny in alife standard format.
first (int): First taxon id.
second (int): Second taxon id.
criterion (str, default “origin_time”): Column name used to measure distance between taxa and their MRCA.
mutate (bool, default False): If True, allows in-place modification of phylogeny_df.

Returns

float or None: The pairwise distance between the two taxa, or None if they have no common ancestor.

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids and topologically sorted rows.

first (int)

First taxon id.

second (int)

Second taxon id.

criterion (str or polars.Expr, default “origin_time”)

Column name or polars expression used to measure distance between taxa and their MRCA.

Returns

float or None: The pairwise distance between the two taxa, or None if they have no common ancestor.

Parameters

phylogeny_df (pd.DataFrame): Phylogeny in alife standard format.
first (int): First taxon id.
second (int): Second taxon id.
mutate (bool, default False): If True, allows in-place modification of phylogeny_df.
is_topologically_sorted (bool, optional): If provided, skips the topological sort check. If None (default), the check is performed automatically.
has_contiguous_ids (bool, optional): If provided, skips the contiguous ids check. If None (default), the check is performed automatically.

Returns

int or None: The id of the most recent common ancestor, or None if no common ancestor exists.

alifestd_find_pair_mrca_id_polars(phylogeny_df: DataFrame, first: int, second: int, *, is_topologically_sorted: bool | None = None, has_contiguous_ids: bool | None = None) → int | None

Find the Most Recent Common Ancestor of two taxa.

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids and topologically sorted rows.

first (int)

First taxon id.

second (int)

Second taxon id.

is_topologically_sorted (bool, optional)

If provided, skips the topological sort check. If None (default), the check is performed automatically.

has_contiguous_ids (bool, optional)

If provided, skips the contiguous ids check. If None (default), the check is performed automatically.

Returns

int or None: The id of the most recent common ancestor, or None if no common ancestor exists.
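
To illustrate what this lookup computes, the sketch below finds an MRCA id on a plain list of ancestor ids. It is not the library's implementation: `find_pair_mrca_id` and the `ancestor_ids` list are hypothetical names, and the sketch assumes contiguous ids, topologically sorted rows (each ancestor id is no greater than its descendant's id), and roots that are recorded as their own ancestor.

```python
from typing import List, Optional


def find_pair_mrca_id(
    ancestor_ids: List[int], first: int, second: int
) -> Optional[int]:
    """Return the MRCA id of `first` and `second`, or None if they share none.

    Assumes ancestor_ids[i] <= i for every node i and that a root is its own
    ancestor (illustrative assumptions, not taken from the library).
    """
    while first != second:
        # Always step the larger id upward: under the sorting assumption the
        # larger id can never be an ancestor of the smaller one.
        if first < second:
            first, second = second, first
        parent = ancestor_ids[first]
        if parent == first:  # hit a root without meeting: separate trees
            return None
        first = parent
    return first


# Two trees: 0 -> {1, 2} and 1 -> {3, 4}, plus a lone second root 5.
ancestor_ids = [0, 0, 0, 1, 1, 5]
mrca_34 = find_pair_mrca_id(ancestor_ids, 3, 4)  # siblings under node 1
mrca_32 = find_pair_mrca_id(ancestor_ids, 3, 2)  # meet only at the root
mrca_35 = find_pair_mrca_id(ancestor_ids, 3, 5)  # different trees
```

Stepping only the larger id keeps the walk linear in the depth of the two lineages and needs no auxiliary set of visited nodes.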

Parameters

spop_text (str): Full text content of an Avida .spop file.
create_ancestor_list (bool, default True): If True, include an ancestor_list column in the result.
dtype_id (type or None, default np.int64): Numpy dtype for the id column. If None, the smallest signed integer dtype is chosen automatically based on the number of rows in the data.

Returns

pd.DataFrame: Phylogeny dataframe in alife standard format.

Raises

ValueError: If the #format header is missing from the spop text.
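
The `dtype_id=None` behavior can be pictured with the following sketch, which picks the narrowest signed integer width for a given row count. The helper name `smallest_signed_int_dtype_name` is hypothetical, and the sketch assumes ids run from 0 to `n_rows - 1`; the library's exact thresholds are not stated here and may differ.

```python
def smallest_signed_int_dtype_name(n_rows: int) -> str:
    """Name the narrowest signed integer width able to hold ids 0..n_rows-1."""
    max_id = max(n_rows - 1, 0)
    for bits in (8, 16, 32, 64):
        if max_id <= 2 ** (bits - 1) - 1:  # largest value of a signed int
            return f"int{bits}"
    raise OverflowError("id does not fit in a 64-bit signed integer")


names = [smallest_signed_int_dtype_name(n) for n in (100, 128, 129, 40_000)]
```

A narrower id dtype mainly saves memory on large phylogenies; the trade-off is that ids can no longer be extended past the chosen width without a cast.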

alifestd_from_avida_spop_polars(spop_text: str, *, create_ancestor_list: bool = True, dtype_id: DataType | None = Int64) → DataFrame

Convert Avida .spop population snapshot text to a phylogeny dataframe.

Parses the text content of an Avida .spop (structured population) file and returns a polars DataFrame in alife standard format.

Parameters

spop_text (str): Full text content of an Avida .spop file.
create_ancestor_list (bool, default True): If True, include an ancestor_list column in the result.
dtype_id (pl.DataType or None, default pl.Int64): Polars dtype for the id column. If None, the smallest signed integer dtype is chosen automatically based on the number of rows in the data.

Returns

pl.DataFrame: Phylogeny dataframe in alife standard format.

Raises

ValueError: If the #format header is missing from the spop text.

alifestd_from_newick(newick: str, *, allow_forest: bool | None = None, branch_length_dtype: type = float, create_ancestor_list: bool = False, dtype_id: type | None = np.int64, replace_unquoted: Mapping[str, str] = mappingproxy({})) → DataFrame

Convert a Newick format string to a phylogeny dataframe.

Parses a Newick tree string and returns a pandas DataFrame in alife standard format with columns: id, ancestor_id, taxon_label, origin_time_delta, and branch_length. Optionally includes ancestor_list.

Parameters

newick (str): A phylogeny in Newick format.
allow_forest (bool or None, default None): Policy for a Newick string holding multiple ;-terminated trees (a forest). None parses the forest but warns; True parses it silently; False raises ValueError unless there is a single tree.
branch_length_dtype (type, default float): Dtype for branch length values. Use int to get nullable integer columns (pd.Int64Dtype). Missing branch lengths will be pd.NA for integer dtypes or NaN for float dtypes.
create_ancestor_list (bool, default False): If True, include an ancestor_list column in the result.
dtype_id (type or None, default np.int64): Numpy dtype for the id and ancestor_id columns. If None, the smallest signed integer dtype that can hold all node ids is chosen automatically based on the node count of the Newick string.
replace_unquoted (Mapping[str, str], optional): Character substitutions to apply to unquoted taxon labels only, leaving quoted labels verbatim. Keys must be single characters. Pass {"_": " "} to follow the strict Newick convention in which an unquoted underscore denotes a space.

Returns

pd.DataFrame: Phylogeny dataframe in alife standard format.

Notes

By default, unquoted underscores in taxon labels are preserved literally; they are not converted to spaces. This diverges from the strict Newick convention (in which an unquoted _ denotes a space), but matches the round-trip behavior of alifestd_as_newick_asexual. Pass replace_unquoted={"_": " "} to follow the strict convention.
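
The effect of replace_unquoted on a single label can be sketched as below. The helper `apply_replace_unquoted` is hypothetical and only mimics the documented rule (substitute in unquoted labels, leave quoted labels verbatim). Whether the library strips the surrounding quotes from quoted labels is not covered here, so the sketch returns them unchanged.

```python
from typing import Mapping


def apply_replace_unquoted(
    label: str, replace_unquoted: Mapping[str, str]
) -> str:
    """Apply single-character substitutions to an unquoted Newick label.

    Labels wrapped in single quotes are returned verbatim.
    """
    if len(label) >= 2 and label[0] == "'" and label[-1] == "'":
        return label  # quoted: leave untouched
    # str.maketrans accepts a dict whose keys are single characters.
    return label.translate(str.maketrans(dict(replace_unquoted)))


default_label = apply_replace_unquoted("Homo_sapiens", {})
strict_label = apply_replace_unquoted("Homo_sapiens", {"_": " "})
quoted_label = apply_replace_unquoted("'Homo_sapiens'", {"_": " "})
```

With the empty default mapping the underscore survives, which is what makes a label written out and read back compare equal.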

Parameters

newick (str): A phylogeny in Newick format.
allow_forest (bool or None, default None): Policy for a Newick string holding multiple ;-terminated trees (a forest). None parses the forest but warns; True parses it silently; False raises ValueError unless there is a single tree.
branch_length_dtype (type, default float): Dtype for branch length values. Use int to get nullable integer columns (pl.Int64). Missing branch lengths will be null for integer dtypes or NaN for float dtypes.
create_ancestor_list (bool, default False): If True, include an ancestor_list column in the result.
dtype_id (pl.DataType or None, default pl.Int64): Polars dtype for the id and ancestor_id columns. If None, the smallest signed integer dtype that can hold all node ids is chosen automatically based on the node count of the Newick string.
replace_unquoted (Mapping[str, str], optional): Character substitutions to apply to unquoted taxon labels only, leaving quoted labels verbatim. Keys must be single characters. Pass {"_": " "} to follow the strict Newick convention in which an unquoted underscore denotes a space.

Returns

pl.DataFrame: Phylogeny dataframe in alife standard format.

Notes

By default, unquoted underscores in taxon labels are preserved literally; they are not converted to spaces. This diverges from the strict Newick convention (in which an unquoted _ denotes a space), but matches the round-trip behavior of alifestd_as_newick_asexual. Pass replace_unquoted={"_": " "} to follow the strict convention.

Parameters

phylogeny_df (polars.DataFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids and topologically sorted rows.

reverse (bool, default False)

If True, sort descending (more leaves first).

Returns

polars.DataFrame: The phylogeny with rows reordered in ladderized order.

Raises

NotImplementedError: If ids are not contiguous or rows are not topologically sorted.

Parameters

depth (int)

Depth of the tree, where depth=1 is a single root node.

depth=0 -> empty tree (no nodes)
depth=1 -> 1 node (root only)
depth=2 -> 3 nodes (root + 2 leaves)
depth=3 -> 7 nodes (4 leaves)
depth=4 -> 15 nodes (8 leaves)

Returns

pd.DataFrame: Alife-standard phylogeny dataframe with ‘id’ and ‘ancestor_list’ columns.

Raises

ValueError: If depth is negative.
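
The node counts listed above follow from 2**depth - 1. A minimal sketch of one way to lay out such a tree is shown below; the function name and the heap-style id layout (node i's ancestor is (i - 1) // 2, root is its own ancestor) are illustrative assumptions, not necessarily the id order the library emits.

```python
from typing import List


def make_balanced_bifurcating_ancestor_ids(depth: int) -> List[int]:
    """Return ancestor ids for a perfectly balanced bifurcating tree.

    Uses a heap layout: node i's ancestor is (i - 1) // 2, and the root
    (id 0) is recorded as its own ancestor.
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    n_nodes = 2**depth - 1  # depth=0 -> 0 nodes, depth=1 -> 1 node, ...
    return [max((i - 1) // 2, 0) for i in range(n_nodes)]


ancestor_ids = make_balanced_bifurcating_ancestor_ids(3)
# For depth >= 2, leaves are the nodes that never appear as an ancestor.
n_leaves = len(ancestor_ids) - len(set(ancestor_ids))
```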

alifestd_make_balanced_bifurcating_polars(depth: int) → DataFrame

Build a perfectly balanced bifurcating tree of given depth.

Parameters

depth (int): Depth of the tree, where depth=1 is a single root node.

Returns

pl.DataFrame: Phylogeny dataframe with ‘id’ and ‘ancestor_id’ columns.

alifestd_make_comb(n_leaves: int) → DataFrame

Build a comb/caterpillar tree with n_leaves leaves.

Structure (e.g., n_leaves=4):

Internal nodes: 0, 2, 4, … Leaves: 1, 3, 5, …

Parameters

n_leaves (int): Number of leaf nodes in the resulting tree.

Returns

pd.DataFrame: Alife-standard phylogeny dataframe with ‘id’ and ‘ancestor_list’ columns.

Raises

ValueError: If n_leaves is negative.

alifestd_make_comb_polars(n_leaves: int) → DataFrame

Build a comb/caterpillar tree with n_leaves leaves.

Parameters

n_leaves (int): Number of leaf nodes in the resulting tree.

Returns

pl.DataFrame: Phylogeny dataframe with ‘id’ and ‘ancestor_id’ columns.

alifestd_make_edge_split(n_leaves: int, seed: int | None = None) → DataFrame

Build a random bifurcating tree via edge-split (PDA) sampling.

At each step, a uniformly chosen existing edge is split by inserting a new internal node, with a new leaf attached as its sibling. This produces samples from the Proportional-to-Distinguishable-Arrangements (PDA) distribution over rooted bifurcating tree shapes.

Ids are contiguous but not topologically sorted; inserted internal nodes may have ids greater than some of their descendants. Pass the result through alifestd_topological_sort if topological id order is needed.

Parameters

n_leaves (int): Number of leaf nodes in the resulting tree.
seed (int, optional): Integer seed for deterministic behavior.

Returns

pd.DataFrame: Alife-standard phylogeny dataframe with ‘id’ and ‘ancestor_list’ columns.

Raises

ValueError: If n_leaves is negative.
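
The edge-split step described above can be sketched on a plain ancestor-id list as follows. This is a simplified illustration with hypothetical names, not the library's code: it starts from a root with two leaves and only splits edges between existing nodes, so it should not be read as a verified PDA sampler. It does reproduce the documented property that inserted internal nodes receive ids greater than some of their descendants.

```python
import random
from typing import List, Optional


def make_edge_split_ancestor_ids(
    n_leaves: int, seed: Optional[int] = None
) -> List[int]:
    """Grow a bifurcating tree by repeatedly splitting a random edge.

    Returns ancestor ids indexed by node id; the root is its own ancestor.
    """
    if n_leaves < 0:
        raise ValueError("n_leaves must be non-negative")
    if n_leaves == 0:
        return []
    if n_leaves == 1:
        return [0]
    rng = random.Random(seed)
    ancestor_ids = [0, 0, 0]  # root 0 with leaf children 1 and 2
    for _ in range(n_leaves - 2):
        # Each non-root node identifies the edge leading to its ancestor.
        child = rng.randrange(1, len(ancestor_ids))
        knuckle = len(ancestor_ids)  # id of the new internal node
        ancestor_ids.append(ancestor_ids[child])  # knuckle replaces child
        ancestor_ids.append(knuckle)  # new leaf hangs off the knuckle
        ancestor_ids[child] = knuckle  # child now hangs off the knuckle
    return ancestor_ids


ancestor_ids = make_edge_split_ancestor_ids(5, seed=1)
# Children per node, ignoring the root's self-reference at index 0.
child_counts = [ancestor_ids[1:].count(i) for i in range(len(ancestor_ids))]
```

Every split adds one internal node and one leaf, so a tree with n_leaves leaves ends up with 2 * n_leaves - 1 nodes.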

alifestd_make_edge_split_polars(n_leaves: int, seed: int | None = None) → DataFrame

Build a random bifurcating tree via edge-split (PDA) sampling.

At each step, a uniformly chosen existing edge is split by inserting a new internal node, with a new leaf attached as its sibling. This produces samples from the Proportional-to-Distinguishable-Arrangements (PDA) distribution over rooted bifurcating tree shapes.

Ids are contiguous but not topologically sorted; inserted internal nodes may have ids greater than some of their descendants. Pass the result through alifestd_topological_sort_polars if topological id order is needed.

Parameters

n_leaves (int): Number of leaf nodes in the resulting tree.
seed (int, optional): Integer seed for deterministic behavior.

Returns

pl.DataFrame: Phylogeny dataframe with ‘id’ and ‘ancestor_id’ columns.

alifestd_make_empty(ancestor_id: bool = False) → DataFrame: Create an alife standard phylogeny dataframe with zero rows.

alifestd_make_empty_polars(ancestor_id: bool = True) → DataFrame: Create an alife standard phylogeny dataframe with zero rows.

alifestd_make_leaf_split(n_leaves: int, seed: int | None = None) → DataFrame

Build a random bifurcating tree via leaf-split (Yule) sampling.

At each step, a uniformly chosen leaf is replaced by an internal node with two new leaf children. This produces samples from the Yule (pure-birth) distribution over rooted bifurcating tree shapes.

Parameters

n_leaves (int): Number of leaf nodes in the resulting tree.
seed (int, optional): Integer seed for deterministic behavior.

Returns

pd.DataFrame: Alife-standard phylogeny dataframe with ‘id’ and ‘ancestor_list’ columns.

Raises

ValueError: If n_leaves is negative.
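
The leaf-split step described above can be sketched on a plain ancestor-id list. The function name and id assignment below are hypothetical; the sketch starts from a single root that is also the only leaf and records the root as its own ancestor.

```python
import random
from typing import List, Optional


def make_leaf_split_ancestor_ids(
    n_leaves: int, seed: Optional[int] = None
) -> List[int]:
    """Grow a bifurcating tree by repeatedly splitting a random leaf."""
    if n_leaves < 0:
        raise ValueError("n_leaves must be non-negative")
    if n_leaves == 0:
        return []
    rng = random.Random(seed)
    ancestor_ids = [0]  # a single root, initially the only leaf
    leaves = [0]
    for _ in range(n_leaves - 1):
        # Remove a uniformly chosen leaf from the pool (swap-and-pop) ...
        idx = rng.randrange(len(leaves))
        leaves[idx], leaves[-1] = leaves[-1], leaves[idx]
        parent = leaves.pop()
        # ... and give it two new leaf children.
        for _child in range(2):
            leaves.append(len(ancestor_ids))
            ancestor_ids.append(parent)
    return ancestor_ids


ancestor_ids = make_leaf_split_ancestor_ids(6, seed=0)
# Children per node, ignoring the root's self-reference at index 0.
child_counts = [ancestor_ids[1:].count(i) for i in range(len(ancestor_ids))]
```

Because new nodes are only ever appended beneath existing ones, ids produced this way are already topologically sorted.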

alifestd_make_leaf_split_polars(n_leaves: int, seed: int | None = None) → DataFrame

Build a random bifurcating tree via leaf-split (Yule) sampling.

At each step, a uniformly chosen leaf is replaced by an internal node with two new leaf children. This produces samples from the Yule (pure-birth) distribution over rooted bifurcating tree shapes.

Parameters

n_leaves (int): Number of leaf nodes in the resulting tree.
seed (int, optional): Integer seed for deterministic behavior.

Returns

pl.DataFrame: Phylogeny dataframe with ‘id’ and ‘ancestor_id’ columns.

alifestd_make_star(n_leaves: int) → DataFrame

Build a star tree with n_leaves leaves.

Structure (e.g., n_leaves=4):

   0
 / | \ \
1  2  3 4

The root (id 0) has every leaf as a direct child.

Parameters

n_leaves (int): Number of leaf nodes in the resulting tree.

Returns

pd.DataFrame: Alife-standard phylogeny dataframe with ‘id’ and ‘ancestor_list’ columns.

Raises

ValueError: If n_leaves is negative.

alifestd_make_star_polars(n_leaves: int) → DataFrame

Build a star tree with n_leaves leaves.

Structure (e.g., n_leaves=4):

   0
 / | \ \
1  2  3 4

The root (id 0) has every leaf as a direct child.

Parameters

n_leaves (int): Number of leaf nodes in the resulting tree.

Returns

pl.DataFrame: Phylogeny dataframe with ‘id’ and ‘ancestor_id’ columns.

alifestd_mark_ancestor_origin_time_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'ancestor_origin_time') → DataFrame

Add column ancestor_origin_time.

The output column name can be changed via the mark_as parameter.

Dataframe must provide column origin_time.

A topological sort will be applied if phylogeny_df is not topologically sorted. Dataframe reindexing (e.g., df.index) may be applied.

Input dataframe is not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.

alifestd_mark_ancestor_origin_time_polars(phylogeny_df: DataFrame, *, mark_as: str = 'ancestor_origin_time') → DataFrame

Add column ancestor_origin_time.

The output column name can be changed via the mark_as parameter.

Dataframe must provide column origin_time.

alifestd_mark_clade_duration_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'clade_duration') → DataFrame

Add column clade_duration, containing the difference between each node’s origin_time and the maximum origin_time of its descendants.

The output column name can be changed via the mark_as parameter.

Leaf nodes will have duration 0.

Dataframe reindexing (e.g., df.index) may be applied.

Input dataframe is not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.
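
A minimal sketch of the clade duration calculation on plain lists follows. The names are hypothetical, and the sketch assumes contiguous ids, topologically sorted rows, roots recorded as their own ancestor, and that the difference is taken as latest descendant origin_time minus the node's own origin_time (so values are non-negative).

```python
from typing import List, Sequence


def calc_clade_duration(
    ancestor_ids: Sequence[int], origin_times: Sequence[float]
) -> List[float]:
    """Return, per node, latest origin_time in its clade minus its own."""
    latest = list(origin_times)  # latest origin_time seen in each clade
    # Visit nodes from the tips back to the root, so each node's value is
    # final before it is folded into its ancestor.
    for node in reversed(range(len(ancestor_ids))):
        ancestor = ancestor_ids[node]
        latest[ancestor] = max(latest[ancestor], latest[node])
    return [late - own for late, own in zip(latest, origin_times)]


# Tree: 0 -> {1, 2}, 1 -> {3, 4}
ancestor_ids = [0, 0, 0, 1, 1]
origin_times = [0.0, 1.0, 2.0, 3.0, 5.0]
durations = calc_clade_duration(ancestor_ids, origin_times)
```

Leaves come out as 0 because the latest origin_time in a single-node clade is the node's own.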

alifestd_mark_clade_duration_polars(phylogeny_df: DataFrame, *, mark_as: str = 'clade_duration') → DataFrame

Add column clade_duration, containing the difference between each node’s origin_time and the maximum origin_time of its descendants.

The output column name can be changed via the mark_as parameter.

Leaf nodes will have duration 0.

alifestd_mark_clade_duration_ratio_sister_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'clade_duration_ratio_sister') → DataFrame

Add column clade_duration_ratio_sister, containing the ratio of each clade’s duration to that of its sister.

The output column name can be changed via the mark_as parameter.

Root nodes will have ratio 1, unless also a leaf node. Leaf nodes and leaf-sisters may have ratio inf or NaN.

Tree must be strictly bifurcating.

Dataframe reindexing (e.g., df.index) may be applied.

Input dataframe is not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.

alifestd_mark_clade_duration_ratio_sister_polars(phylogeny_df: DataFrame, *, mark_as: str = 'clade_duration_ratio_sister') → DataFrame

Add column clade_duration_ratio_sister, containing the ratio of each clade’s duration to that of its sister.

The output column name can be changed via the mark_as parameter.

Tree must be strictly bifurcating.

alifestd_mark_clade_faithpd_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'clade_faithpd') → DataFrame

Add column clade_faithpd, containing sum branch length among descendant nodes.

The output column name can be changed via the mark_as parameter.

Branch length is defined as the difference between the origin time of the node and the origin time of its ancestor.

A topological sort will be applied if phylogeny_df is not topologically sorted. Dataframe reindexing (e.g., df.index) may be applied.

Input dataframe is not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.

alifestd_mark_clade_faithpd_polars(phylogeny_df: DataFrame, *, mark_as: str = 'clade_faithpd') → DataFrame

Add column clade_faithpd, containing sum branch length among descendant nodes.

The output column name can be changed via the mark_as parameter.

alifestd_mark_clade_fblr_growth_children_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'clade_fblr_growth_children', parallel_backend: str | None = None, progress_wrap: Callable = <function <lambda>>, work_mask: ndarray | None = None) → DataFrame

Add column clade_fblr_growth_children, containing the coefficient of a fblr regression fit to origin times of the leaf descendants of each node.

The output column name can be changed via the mark_as parameter.

Nodes with left/right child clades with equal growth rates will have value approximately 0.0. If left child clade has greater growth rate, value will be negative. If right child clade has greater growth rate, value will be positive.

Pass “loky” to parallel_backend to use joblib with loky backend.

Leaf nodes will have value NaN. If provided, any nodes not included in work_mask will also have value NaN.

Tree must be strictly bifurcating and single-rooted.

Dataframe reindexing (e.g., df.index) may be applied.

Input phylogeny_df and work_mask are not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.

References

Bonetti Franceschi V and Volz E. Phylogenetic signatures reveal multilevel selection and fitness costs in SARS-CoV-2 [version 2; peer review: 2 approved, 1 approved with reservations]. Wellcome Open Res 2024, 9:85 (https://doi.org/10.12688/wellcomeopenres.20704.2)
Volz, E. Fitness, growth and transmissibility of SARS-CoV-2 genetic variants. Nat Rev Genet 24, 724-734 (2023). https://doi.org/10.1038/s41576-023-00610-z
Saran NA, Nar F. 2025. Fast binary logistic regression. PeerJ Computer Science 11:e2579 https://doi.org/10.7717/peerj-cs.2579

alifestd_mark_clade_fblr_growth_sister_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'clade_fblr_growth_sister', parallel_backend: str | None = None, progress_wrap: Callable = <function <lambda>>, work_mask: ndarray | None = None) → DataFrame

Add column clade_fblr_growth_sister, containing the coefficient of a fblr regression fit to origin times of this clade’s descendant leaves versus those of its sister clade.

The output column name can be changed via the mark_as parameter.

Clades with equal growth rate to their sister will have value approximately 0.0. Clades growing faster than their sister clade will have value greater than 0.0. Clades growing slower than their sister clade will have value less than 0.0.

Pass “loky” to parallel_backend to use joblib with loky backend.

Root nodes will have value NaN. If provided, any nodes not included in work_mask will also have value NaN.

Tree must be strictly bifurcating and single-rooted.

Dataframe reindexing (e.g., df.index) may be applied.

Input phylogeny_df and work_mask are not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.

References

Bonetti Franceschi V and Volz E. Phylogenetic signatures reveal multilevel selection and fitness costs in SARS-CoV-2 [version 2; peer review: 2 approved, 1 approved with reservations]. Wellcome Open Res 2024, 9:85 (https://doi.org/10.12688/wellcomeopenres.20704.2)
Volz, E. Fitness, growth and transmissibility of SARS-CoV-2 genetic variants. Nat Rev Genet 24, 724-734 (2023). https://doi.org/10.1038/s41576-023-00610-z
Saran NA, Nar F. 2025. Fast binary logistic regression. PeerJ Computer Science 11:e2579 https://doi.org/10.7717/peerj-cs.2579

alifestd_mark_clade_leafcount_ratio_sister_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'clade_leafcount_ratio_sister') → DataFrame

Add column clade_leafcount_ratio_sister, containing the ratio of each clade’s leaf count to that of its sister.

The output column name can be changed via the mark_as parameter.

Root nodes will have ratio 1. Tree must be strictly bifurcating.

Dataframe reindexing (e.g., df.index) may be applied.

Input dataframe is not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.
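
The ratio can be sketched on a plain ancestor-id list as follows. The function name is hypothetical, and the sketch assumes a strictly bifurcating tree with contiguous ids, topologically sorted rows, and a root recorded as its own ancestor.

```python
from typing import List, Sequence


def calc_leafcount_ratio_sister(ancestor_ids: Sequence[int]) -> List[float]:
    """Return each clade's leaf count divided by its sister's leaf count."""
    n = len(ancestor_ids)
    # Count children so that leaves (no children) can be identified.
    num_children = [0] * n
    for node, ancestor in enumerate(ancestor_ids):
        if ancestor != node:
            num_children[ancestor] += 1
    # Accumulate leaf counts from the tips back to the root.
    num_leaves = [int(count == 0) for count in num_children]
    for node in reversed(range(n)):
        ancestor = ancestor_ids[node]
        if ancestor != node:
            num_leaves[ancestor] += num_leaves[node]
    ratios = []
    for node, ancestor in enumerate(ancestor_ids):
        if ancestor == node:
            ratios.append(1.0)  # root: no sister, ratio defined as 1
        else:
            # In a bifurcating tree, the parent's remaining leaves all
            # belong to the sister clade.
            sister_leaves = num_leaves[ancestor] - num_leaves[node]
            ratios.append(num_leaves[node] / sister_leaves)
    return ratios


# Tree: 0 -> {1, 2}, 1 -> {3, 4}
ratios = calc_leafcount_ratio_sister([0, 0, 0, 1, 1])
```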

alifestd_mark_clade_leafcount_ratio_sister_polars(phylogeny_df: DataFrame, *, mark_as: str = 'clade_leafcount_ratio_sister') → DataFrame

Add column clade_leafcount_ratio_sister, containing the ratio of each clade’s leaf count to that of its sister.

The output column name can be changed via the mark_as parameter.

Tree must be strictly bifurcating.

alifestd_mark_clade_logistic_growth_children_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'clade_logistic_growth_children', parallel_backend: str | None = None, progress_wrap: Callable = <function <lambda>>, work_mask: ndarray | None = None) → DataFrame

Add column clade_logistic_growth_children, containing the coefficient of a logistic regression fit to origin times of the leaf descendants of each node.

The output column name can be changed via the mark_as parameter.

Nodes with left/right child clades with equal growth rates will have value approximately 0.0. If left child clade has greater growth rate, value will be negative. If right child clade has greater growth rate, value will be positive.

Pass “loky” to parallel_backend to use joblib with loky backend.

Leaf nodes will have value NaN. If provided, any nodes not included in work_mask will also have value NaN.

Tree must be strictly bifurcating and single-rooted.

Dataframe reindexing (e.g., df.index) may be applied.

Input phylogeny_df and work_mask are not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.

References

Bonetti Franceschi V and Volz E. Phylogenetic signatures reveal multilevel selection and fitness costs in SARS-CoV-2 [version 2; peer review: 2 approved, 1 approved with reservations]. Wellcome Open Res 2024, 9:85 (https://doi.org/10.12688/wellcomeopenres.20704.2)
Volz, E. Fitness, growth and transmissibility of SARS-CoV-2 genetic variants. Nat Rev Genet 24, 724-734 (2023). https://doi.org/10.1038/s41576-023-00610-z

alifestd_mark_clade_logistic_growth_sister_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'clade_logistic_growth_sister', parallel_backend: str | None = None, progress_wrap: Callable = <function <lambda>>, work_mask: ndarray | None = None) → DataFrame

Add column clade_logistic_growth_sister, containing the coefficient of a logistic regression fit to origin times of this clade’s descendant leaves versus those of its sister clade.

The output column name can be changed via the mark_as parameter.

Clades with equal growth rate to their sister will have value approximately 0.0. Clades growing faster than their sister clade will have value greater than 0.0. Clades growing slower than their sister clade will have value less than 0.0.

Pass “loky” to parallel_backend to use joblib with loky backend.

Root nodes will have value NaN. If provided, any nodes not included in work_mask will also have value NaN.

Tree must be strictly bifurcating and single-rooted.

Dataframe reindexing (e.g., df.index) may be applied.

Input phylogeny_df and work_mask are not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.

References

Bonetti Franceschi V and Volz E. Phylogenetic signatures reveal multilevel selection and fitness costs in SARS-CoV-2 [version 2; peer review: 2 approved, 1 approved with reservations]. Wellcome Open Res 2024, 9:85 (https://doi.org/10.12688/wellcomeopenres.20704.2)
Volz, E. Fitness, growth and transmissibility of SARS-CoV-2 genetic variants. Nat Rev Genet 24, 724-734 (2023). https://doi.org/10.1038/s41576-023-00610-z

alifestd_mark_clade_nodecount_ratio_sister_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'clade_nodecount_ratio_sister') → DataFrame

Add column clade_nodecount_ratio_sister, containing the ratio of each clade size to that of its sister.

The output column name can be changed via the mark_as parameter.

Root nodes will have ratio 1. Tree must be strictly bifurcating.

Dataframe reindexing (e.g., df.index) may be applied.

Input dataframe is not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.

alifestd_mark_clade_nodecount_ratio_sister_polars(phylogeny_df: DataFrame, *, mark_as: str = 'clade_nodecount_ratio_sister') → DataFrame

Add column clade_nodecount_ratio_sister, containing the ratio of each clade size to that of its sister.

The output column name can be changed via the mark_as parameter.

Tree must be strictly bifurcating.

alifestd_mark_clade_subtended_duration_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'clade_subtended_duration') → DataFrame

Add column clade_subtended_duration, containing the difference between each node’s ancestor’s origin_time and the maximum origin_time of its descendants.

The output column name can be changed via the mark_as parameter.

Ancestor origin time for root nodes will be 0.

Dataframe reindexing (e.g., df.index) may be applied.

Input dataframe is not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.

alifestd_mark_clade_subtended_duration_polars(phylogeny_df: DataFrame, *, mark_as: str = 'clade_subtended_duration') → DataFrame

Add column clade_subtended_duration, containing the difference between each node’s ancestor’s origin_time and the maximum origin_time of its descendants.

The output column name can be changed via the mark_as parameter.

Ancestor origin time for root nodes will be 0.

alifestd_mark_clade_subtended_duration_ratio_sister_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'clade_subtended_duration_ratio_sister') → DataFrame

Add column clade_subtended_duration_ratio_sister, containing the ratio of each clade’s subtended duration to that of its sister.

The output column name can be changed via the mark_as parameter.

Root nodes will have ratio 1, unless also a leaf node. Leaf nodes and leaf-sisters may have ratio inf or NaN.

Tree must be strictly bifurcating.

Dataframe reindexing (e.g., df.index) may be applied.

Input dataframe is not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.

alifestd_mark_clade_subtended_duration_ratio_sister_polars(phylogeny_df: DataFrame, *, mark_as: str = 'clade_subtended_duration_ratio_sister') → DataFrame

Add column clade_subtended_duration_ratio_sister, containing the ratio of each clade’s subtended duration to that of its sister.

The output column name can be changed via the mark_as parameter.

Tree must be strictly bifurcating.

alifestd_mark_colless_index_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'colless_index') → DataFrame

Add column colless_index with Colless imbalance index for each subtree.

The output column name can be changed via the mark_as parameter.

Computes the classic Colless index for strictly bifurcating trees. For each internal node with exactly two children, the local contribution is |L - R| where L and R are leaf counts in left and right subtrees. The value at each node represents the total Colless index for the subtree rooted at that node.

Raises ValueError if the tree is not strictly bifurcating. For trees with polytomies, use alifestd_mark_colless_like_index_mdm_asexual for the Colless-like index instead.

Leaf nodes will have Colless index 0 (no imbalance in subtree of size 1). The root node contains the Colless index for the entire tree.

A topological sort will be applied if phylogeny_df is not topologically sorted. Dataframe reindexing (e.g., df.index) may be applied.

Input dataframe is not mutated by this operation unless mutate set True. If mutate set True, operation does not occur in place; still use return value to get transformed phylogeny dataframe.

Parameters

phylogeny_df (pd.DataFrame): Alife standard DataFrame containing the phylogenetic relationships.
mutate (bool, optional): If True, modify the input DataFrame in place. Default is False.

Returns

pd.DataFrame: Phylogeny DataFrame with an additional column “colless_index” containing the Colless imbalance index for the subtree rooted at each node.

Raises

ValueError: If phylogeny_df is not strictly bifurcating.
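
The per-subtree Colless calculation described above can be sketched on a plain ancestor-id list. The function name is hypothetical, and the sketch assumes contiguous ids, topologically sorted rows, and a root recorded as its own ancestor.

```python
from typing import List, Sequence


def calc_colless_index(ancestor_ids: Sequence[int]) -> List[int]:
    """Return the Colless index of the subtree rooted at each node."""
    n = len(ancestor_ids)
    children: List[List[int]] = [[] for _ in range(n)]
    for node, ancestor in enumerate(ancestor_ids):
        if ancestor != node:
            children[ancestor].append(node)
    if any(len(kids) not in (0, 2) for kids in children):
        raise ValueError("tree is not strictly bifurcating")
    num_leaves = [0] * n
    colless = [0] * n
    # Children have larger ids than their ancestors, so a reverse pass
    # finishes every child before its parent is processed.
    for node in reversed(range(n)):
        if not children[node]:
            num_leaves[node] = 1  # leaf: subtree of size 1, index 0
            continue
        left, right = children[node]
        num_leaves[node] = num_leaves[left] + num_leaves[right]
        local = abs(num_leaves[left] - num_leaves[right])  # |L - R|
        colless[node] = local + colless[left] + colless[right]
    return colless


# Comb tree with 4 leaves: 0 -> {1, 2}, 2 -> {3, 4}, 4 -> {5, 6}
comb = calc_colless_index([0, 0, 0, 2, 2, 4, 4])
# Balanced tree with 4 leaves: 0 -> {1, 2}, 1 -> {3, 4}, 2 -> {5, 6}
balanced = calc_colless_index([0, 0, 0, 1, 1, 2, 2])
```

The comb's root value of 3 matches the closed form (n - 1)(n - 2) / 2 for a caterpillar with n = 4 leaves, while the balanced tree scores 0 throughout.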

Parameters

phylogeny_df (pd.DataFrame): Alife standard DataFrame containing the phylogenetic relationships.
mutate (bool, optional): If True, modify the input DataFrame in place. Default is False.

Returns

pd.DataFrame: Phylogeny DataFrame with an additional column “colless_index_corrected” containing the corrected Colless imbalance index for the subtree rooted at each node.

Raises

ValueError: If phylogeny_df is not strictly bifurcating.

Parameters

phylogeny_df (pd.DataFrame): Alife standard DataFrame containing the phylogenetic relationships.
mutate (bool, optional): If True, modify the input DataFrame in place. Default is False.

Returns

pd.DataFrame: Phylogeny DataFrame with an additional column “colless_like_index_mdm” containing the Colless-like imbalance index for the subtree rooted at each node.

References

Mir, A., Rossello, F., & Rotger, L. (2018). Sound Colless-like balance indices for multifurcating trees. PLOS ONE, 13(9), e0203401. https://doi.org/10.1371/journal.pone.0203401

Parameters

phylogeny_df (pd.DataFrame): Alife standard DataFrame containing the phylogenetic relationships.
mutate (bool, optional): If True, modify the input DataFrame in place. Default is False.

Returns

pd.DataFrame: Phylogeny DataFrame with an additional column “colless_like_index_sd” containing the Colless-like imbalance index for the subtree rooted at each node.

References

Mir, A., Rossello, F., & Rotger, L. (2018). Sound Colless-like balance indices for multifurcating trees. PLOS ONE, 13(9), e0203401. https://doi.org/10.1371/journal.pone.0203401

Parameters

phylogeny_df (pd.DataFrame): Alife standard DataFrame containing the phylogenetic relationships.
mutate (bool, optional): If True, modify the input DataFrame in place. Default is False.

Returns

pd.DataFrame: Phylogeny DataFrame with an additional column “colless_like_index_var” containing the Colless-like imbalance index for the subtree rooted at each node.

References

Mir, A., Rossello, F., & Rotger, L. (2018). Sound Colless-like balance indices for multifurcating trees. PLOS ONE, 13(9), e0203401. https://doi.org/10.1371/journal.pone.0203401

Parameters

phylogeny_df (polars.DataFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

Returns

polars.DataFrame: The phylogeny with an added is_leaf boolean column.

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

values (str or polars.Expr)

Column name or polars expression providing per-node values.

mark_as (str, default “lineage_cumsum”)

Output column name.

reverse (bool, default False)

If True, aggregate over clade rooted at each node.

skipna (bool, default True)

If True, NaN values are treated as identity (0); else propagate.

Parameters

phylogeny_df (polars.DataFrame or polars.LazyFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids and topologically sorted rows.

Returns

polars.DataFrame: The phylogeny with an added node_depth integer column.

Parameters

phylogeny_df (polars.DataFrame)

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

Returns

polars.DataFrame: The phylogeny with an added num_descendants column.

Parameters

phylogeny_df (pd.DataFrame): Alife standard DataFrame containing the phylogenetic relationships.
mutate (bool, optional): If True, modify the input DataFrame in place. Default is False.

Returns

pd.DataFrame: Phylogeny DataFrame with an additional column “num_leaves_sibling”.

alifestd_mark_num_leaves_sibling_polars(phylogeny_df: DataFrame, *, mark_as: str = 'num_leaves_sibling') → DataFrame

Mark the number of leaves descendant from each node’s siblings.

The output column name can be changed via the mark_as parameter.

Nodes with no siblings (e.g., root nodes) will have value 0 marked.

alifestd_mark_num_preceding_leaves_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'num_preceding_leaves') → DataFrame

Add column num_preceding_leaves with count of all leaves occurring before the present node in an inorder traversal.

The output column name can be changed via the mark_as parameter.

For internal nodes, the number of leaf nodes prior to the traversal of first (i.e., leftmost) descendant is marked.

A topological sort will be applied if phylogeny_df is not topologically sorted. Dataframe reindexing (e.g., df.index) may be applied.

Must be a strictly bifurcating tree.

The input dataframe is not mutated by this operation unless mutate is set True. Even if mutate is set True, the operation is not guaranteed to occur in place; still use the return value to get the transformed phylogeny dataframe.
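
As a rough illustration of the count described above, the following pure-Python sketch works on a list of ancestor ids. It assumes a strictly bifurcating tree with contiguous ids, topologically sorted rows, a self-referencing root, and the smaller-id child placed on the left; that left/right ordering is an assumption of this sketch, and `num_preceding_leaves_sketch` is a hypothetical name rather than a library function.

```python
def num_preceding_leaves_sketch(ancestor_ids):
    """Count leaves visited before each node's subtree, left to right."""
    n = len(ancestor_ids)
    children = [[] for _ in range(n)]
    for node, parent in enumerate(ancestor_ids):
        if parent != node:
            children[parent].append(node)  # appended in ascending id order

    num_leaves = [0 if children[node] else 1 for node in range(n)]
    for node in reversed(range(n)):  # children before parents
        parent = ancestor_ids[node]
        if parent != node:
            num_leaves[parent] += num_leaves[node]

    preceding = [0] * n
    for node in range(n):  # parents before children
        offset = preceding[node]
        for child in children[node]:
            # Each child starts where the leaves of earlier siblings end.
            preceding[child] = offset
            offset += num_leaves[child]
    return preceding


# Tree: 0 -> (1, 2); 1 -> (3, 4)
result = num_preceding_leaves_sketch([0, 0, 0, 1, 1])
print(result)  # [0, 0, 2, 0, 1]
```

Inner nodes inherit the count of their leftmost descendant, which is why nodes 0 and 1 are both marked 0 here.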

alifestd_mark_num_preceding_leaves_polars(phylogeny_df: DataFrame, *, mark_as: str = 'num_preceding_leaves') → DataFrame

Add column num_preceding_leaves with count of all leaves occurring before the present node in an inorder traversal.

The output column name can be changed via the mark_as parameter.

alifestd_mark_oldest_root(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'is_oldest_root') → DataFrame

Mark the oldest root, measured by lowest origin_time (if available) or otherwise lowest id.

The output column name can be changed via the mark_as parameter.

The input dataframe is not mutated by this operation unless mutate is set True. Even if mutate is set True, the operation is not guaranteed to occur in place; still use the return value to get the transformed phylogeny dataframe.

alifestd_mark_oldest_root_polars(phylogeny_df: DataFrame, *, mark_as: str = 'is_oldest_root') → DataFrame

Mark the oldest root, measured by lowest origin_time (if available) or otherwise lowest id.

The output column name can be changed via the mark_as parameter.

alifestd_mark_origin_time_delta_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'origin_time_delta') → DataFrame

Add columns origin_time_delta and ancestor_origin_time.

The output column name can be changed via the mark_as parameter.

Dataframe must provide column origin_time.

A topological sort will be applied if phylogeny_df is not topologically sorted. Dataframe reindexing (e.g., df.index) may be applied.

The input dataframe is not mutated by this operation unless mutate is set True. Even if mutate is set True, the operation is not guaranteed to occur in place; still use the return value to get the transformed phylogeny dataframe.
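
The two added columns can be sketched in plain Python. This is a minimal illustration, assuming contiguous ids and roots that list themselves as ancestor (so a root's delta comes out as zero); the function name `origin_time_delta_sketch` is hypothetical.

```python
def origin_time_delta_sketch(ancestor_ids, origin_times):
    """Return (ancestor_origin_time, origin_time_delta) as parallel lists."""
    # Look up each node's ancestor origin time by position.
    ancestor_origin_time = [origin_times[a] for a in ancestor_ids]
    # Delta is the time elapsed between ancestor and descendant.
    delta = [t - at for t, at in zip(origin_times, ancestor_origin_time)]
    return ancestor_origin_time, delta


# Tree: 0 -> (1, 2); 1 -> 3
ancestor_times, deltas = origin_time_delta_sketch([0, 0, 0, 1], [0, 2, 5, 6])
print(ancestor_times)  # [0, 0, 0, 2]
print(deltas)  # [0, 2, 5, 4]
```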

alifestd_mark_origin_time_delta_polars(phylogeny_df: DataFrame, *, mark_as: str = 'origin_time_delta') → DataFrame

Add columns origin_time_delta and ancestor_origin_time.

The output column name can be changed via the mark_as parameter.

Dataframe must provide column origin_time.

alifestd_mark_ot_mrca_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, mark_as: str = 'ot_mrca', progress_wrap: Callable = <function <lambda>>) → DataFrame

Appends columns characterizing the Most Recent Common Ancestor (MRCA) of the entire extant population at each taxon’s origin_time.

The output column name prefix can be changed via the mark_as parameter.

The extant population is defined in terms of active lineages: any branch of the tree existing at an origin_time which contains at least one descendant at or after that time.

New Columns:

ot_mrca_id : int: The unique identifier of the MRCA for the population that was extant at this organism’s origin_time.
ot_mrca_time_of : int or float: The origin_time of that MRCA.
ot_mrca_time_since : int or float: The duration elapsed between the MRCA’s origin_time and this taxon’s origin_time.

A chronological sort will be applied if phylogeny_df is not chronologically sorted. Dataframe reindexing (e.g., df.index) may be applied.

The input dataframe is not mutated by this operation unless mutate is set True. Even if mutate is set True, the operation is not guaranteed to occur in place; still use the return value to get the transformed phylogeny dataframe.

alifestd_mark_ot_mrca_polars(phylogeny_df: DataFrame, *, mark_as: str = 'ot_mrca') → DataFrame

Appends columns characterizing the Most Recent Common Ancestor (MRCA) of the entire extant population at each taxon’s origin_time.

The output column name prefix can be changed via the mark_as parameter.

The extant population is defined in terms of active lineages: any branch of the tree existing at an origin_time which contains at least one descendant at or after that time.

New Columns

ot_mrca_id : int: The unique identifier of the MRCA for the population that was extant at this organism’s origin_time.
ot_mrca_time_of : int or float: The origin_time of that MRCA.
ot_mrca_time_since : int or float: The duration elapsed between the MRCA’s origin_time and this taxon’s origin_time.

Parameters

phylogeny_df : polars.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with a single root.

Returns

polars.DataFrame: The phylogeny with added ot_mrca_id, ot_mrca_time_of, and ot_mrca_time_since columns.

Parameters

phylogeny_df : pd.DataFrame: Alife standard DataFrame containing the phylogenetic relationships.
mutate : bool, optional: If True, modify the input DataFrame in place. Default is False.

Returns

pd.DataFrame: Phylogeny DataFrame with an additional column “sackin_index” containing the Sackin imbalance index for the subtree rooted at each node.

Parameters

phylogeny_df : pandas.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_sample : int, optional

Number of tips to mark. If None, defaults to the count of leaves with the maximum criterion value.

mutate : bool, default False

Are side effects on the input argument phylogeny_df allowed?

criterion : str, default “origin_time”

Column name used to rank leaves. The n_sample leaves with the largest values in this column are marked. Ties are broken arbitrarily.

mark_as : str, default “alifestd_mark_sample_tips_canopy_asexual”

Column name for the boolean mark.

Raises

ValueError: If criterion is not a column in phylogeny_df.

Returns

pandas.DataFrame: The phylogeny with an added boolean mark column.

alifestd_mark_sample_tips_canopy_polars(phylogeny_df: DataFrame, n_sample: int | None = None, criterion: str | Expr = 'origin_time', *, mark_as: str = 'alifestd_mark_sample_tips_canopy_polars') → DataFrame

Mark the n_sample leaves with the largest criterion values.

Adds a boolean column mark_as indicating retained tips.

If n_sample is None, it defaults to the number of leaves that share the maximum value of the criterion column. If n_sample is greater than or equal to the number of leaves in the phylogeny, all leaves are marked. Ties are broken arbitrarily.

Only supports asexual phylogenies.

Parameters

phylogeny_df : polars.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_sample : int, optional

Number of tips to mark. If None, defaults to the count of leaves with the maximum criterion value.

criterion : str or polars.Expr, default “origin_time”

Column name or polars expression used to rank leaves. The n_sample leaves with the largest values are marked. Ties are broken arbitrarily.

mark_as : str, default “alifestd_mark_sample_tips_canopy_polars”

Column name for the boolean mark.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column.
ValueError: If criterion is not a column in phylogeny_df.

Returns

polars.DataFrame: The phylogeny with an added boolean mark column.
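
The canopy selection rule described above can be sketched without any dataframe library. This is a minimal illustration under stated assumptions: the tree is a list of ancestor ids with self-referencing roots, the criterion is a parallel list of values, and ties are resolved by id order (the documented behavior leaves tie-breaking arbitrary). The name `mark_canopy_sketch` is hypothetical.

```python
def mark_canopy_sketch(ancestor_ids, criterion, n_sample=None):
    """Return a boolean list marking the leaves with the largest criterion."""
    n = len(ancestor_ids)
    is_leaf = [True] * n
    for node, parent in enumerate(ancestor_ids):
        if parent != node:
            is_leaf[parent] = False
    leaves = [node for node in range(n) if is_leaf[node]]

    if n_sample is None:
        # Default: as many tips as share the maximum criterion value.
        top = max(criterion[leaf] for leaf in leaves)
        n_sample = sum(criterion[leaf] == top for leaf in leaves)

    # Stable descending sort; slicing past the end simply keeps all leaves.
    ranked = sorted(leaves, key=lambda leaf: criterion[leaf], reverse=True)
    marked = set(ranked[:n_sample])
    return [node in marked for node in range(n)]


# Tree: 0 -> (1, 2); 1 -> (3, 4); leaves 2, 3, 4 have origin times 3, 2, 3.
origin_time = [0, 1, 3, 2, 3]
default_mark = mark_canopy_sketch([0, 0, 0, 1, 1], origin_time)
print(default_mark)  # [False, False, True, False, True]
```

Passing an n_sample at least as large as the number of leaves marks every leaf, mirroring the behavior stated in the description.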

Parameters

phylogeny_df : polars.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_sample : int

Number of tips to mark.

seed : int, optional

Integer seed for deterministic behavior.

mark_as : str, default “alifestd_mark_sample_tips_clade_polars”

Column name for the boolean mark.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column or if ids are non-contiguous or not topologically sorted.

Returns

polars.DataFrame: The phylogeny with an added boolean mark column.

Parameters

phylogeny_df : pandas.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_sample : int

Number of tips to mark.

mutate : bool, default False

Are side effects on the input argument phylogeny_df allowed?

seed : int, optional

Random seed for reproducible target-leaf selection when there are ties in criterion_target.

criterion_delta : str, default “origin_time”

Column name used to compute the off-lineage delta for each leaf.

criterion_target : str, default “origin_time”

Column name used to select the target leaf.

progress_wrap : Callable, optional

Pass tqdm or equivalent to display a progress bar.

mark_as : str, default “alifestd_mark_sample_tips_lineage_asexual”

Column name for the boolean mark.

Raises

ValueError: If criterion_delta or criterion_target is not a column in phylogeny_df.

Returns

pandas.DataFrame: The phylogeny with an added boolean mark column.

alifestd_mark_sample_tips_lineage_polars(phylogeny_df: DataFrame, n_sample: int, seed: int | None = None, *, criterion_delta: str | Expr = 'origin_time', criterion_target: str | Expr = 'origin_time', progress_wrap: Callable = <function <lambda>>, mark_as: str = 'alifestd_mark_sample_tips_lineage_polars') → DataFrame

Mark the n_sample leaves closest to the lineage of a target leaf.

Adds a boolean column mark_as indicating retained tips.

Parameters

phylogeny_df : polars.DataFrame or polars.LazyFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_sample : int

Number of tips to mark.

seed : int, optional

Random seed for reproducible target-leaf selection.

criterion_delta : str or polars.Expr, default “origin_time”

Column name or polars expression used to compute the off-lineage delta for each leaf.

criterion_target : str or polars.Expr, default “origin_time”

Column name or polars expression used to select the target leaf.

progress_wrap : Callable, optional

Pass tqdm or equivalent to display a progress bar.

mark_as : str, default “alifestd_mark_sample_tips_lineage_polars”

Column name for the boolean mark.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column or if ids are non-contiguous or not topologically sorted.
ValueError: If criterion_delta or criterion_target is not a column in phylogeny_df.

Returns

polars.DataFrame: The phylogeny with an added boolean mark column.

Parameters

phylogeny_df : pandas.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_sample : int, optional

Desired number of retained tips. If None, every distinct criterion_stratify value forms its own group.

mutate : bool, default False

Are side effects on the input argument phylogeny_df allowed?

seed : int, optional

Random seed for reproducible target-leaf selection.

criterion_delta : str, default “origin_time”

Column name used to compute the off-lineage delta for each leaf.

criterion_stratify : str, default “origin_time”

Column name used to stratify leaves into groups.

criterion_target : str, default “origin_time”

Column name used to select the target leaf.

n_tips_per_stratum : int, default 1

Number of tips to retain per stratified group.

progress_wrap : Callable, optional

Pass tqdm or equivalent to display a progress bar.

mark_as : str, default “alifestd_mark_sample_tips_lineage_stratified_asexual”

Column name for the boolean mark.

Raises

ValueError: If criterion_delta, criterion_stratify, or criterion_target is not a column in phylogeny_df.
ValueError: If n_sample is not None and n_tips_per_stratum does not evenly divide n_sample.

Returns

pandas.DataFrame: The phylogeny with an added boolean mark column.

alifestd_mark_sample_tips_lineage_stratified_polars(phylogeny_df: DataFrame, n_sample: int | None = None, seed: int | None = None, *, criterion_delta: str | Expr = 'origin_time', criterion_stratify: str | Expr = 'origin_time', criterion_target: str | Expr = 'origin_time', n_tips_per_stratum: int = 1, progress_wrap: Callable = <function <lambda>>, mark_as: str = 'alifestd_mark_sample_tips_lineage_stratified_polars') → DataFrame

Mark leaves per stratified group, chosen by proximity to the lineage of a target leaf.

Adds a boolean column mark_as indicating retained tips.

Only supports asexual phylogenies.

Parameters

phylogeny_df : polars.DataFrame or polars.LazyFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_sample : int, optional

Desired number of retained tips. If None, every distinct criterion_stratify value forms its own group.

seed : int, optional

Random seed for reproducible target-leaf selection.

criterion_delta : str or polars.Expr, default “origin_time”

Column name or polars expression used to compute the off-lineage delta for each leaf.

criterion_stratify : str or polars.Expr, default “origin_time”

Column name or polars expression used to stratify leaves into groups.

criterion_target : str or polars.Expr, default “origin_time”

Column name or polars expression used to select the target leaf.

n_tips_per_stratum : int, default 1

Number of tips to retain per stratified group.

progress_wrap : Callable, optional

Pass tqdm or equivalent to display a progress bar.

mark_as : str, default “alifestd_mark_sample_tips_lineage_stratified_polars”

Column name for the boolean mark.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column or if ids are non-contiguous or not topologically sorted.
ValueError: If criterion_delta, criterion_stratify, or criterion_target is not a column in phylogeny_df.
ValueError: If n_sample is not None and n_tips_per_stratum does not evenly divide n_sample.

Returns

polars.DataFrame: The phylogeny with an added boolean mark column.

Parameters

phylogeny_df : polars.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_sample : int

Number of tips to mark.

seed : int, optional

Integer seed for deterministic behavior.

mark_as : str, default “alifestd_mark_sample_tips_polars”

Column name for the boolean mark.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column.

Returns

polars.DataFrame: The phylogeny with an added boolean mark column.

Parameters

phylogeny_df : polars.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

n_sample : int

Number of tips to mark.

seed : int, optional

Integer seed for deterministic behavior.

mark_as : str, default “alifestd_mark_sample_tips_uniform_polars”

Column name for the boolean mark.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column.

Returns

polars.DataFrame: The phylogeny with an added boolean mark column.

Parameters

phylogeny_df : polars.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous, topologically sorted ids and an ancestor_id column.

ancestor_mask : numpy.ndarray

Boolean array indicating ancestor nodes to propagate from.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column or if ids are non-contiguous or not topologically sorted.

Returns

polars.DataFrame: The input DataFrame with an additional boolean column alifestd_mask_descendants_polars.

alifestd_mask_monomorphic_clades_asexual(phylogeny_df: DataFrame, mutate: bool = False, *, trait_mask: ndarray, trait_values: ndarray) → DataFrame

Compute a mask marking “monomorphic” clades where all members with a trait defined value share the same trait value.

Clades containing no members with a defined trait value are considered monomorphic. All leaf nodes are considered monomorphic.

Parameters

phylogeny_df : pd.DataFrame: DataFrame containing the phylogeny, including an ancestor_id column.
mutate : bool, default=False: If False, operates on a copy of phylogeny_df; if True, modifies phylogeny_df in place (but still returns it).
trait_mask : np.ndarray: Boolean array marking the nodes that have a defined trait value, aligned with phylogeny_df.index.
trait_values : np.ndarray: Array of trait values aligned with phylogeny_df.index.

Returns

pd.DataFrame
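
To make the monomorphic-clade rule concrete, here is a minimal pure-Python sketch. It assumes contiguous ids, topologically sorted rows, self-referencing roots, and that a clade includes the node at its root; the name `mask_monomorphic_clades_sketch` is hypothetical and the library's actual implementation may differ.

```python
def mask_monomorphic_clades_sketch(ancestor_ids, trait_mask, trait_values):
    """Return a boolean list: True where a node's clade is monomorphic."""
    n = len(ancestor_ids)
    seen = [bool(flag) for flag in trait_mask]  # clade has a defined value
    value = list(trait_values)  # representative defined value per clade
    monomorphic = [True] * n

    for node in reversed(range(n)):  # children before parents
        parent = ancestor_ids[node]
        if parent == node:
            continue  # roots have nothing to propagate to
        if not monomorphic[node]:
            monomorphic[parent] = False
        elif seen[node]:
            if not seen[parent]:
                # First defined value encountered in the parent's clade.
                seen[parent], value[parent] = True, value[node]
            elif value[parent] != value[node]:
                monomorphic[parent] = False
    return monomorphic


# Tree: 0 -> (1, 2); 1 -> (3, 4); only the leaves have defined traits.
mask = mask_monomorphic_clades_sketch(
    [0, 0, 0, 1, 1],
    [False, False, True, True, True],
    [0, 0, 1, 2, 2],
)
print(mask)  # [False, True, True, True, True]
```

The clade at node 1 holds only the value 2 and stays monomorphic, while the root clade mixes values 1 and 2.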

alifestd_parse_ancestor_id(ancestor_list_str: str) → int | None: Parse at most a single ancestor id from an ancestor_list field.

alifestd_parse_ancestor_ids(ancestor_list_str: str) → List[int]: Parse ancestor ids from an ancestor_list field.
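
For orientation, the kind of parsing these two helpers perform might look like the sketch below. It assumes ancestor_list fields are bracketed, comma-separated strings such as "[12]" or "[1, 2]", with "[none]" (in any letter case) or "[]" denoting a root. The function names are hypothetical, and raising ValueError on multiple ancestors is a choice of this sketch, not documented library behavior.

```python
from typing import List, Optional


def parse_ancestor_ids_sketch(ancestor_list_str: str) -> List[int]:
    """Parse all ancestor ids from a bracketed ancestor_list string."""
    inner = ancestor_list_str.strip().strip("[]")
    tokens = [token.strip() for token in inner.split(",")]
    # Skip empty tokens and the root placeholder.
    return [int(t) for t in tokens if t and t.lower() != "none"]


def parse_ancestor_id_sketch(ancestor_list_str: str) -> Optional[int]:
    """Parse at most one ancestor id; None signals a root."""
    ids = parse_ancestor_ids_sketch(ancestor_list_str)
    if len(ids) > 1:
        raise ValueError("expected at most one ancestor id")
    return ids[0] if ids else None


print(parse_ancestor_ids_sketch("[1, 2]"))  # [1, 2]
print(parse_ancestor_id_sketch("[none]"))  # None
```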

alifestd_pipe_unary_ops(phylogeny_df: DataFrame, *unary_ops: Callable[[DataFrame], DataFrame], progress_wrap: Callable = <function <lambda>>) → DataFrame

Pipe a phylogeny DataFrame through a sequence of unary operations.

Each operation in unary_ops is applied in order to the DataFrame.

Parameters

phylogeny_df : pandas.DataFrame: The phylogeny as a dataframe in alife standard format.
*unary_ops : callable: Zero or more callables, each accepting and returning a DataFrame.
progress_wrap : callable, optional: Optional wrapper for unary_ops to provide progress feedback (e.g. tqdm).

Returns

pandas.DataFrame: The result of piping phylogeny_df through each operation in order.
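
The piping pattern is a left fold over the operations. The sketch below shows the idea on plain Python lists rather than DataFrames so that it runs without dependencies; the name `pipe_unary_ops_sketch` is hypothetical, and treating the default progress_wrap as an identity function is an assumption.

```python
from functools import reduce


def pipe_unary_ops_sketch(value, *unary_ops, progress_wrap=lambda ops: ops):
    """Apply each unary operation in order, feeding each result onward."""
    # progress_wrap receives the iterable of operations, as tqdm would.
    return reduce(lambda acc, op: op(acc), progress_wrap(unary_ops), value)


# Stand-in operations on a list; real use would pass DataFrame transforms.
result = pipe_unary_ops_sketch([3, 1, 2], sorted, lambda xs: xs[:2])
print(result)  # [1, 2]
```

With zero operations the input is returned unchanged, which follows directly from the fold's initial value.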

Parameters

phylogeny_df : polars.DataFrame: The phylogeny as a dataframe in alife standard format.
*unary_ops : callable: Zero or more callables, each accepting and returning a DataFrame.
progress_wrap : callable, optional: Optional wrapper for unary_ops to provide progress feedback (e.g. tqdm).

Returns

polars.DataFrame: The result of piping phylogeny_df through each operation in order.

Parameters

phylogeny_df : pandas.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

mutate : bool, default False

Are side effects on the input argument phylogeny_df allowed?

criterion : str, default “extant”

Column name used to determine extant taxa.

Raises

ValueError: If criterion is not a column in phylogeny_df.

Returns

pandas.DataFrame: The pruned phylogeny in alife standard format.

alifestd_prune_extinct_lineages_polars(phylogeny_df: DataFrame, *, criterion: str = 'extant') → DataFrame

Drop taxa without extant descendants.

The criterion column is used to determine extant taxa.

Parameters

phylogeny_df : polars.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

criterion : str, default “extant”

Column name used to determine extant taxa.

Raises

NotImplementedError: If phylogeny_df has no “ancestor_id” column.
NotImplementedError: If phylogeny_df has non-contiguous ids.
NotImplementedError: If phylogeny_df is not topologically sorted.
ValueError: If criterion is not a column in phylogeny_df.

Returns

polars.DataFrame: The pruned phylogeny in alife standard format.
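
The pruning rule can be sketched as a single reverse pass over a topologically sorted tree. This is an illustration only, assuming contiguous ids, self-referencing roots, and that an extant taxon counts as part of its own lineage; `prune_extinct_lineages_sketch` is a hypothetical name, and the sketch returns retained ids without reassigning them.

```python
def prune_extinct_lineages_sketch(ancestor_ids, extant):
    """Return ids of taxa that are extant or have an extant descendant."""
    n = len(ancestor_ids)
    keep = [bool(flag) for flag in extant]
    for node in reversed(range(n)):  # children before parents
        if keep[node]:
            # A retained node keeps its ancestor alive too.
            keep[ancestor_ids[node]] = True
    return [node for node in range(n) if keep[node]]


# Tree: 0 -> (1, 2); 1 -> (3, 4); only taxon 3 is extant.
retained = prune_extinct_lineages_sketch(
    [0, 0, 0, 1, 1], [False, False, False, True, False]
)
print(retained)  # [0, 1, 3]
```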

Parameters

phylogeny_df : pandas.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

new_root_id : int

The ID of the node to use as the new root of the phylogeny.

mutate : bool, default False

Are side effects on the input argument phylogeny_df allowed?

Returns

pandas.DataFrame: The rerooted phylogeny in alife standard format.

alifestd_reroot_at_id_polars(phylogeny_df: DataFrame, new_root_id: int) → DataFrame

Reroot phylogeny at specified node id, preserving topology.

Reverses the descendant-to-ancestor relationships of all ancestors of the new root. Does not update branch_lengths or edge_lengths columns if present.

Parameters

phylogeny_df : polars.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny.

new_root_id : int

The ID of the node to use as the new root of the phylogeny.

Returns

polars.DataFrame: The rerooted phylogeny in alife standard format.
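
The reversal of ancestor relationships along the path from the new root to the old root can be sketched as follows. The sketch assumes a list of ancestor ids with contiguous ids and a self-referencing root; `reroot_at_id_sketch` is a hypothetical name, and branch lengths are not handled, in line with the note above.

```python
def reroot_at_id_sketch(ancestor_ids, new_root_id):
    """Return ancestor ids after rerooting at new_root_id."""
    result = list(ancestor_ids)
    came_from, node = new_root_id, new_root_id
    while True:
        parent = ancestor_ids[node]  # read the original relationship
        result[node] = came_from  # point back toward the new root
        if parent == node:
            break  # reached the old root
        came_from, node = node, parent
    return result


# Tree: 0 -> (1, 2); 1 -> (3, 4); reroot at node 3.
rerooted = reroot_at_id_sketch([0, 0, 0, 1, 1], 3)
print(rerooted)  # [1, 3, 0, 3, 1]
```

Only nodes on the path 3, 1, 0 change ancestors. In this sketch the result is no longer topologically sorted by id (node 0 now descends from node 1), so a re-sort would be needed before operations that require it.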

alifestd_sample_triplet_comparisons_asexual(first_df: DataFrame, second_df: DataFrame, taxon_label_key: str, n: int = 1000, progress_wrap: Callable = <function <lambda>>, mutate: bool = False) → DataFrame

Sample triplet comparisons between two asexual phylogenetic trees in alife standard form, creating a DataFrame with the triplet categorizations and comparison results as well as corresponding data from the MRCA row within the first tree.

The MRCA row corresponds to the most recent common ancestor of two of the three taxa in the triplet.

Parameters

first_df : pd.DataFrame

The DataFrame representing the first phylogenetic tree.

second_df : pd.DataFrame

The DataFrame representing the second phylogenetic tree.

taxon_label_key : str

The key in the DataFrame to identify the taxon labels.

n : int, default 1000

The number of samples to take.

Corresponds to number of rows in the returned DataFrame.

progress_wrap : typing.Callable, optional

Pass tqdm or equivalent to display a progress bar.

mutate : bool, default False

If True, allows mutation of input DataFrames.

Returns

pd.DataFrame

A DataFrame with rows corresponding to sampled triplet comparisons and the following columns:

- “triplet code, {first,second}”: the categorization of the triplet in the first or second tree.
- “triplet match, {lax,lax/strict,strict,strict/lax}”: whether the triplet categorizations match with differing treatment of polytomies.
- all columns from the first tree.

Notes

The core comparison is done by sampling triplets of taxa, categorizing them, and comparing these categorizations across the two trees, taking into account the strict and lax parameters for handling polytomies. See alifestd_categorize_triplet_asexual for details.

Parameters

phylogeny_df : pandas.DataFrame: The phylogeny as a dataframe in alife standard format.
criterion : str: Name of the column to sort children by.
reverse : bool, default False: If True, sort descending (higher values first).
mutate : bool, default False: If True, allow mutation of the input dataframe.

Returns

pandas.DataFrame: The phylogeny with rows reordered by sorted children traversal.

Parameters

phylogeny_df : polars.DataFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids and topologically sorted rows.

criterion : str or polars.Expr

Name of the column to sort children by, or a polars expression whose values determine the sort order.

reverse : bool, default False

If True, sort descending (higher values first).

Returns

polars.DataFrame: The phylogeny with rows reordered by sorted children traversal.

Raises

NotImplementedError: If ids are not contiguous or rows are not topologically sorted.

Parameters

phylogeny_df : pd.DataFrame: Asexual phylogeny in alife standard format with contiguous ids and topologically sorted rows.
mutate : bool, default False: If True, allow modification of the input dataframe.

Returns

AlifestdIplotxShimPandas: An iplotx-compatible tree provider that can be passed directly to iplotx.tree().

alifestd_to_iplotx_polars(phylogeny_df: DataFrame) → AlifestdIplotxShimPolars

Wrap a polars phylogeny DataFrame for use with iplotx.

Parameters

phylogeny_df : polars.DataFrame: Asexual phylogeny in alife standard format with contiguous ids and topologically sorted rows.

Returns

AlifestdIplotxShimPolars: An iplotx-compatible tree provider that can be passed directly to iplotx.tree().

alifestd_to_working_format(phylogeny_df: DataFrame, mutate: bool = False) → DataFrame

Re-encode phylogeny_df to facilitate efficient analysis and transformation operations.

The returned phylogeny dataframe will

* be topologically sorted (i.e., organisms appear after all ancestors),
* have contiguous ids (i.e., organisms’ ids correspond to row number), and
* contain an integer datatype ancestor_id column if the phylogeny is asexual (i.e., a more performant representation of ancestor_list).

The input dataframe is not mutated by this operation unless mutate is set True. Even if mutate is set True, the operation is not guaranteed to occur in place; still use the return value to get the transformed phylogeny dataframe.

alifestd_to_working_format_polars(phylogeny_df: DataFrame, keep_ancestor_list: bool = False) → DataFrame

Re-encode phylogeny_df to facilitate efficient analysis and transformation operations.

The returned phylogeny dataframe will

* be topologically sorted (i.e., organisms appear after all ancestors),
* have contiguous ids (i.e., organisms’ ids correspond to row number), and
* contain an integer datatype ancestor_id column if the phylogeny is asexual (i.e., a more performant representation of ancestor_list).

Parameters

phylogeny_df : polars.DataFrame: The phylogeny as a dataframe in alife standard format.
keep_ancestor_list : bool, default False: If True and ancestor_list was present in the input, regenerate the ancestor_list column from the (reassigned) ancestor_id column. The column is dropped during processing in all cases; it is only restored when this flag is set and the input already had it.

Parameters

insert : bool: Whether the operation inserts new nodes.
delete : bool: Whether the operation deletes nodes.
update : bool: Whether the operation updates ancestor relationships.

Returns

typing.Callable: A decorator that wraps a function with topological sensitivity warning logic.

Parameters

insert : bool: Whether the operation inserts new nodes.
delete : bool: Whether the operation deletes nodes.
update : bool: Whether the operation updates ancestor relationships.

Returns

typing.Callable: A decorator that wraps a function with topological sensitivity warning logic.

Notes

Even if allowed by the mutate flag, no side effects occur on the input dataframe under the Polars implementation. The flag is included for API compatibility with the Pandas implementation.

Parameters

phylogeny_df : polars.DataFrame or polars.LazyFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual, strictly bifurcating phylogeny with contiguous ids and topologically sorted rows.

Returns

np.ndarray: Index array giving inorder traversal order.

Parameters

phylogeny_df : polars.DataFrame or polars.LazyFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids and topologically sorted rows.

Returns

np.ndarray: Index array giving levelorder (BFS) traversal order.

Parameters

phylogeny_df : pd.DataFrame: Asexual phylogeny in alife standard format with contiguous ids and topologically sorted rows.
mutate : bool, default False: If True, allow modification of the input dataframe.
child_order : {“asc”, “desc”, None}, default None: Order in which siblings are visited when descending the tree. "asc" visits smallest-id child first, "desc" visits largest-id child first, and None uses an arbitrary (implementation-defined) order.

alifestd_unfurl_traversal_postorder_contiguous_polars(phylogeny_df: DataFrame, child_order: Literal['asc', 'desc'] | None = None) → ndarray

List node indices in DFS postorder traversal order, with subtree contiguity.

Parameters

phylogeny_df : polars.DataFrame or polars.LazyFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids and topologically sorted rows.

child_order : {“asc”, “desc”, None}, default None

Order in which siblings are visited when descending the tree. "asc" visits smallest-id child first, "desc" visits largest-id child first, and None uses an arbitrary (implementation-defined) order.

Returns

np.ndarray: Index array giving DFS postorder traversal order.
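
The effect of child_order on a DFS postorder traversal can be seen in a small dependency-free sketch. It assumes a list of ancestor ids with contiguous ids and self-referencing roots, and it treats None the same as "asc" (the documentation leaves that case implementation-defined). The name `postorder_sketch` is hypothetical, and the sketch returns a list rather than an ndarray.

```python
def postorder_sketch(ancestor_ids, child_order=None):
    """List node ids in DFS postorder, visiting siblings per child_order."""
    n = len(ancestor_ids)
    children = [[] for _ in range(n)]
    roots = []
    for node, parent in enumerate(ancestor_ids):
        (roots if parent == node else children[parent]).append(node)

    descending = child_order == "desc"
    order = []
    # Push in reverse of visiting order so the first-visited node pops first.
    stack = [(root, False) for root in sorted(roots, reverse=not descending)]
    while stack:
        node, expanded = stack.pop()
        if expanded:
            order.append(node)  # all descendants were already emitted
            continue
        stack.append((node, True))
        for child in sorted(children[node], reverse=not descending):
            stack.append((child, False))
    return order


# Tree: 0 -> (1, 2); 1 -> (3, 4)
asc_order = postorder_sketch([0, 0, 0, 1, 1], "asc")
desc_order = postorder_sketch([0, 0, 0, 1, 1], "desc")
print(asc_order)  # [3, 4, 1, 2, 0]
print(desc_order)  # [2, 4, 3, 1, 0]
```

In both orders each subtree occupies a contiguous run that ends with its own root, which is the subtree contiguity property named in the summary.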

Parameters

phylogeny_df : polars.DataFrame or polars.LazyFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids and topologically sorted rows.

Returns

np.ndarray: Index array giving postorder traversal order.

Parameters

phylogeny_df : polars.DataFrame or polars.LazyFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids and topologically sorted rows.

Returns

np.ndarray: Index array giving DFS preorder traversal order.

Parameters

phylogeny_df : polars.DataFrame or polars.LazyFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual, strictly bifurcating phylogeny with contiguous ids and topologically sorted rows.

Returns

np.ndarray: Index array giving semiorder traversal order.

Parameters

phylogeny_df : polars.DataFrame or polars.LazyFrame

The phylogeny as a dataframe in alife standard format.

Must represent an asexual phylogeny with contiguous ids.

Returns

np.ndarray: Index array giving topological traversal order.

Parameters

phylogeny_df : pandas.DataFrame: The phylogeny as a dataframe in alife standard format.
caller : str: Name of the calling function, included in the warning message.
insert : bool: Whether the operation inserts new nodes.
delete : bool: Whether the operation deletes nodes.
update : bool: Whether the operation updates ancestor relationships.

Input dataframe is not mutated by this operation.

Parameters

phylogeny_df : polars.DataFrame or polars.LazyFrame: The phylogeny as a dataframe in alife standard format.
caller : str: Name of the calling function, included in the warning message.
insert : bool: Whether the operation inserts new nodes.
delete : bool: Whether the operation deletes nodes.
update : bool: Whether the operation updates ancestor relationships.
