Insane tree data access and processing functions
Basic tree information
Basic information about the tree, such as the number of tips, the number of extinct tips, the number of fossils, the tree height (duration of the tree), and the tree length (sum of all branch lengths), can be obtained using, respectively:
ntips(tree)
ntipsextinct(tree)
nfossils(tree)
treeheight(tree)
treelength(tree)
Tree vector statistics
Julia makes it simple to look at statistics across a vector of trees. For example, using the Statistics package, we can estimate the average number of extinct species across the tree vector tv with:
using Statistics
mean(ntipsextinct, tv)
See the documentation of mean for more details; in short, mean and many other Julia functions accept a function as their first argument and apply it to each element before aggregating. In this case, we count the number of extinct tips in each tree in tv and then average these counts.
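To make the `mean(f, collection)` pattern concrete without depending on Tapestree, here is a minimal sketch. The `ToyTree` type and the `toytip`, `toynode` and `toy_ntipsextinct` helpers are hypothetical stand-ins invented for this illustration; they are not part of INSANE.

```julia
using Statistics

# Hypothetical toy tree: a node is a tip when it has no daughters.
struct ToyTree
    d1::Union{ToyTree,Nothing}   # first daughter (nothing for a tip)
    d2::Union{ToyTree,Nothing}   # second daughter (nothing for a tip)
    extinct::Bool                # only meaningful for tips
end

# Convenience constructors for tips and internal nodes.
toytip(extinct::Bool) = ToyTree(nothing, nothing, extinct)
toynode(d1::ToyTree, d2::ToyTree) = ToyTree(d1, d2, false)

# Count extinct tips recursively (a stand-in for `ntipsextinct`).
function toy_ntipsextinct(t::ToyTree)
    t.d1 === nothing && return Int(t.extinct)
    return toy_ntipsextinct(t.d1) + toy_ntipsextinct(t.d2)
end

# A toy "tree vector" with 1, 2 and 0 extinct tips, respectively.
toy_tv = [
    toynode(toytip(true), toytip(false)),
    toynode(toynode(toytip(true), toytip(true)), toytip(false)),
    toynode(toytip(false), toytip(false)),
]

# Apply the function to each tree, then average the results.
avg_extinct = mean(toy_ntipsextinct, toy_tv)
```

Passing the function as the first argument avoids building the intermediate vector that `mean(map(toy_ntipsextinct, toy_tv))` would allocate.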
Tree labels and obtaining subtrees
For labelled trees, one can extract the tip labels using tiplabels. Moreover we can create subclades based on a vector of tips, where the subclade will be the minimum tree that has all the tips. For example, for a vector tip_vector holding Strings that correspond to the tip labels in the tree of type sT_label, we can use
subclade(tree, tip_vector)
However, most of the time we want to extract subclades of other types of trees. Since these do not hold label information but should follow the same order as the sT_label tree, one has to use both.
If you change the order of either the sT_label tree or the single tree or vector of trees of other types, this will not work.
Thus, if we want the subclades that contain the tips in tip_vector, we can use
subclade(tv, tree, tip_vector, true)
where the last argument (stem) states whether to return the stem tree or the crown tree.
Lineage and Diversity through time (LTT & DTT)
One can also estimate the Lineage Through Time (LTT), or, for a data augmented tree (or a vector of trees), the Diversity Through Time (DTT) using
ltt(tree)
To be clear, the LTT is usually used to describe the accumulation of reconstructed lineages (those that have been sampled) while DTT is used to describe estimated diversity (sampled and unsampled lineages). Thus, the result is either LTT or DTT simply depending on the tree you use as input.
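As an illustration of what a lineages-through-time curve records, the following sketch counts lineages from a list of event times. The function `toy_ltt` is hypothetical and is not how `ltt` is implemented; it also assumes time is measured forward from the root, which may differ from the convention used by INSANE.

```julia
# Hypothetical helper: lineage counts from event times measured forward from the root.
function toy_ltt(speciations::Vector{Float64}, extinctions::Vector{Float64}; n0::Int = 1)
    # Each speciation adds one lineage, each extinction removes one.
    events = vcat([(t, 1) for t in speciations], [(t, -1) for t in extinctions])
    sort!(events; by = first)
    times  = [0.0]
    counts = [n0]
    for (t, change) in events
        push!(times, t)
        push!(counts, counts[end] + change)
    end
    return times, counts
end

# Three speciation events and one extinction event.
times, counts = toy_ltt([1.0, 2.0, 3.5], [2.5])
```

Feeding only the events of sampled lineages gives an LTT-like curve, while including the events of unsampled and extinct lineages gives a DTT-like curve.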
We can also estimate the ltt for a tree vector tv using
ltt(tv)
Trees with diffusion information (e.g., BDD, FBDD, DBM)
Estimating posterior average rates along the tree
Of particular interest is the estimation of posterior average rates along the reconstructed tree. Since the data augmented (unsampled) lineages change between different iterations of the algorithm, we obtain lineage-specific instantaneous rate distributions only for the reconstructed (observed) part of the trees (the tree we used as input). Consequently, we first need to remove the data augmented lineages from all the trees in the posterior tree vector:
tv0 = remove_unsampled(tv)
We can then estimate the average tree using
tm = imean(tv0)
We can also estimate any quantile tree, for instance, the $0.25$ quantile tree:
t025 = iquantile(tv0, 0.25)
These resulting trees can then be further scrutinized like any other tree in INSANE.
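The idea of summarizing a posterior point by point can be sketched with plain arrays. The matrix below is made-up data standing in for posterior samples of a rate along a single branch; whether `imean` and `iquantile` operate on the log or the natural scale is not stated here, so this only illustrates averaging across iterations in general.

```julia
using Statistics

# Hypothetical posterior samples of a rate at 3 points along one branch
# (rows = posterior iterations, columns = points along the branch).
samples = [0.1 0.2 0.3;
           0.3 0.4 0.5;
           0.2 0.3 0.4;
           0.4 0.5 0.6]

# Pointwise posterior mean: average over iterations for each point.
pointwise_mean = vec(mean(samples; dims = 1))

# Pointwise 0.25 quantile, computed column by column.
pointwise_q25 = [quantile(col, 0.25) for col in eachcol(samples)]
```

This only works because every iteration shares the same set of points, which is why the data augmented lineages have to be removed before averaging trees.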
Attribute wrappers
For convenience, Tapestree provides the following tree attribute wrappers:
- birth: To obtain speciation rates (i.e., x -> exp.(lλ(x)))
- logbirth: To obtain the logarithm of speciation rates (i.e., x -> lλ(x))
- death: To obtain extinction rates (i.e., x -> exp.(lμ(x)))
- logdeath: To obtain the logarithm of extinction rates (i.e., x -> lμ(x))
- turnover: To obtain turnover rates (i.e., x -> exp.(lμ(x) .- lλ(x)))
- diversification: To obtain net diversification rates (i.e., x -> exp.(lλ(x)) .- exp.(lμ(x)))
- trait: To obtain trait values (i.e., x -> xv(x))
- logtrait: To obtain the logarithm of trait values (i.e., x -> log.(xv(x)))
- traitrate: To obtain trait evolution rates (i.e., x -> exp.(lσ2(x)))
- logtraitrate: To obtain the logarithm of trait evolution rates (i.e., x -> lσ2(x))
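The arithmetic behind these wrappers can be checked on plain vectors. The values below are made up, and `logλ` and `logμ` are hypothetical names standing in for the log-rates that `lλ` and `lμ` return for a tree.

```julia
# Hypothetical log-rates at three points along a branch.
logλ = [log(0.5), log(0.4), log(0.25)]   # log speciation rates
logμ = [log(0.1), log(0.2), log(0.25)]   # log extinction rates

λ = exp.(logλ)                           # same transformation as `birth`
μ = exp.(logμ)                           # same transformation as `death`

# Turnover: exponentiating a difference of logs equals the ratio μ ./ λ.
turnover_rates = exp.(logμ .- logλ)

# Net diversification: difference of the rates on the natural scale.
netdiv = exp.(logλ) .- exp.(logμ)
```

Note that turnover is a ratio of rates while net diversification is a difference, so they are computed on different scales.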
Other data access and averaging functions
The value of function f at the tips of the tree and any fossil samples can be obtained using the tipget function. For example, to obtain the speciation rates for sampled species from a data augmented tree treeda (any tree output when running inference), use
tipget(treeda, tree, birth)
where tree is the labelled tree used as input (of type sT_label or sTf_label). This function returns a dictionary of labels pointing to the specific value returned by f.
A common need is to obtain the posterior value of function f for each species. This can be done by first following Estimating posterior average rates along the tree and, assuming the resulting posterior average tree is named tm, then using
tipget(tm, tree, f)
to get any attribute returned by f (e.g., speciation rates, extinction rates, traits, trait rates; see Attribute wrappers for the available functions).
To obtain the range (i.e., extrema) of the output of function f on tree, use irange. For example, for the minimum and maximum speciation rates:
irange(tree, birth)
If one wants to sample, recursively, some function at regular intervals along a tree, one can use sample. For example, if we want to sample speciation rates every $0.1$ time units, we can use
sample(tv, birth, 0.1)
Here we are sampling along each branch of the tree in recursive order, not sampling across lineages through time.
If we would like to extract the output of function f across lineages in a given tree, we would use time_rate. For example, if we want the cross-lineage extinction rates of a tree of type iTbd sampled every $0.5$ time units, we would use
time_rate(tv, death, 0.5)
which returns a vector of vectors, where each element corresponds to a time point and holds the rates (in this case extinction rates) of all contemporary lineages at that time.
Finally, a convenience wrapper to extract information recursively from a tree is trextract. For example, if we want all branch lengths for a tree, we can use
trextract(tree, e)
Below are some functions to obtain data from trees.
Full documentation
Tapestree.INSANE.tiplabels — Function
tiplabels(tree::T) where {T <: Tlabel}
Returns tip labels for sT_label and sTf_label.
Tapestree.INSANE.ntips — Function
ntips(tree::T) where {T <: iTree}
Return the number of tip nodes for tree.
Tapestree.INSANE.ntipsalive — Function
ntipsalive(tree::T) where {T <: iTree}
Return the number of alive nodes for tree.
Tapestree.INSANE.ntipsextinct — Function
ntipsextinct(tree::T) where {T <: iTree}
Return the number of extinct nodes for tree.
ntipsextinct(Ξ::Vector{T}) where {T <: iTree}
Return the number of extinct nodes in Ξ.
Tapestree.INSANE.treeheight — Function
treeheight(tree::T) where {T <: iTree}
Return the tree height of tree.
treeheight(tree::T) where {T <: Union{iTf, iTpbd}}
Return the tree height of tree.
treeheight(tree::T, nd::Int64) where {T <: iTree}
Return the tree height of tree.
treeheight(tree::T, nd::Int64) where {T <: Union{iTf, iTpbd}}
Return the tree height of tree.
Tapestree.INSANE.treelength — Function
treelength(tree::T) where {T <: iTree}
Return the branch length sum of tree.
treelength(tree::T, ets::Vector{Float64}) where {T <: Union{iTf, iTpbd}}
Return the branch length sum of tree at different epochs, initialized at l.
treelength(Ξ::Vector{T}) where {T <: iTree}
Return the branch length sum of Ξ.
treelength(Ξ ::Vector{T},
           ets::Vector{Float64},
           bst::Vector{Float64},
           eix::Vector{Int64}) where {T <: iTf}
Return the branch length sum of tree at different epochs, initialized at l.
Tapestree.INSANE.ltt — Function
ltt(tree::T) where {T <: iTree}
Returns number of species through time.
ltt(tree::Vector{T}) where {T <: iTree}
Returns number of species through time for a tree vector.
ltt(tree::T, tor::Float64) where {T <: iTree}
Returns number of species through time for a tree vector.
Tapestree.INSANE.iscrowntree — Function
iscrowntree(tree::T) where {T <: iTree}
Return if the tree is a crown tree.
Tapestree.INSANE.irange — Function
irange(tree::T, f::Function) where {T <: iTree}
Return the extrema of the output of function f on tree.
Tapestree.INSANE.tipget — Function
tipget(treeda::T, tree::D, f::Function) where {T <: iTree, D <: Tlabel}
Return function f for tips or fossils in treeda with labels from tree.
Tapestree.INSANE.time_rate — Function
time_rate(tree::T, f::Function, δt::Float64) where {T <: iT}
Extract values from f function at times sampled every δt across the tree.
Tapestree.INSANE.trextract — Function
trextract(tree::iTree, f::Function)
Perform function f in each recursive tree in tree.
Tapestree.INSANE.subclade — Function
subclade(tree::iTree, ix::Int64)
Return the minimum stem subclade according to recursive position ix.
subclade(trees::Vector{T},
         ltree::sT_label,
         tips ::Vector{String},
         stem ::Bool) where {T <: iTree}
Return the minimum subclade that includes tip labels in tips.
subclade(tree::sT_label, tips::Vector{String})
Return the minimum subclade that includes tip labels in tips.
subclade(tree::iTree,
         ltree::sT_label,
         tips ::Vector{String},
         stem ::Bool)
Return the minimum subclade that includes tip labels in tips.
Tapestree.INSANE.lλ — Function
lλ(tree::T) where {T <: iT}
Return the speciation rate (speciation completion in a protracted model).
Tapestree.INSANE.lμ — Function
lμ(tree::iTbdU)
Return the extinction rate.
Tapestree.INSANE.e — Function
e(id::iB)
Return initial absolute time.
e(tree::T) where {T <: iTree}
Return edge length.
Insane tree manipulation functions
Two important manipulation functions remove lineages from trees. The first removes extinct lineages and can be applied to a tree or a tree vector using
remove_extinct(tree)
Similarly, as shown above, one can remove the unsampled lineages (all the data augmented lineages) from a single tree or a vector of trees using
remove_unsampled(tree)
Note that remove_extinct and remove_unsampled are different. First, simulated trees are not fixed, so running remove_unsampled on one removes the whole tree; you have to fix the tree beforehand, which can be done using fixtree!(tree). Second, if the sampling fraction is not $1$, remove_unsampled also removes living lineages that were not sampled, while remove_extinct removes only extinct lineages.
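The difference between the two filters can be sketched on a flat list of tips. The records below are hypothetical; real INSANE trees store this information on their nodes and prune whole branches, which this toy example does not attempt.

```julia
# Hypothetical tip records standing in for the tips of a data augmented tree.
tips = [
    (name = "A", extinct = false, sampled = true),
    (name = "B", extinct = true,  sampled = false),  # went extinct
    (name = "C", extinct = false, sampled = false),  # alive but not sampled
]

# Analogue of remove_extinct: keep every tip that is alive.
kept_alive = [t.name for t in tips if !t.extinct]

# Analogue of remove_unsampled: keep only tips that were sampled.
kept_sampled = [t.name for t in tips if t.sampled]
```

Tip "C" survives the first filter but not the second, which is the case that arises when the sampling fraction is below $1$.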
For fossil trees, one can remove all fossils using
remove_fossils(tree)
or make a given tree a fossil by using
fossilize!(tree)
which makes only that specific tree a fossil (not its recursive daughters).
Full documentation
Tapestree.INSANE.reorder! — Function
reorder!(tree::T) where {T <: iTree}
Reorder daughter branches according to number of tips, with daughter 1 always having more than daughter 2.
reorder!(tree::T, treeda::D) where {T <: iTree, D <: iTree}
Reorder data augmented tree treeda according to tree.
Tapestree.INSANE.rm_stem! — Function
rm_stem!(tree::T) where {T <: iTree}
Removes stem branch.
Tapestree.INSANE.fixtree! — Function
fixtree!(tree::T) where {T <: iTree}
Fix all tree.
fixtree!(tree::T) where {T <: iTf}
Fix all tree.
Tapestree.INSANE.remove_extinct — Function
remove_extinct(tree::T) where {T <: iTree}
Remove extinct tips from iTce.
remove_extinct(treev::Vector{T}) where {T <: iTree}
Remove extinct taxa for a vector of trees.
Tapestree.INSANE.remove_unsampled — Function
remove_unsampled(tree::T) where {T <: iTree}
Remove unsampled tips from iTree.
remove_unsampled(treev::Vector{T}) where {T <: iTree}
Remove unsampled taxa for a vector of trees.