In phylogenetic inference, an evolutionary model describes the substitution processes along each edge of a phylogenetic tree.
Misspecification of the model has important implications for the analysis of phylogenetic data. Conventionally, however, the
selection of a suitable evolutionary model is based on heuristics or relies on the choice of an approximate input tree. We
introduce a method for model Selection in Phylogenetics based on linear INvariants (SPIn), which uses recent insights on
linear invariants to characterize a model of nucleotide evolution for phylogenetic mixtures on any number of components.
Linear invariants are constraints among the joint probabilities of the bases in the operational taxonomic units that hold
irrespective of the tree topologies appearing in the mixtures. SPIn therefore requires no input tree and is designed to deal
with nonhomogeneous phylogenetic data consisting of multiple sequence alignments showing different patterns of evolution,
for example, concatenated genes, exons, and/or introns. Here, we report on the results of the proposed method evaluated on
multiple sequence alignments simulated under a variety of single-tree and mixture settings for both continuous- and discrete-time
models. In the simulations, SPIn successfully recovers the underlying evolutionary model and is shown to perform better
than existing approaches.
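
To make the idea of a linear invariant concrete, the following is a minimal sketch and not the SPIn procedure itself. It assumes a three-leaf star tree with a uniform root distribution, and all function names and parameter values are hypothetical choices for illustration. Under the Jukes-Cantor model, swapping C and G leaves every pattern probability unchanged, so p(AAC) - p(AAG) = 0 is a linear constraint that holds for any edge parameters and, by linearity, for any mixture of such trees. Under the Kimura two-parameter model with unequal transition and transversion probabilities, the same constraint generally fails, which suggests how constraints of this kind can discriminate between models.

```python
from itertools import product

BASES = "ACGT"


def jc_matrix(p_change):
    """Jukes-Cantor edge matrix: each of the 3 possible changes has probability p_change / 3."""
    off = p_change / 3.0
    return [[1.0 - p_change if i == j else off for j in range(4)] for i in range(4)]


def k80_matrix(p_transition, p_transversion):
    """Kimura 2-parameter edge matrix: A<->G and C<->T are transitions."""
    transitions = {(0, 2), (2, 0), (1, 3), (3, 1)}
    rows = []
    for i in range(4):
        row = []
        for j in range(4):
            if i == j:
                row.append(1.0 - p_transition - 2.0 * p_transversion)
            elif (i, j) in transitions:
                row.append(p_transition)
            else:
                row.append(p_transversion)
        rows.append(row)
    return rows


def star_tree_joint(edge_matrices):
    """Joint probabilities of leaf patterns on a star tree with a uniform root (assumption)."""
    joint = {}
    for pattern in product(range(4), repeat=len(edge_matrices)):
        total = 0.0
        for root in range(4):
            term = 0.25  # uniform root distribution
            for matrix, leaf in zip(edge_matrices, pattern):
                term *= matrix[root][leaf]
            total += term
        joint["".join(BASES[b] for b in pattern)] = total
    return joint


def mix(weights, joints):
    """Mixture distribution: weighted sum of the component joint distributions."""
    return {key: sum(w * j[key] for w, j in zip(weights, joints)) for key in joints[0]}


def invariant_residual(joint):
    """Value of the linear form p(AAC) - p(AAG); zero under Jukes-Cantor."""
    return joint["AAC"] - joint["AAG"]


# Two Jukes-Cantor components with different (arbitrary) edge parameters.
jc_a = star_tree_joint([jc_matrix(0.1), jc_matrix(0.2), jc_matrix(0.3)])
jc_b = star_tree_joint([jc_matrix(0.05), jc_matrix(0.4), jc_matrix(0.15)])
jc_mixture = mix([0.3, 0.7], [jc_a, jc_b])

# One Kimura 2-parameter tree with transitions more likely than transversions.
k80 = star_tree_joint([k80_matrix(0.2, 0.05)] * 3)

print("JC single tree:", invariant_residual(jc_a))
print("JC mixture:    ", invariant_residual(jc_mixture))
print("K80 tree:      ", invariant_residual(k80))
```

The residual is zero up to floating-point error for both the single Jukes-Cantor tree and the mixture, and clearly nonzero for the Kimura example. With real alignments the joint probabilities would have to be estimated from observed pattern frequencies, so any test built on such residuals would need to account for sampling noise.
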
Phylogenetic networks aim to represent the evolutionary history of taxa. Within these, reticulate networks are explicitly able to accommodate evolutionary events like recombination, hybridization, or lateral gene transfer. Although several metrics exist to compare phylogenetic networks, they make several assumptions regarding the nature of the networks that are not likely to be fulfilled by the evolutionary process. In order to characterize the potential disagreement between the algorithms and the biology, we have used the coalescent with recombination to build the type of networks produced by reticulate evolution and classified them as regular, tree sibling, tree child, or galled trees. We show that, as expected, the complexity of these reticulate networks is a function of the population recombination rate. At small recombination rates, most of the networks produced are already more complex than regular or tree sibling networks, whereas with moderate and large recombination rates, no network fits into any of the standard classes. We conclude that new metrics still need to be devised in order to properly compare two phylogenetic networks that have arisen from a reticulating evolutionary process.
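
As an illustration of what membership in two of these classes involves, the sketch below checks the tree-child and tree-sibling conditions on a rooted network stored as a dictionary from each node to its list of children. It relies on the commonly used definitions, which are not spelled out in the text above and are therefore assumptions here: a hybrid node is a node with two or more parents; a network is tree child if every internal node has at least one non-hybrid child, and tree sibling if every hybrid node has at least one non-hybrid sibling. The function names and the two toy networks are hypothetical, and this is not the classification code used in the study.

```python
def parents_of(children):
    """Invert a node -> children mapping into a node -> parents mapping."""
    parents = {node: [] for node in children}
    for node, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, []).append(node)
    return parents


def hybrid_nodes(children):
    """Hybrid (reticulation) nodes are those with two or more parents."""
    return {node for node, ps in parents_of(children).items() if len(ps) >= 2}


def is_tree_child(children):
    """True if every internal node has at least one child that is not a hybrid node."""
    hybrids = hybrid_nodes(children)
    return all(
        any(kid not in hybrids for kid in kids)
        for kids in children.values()
        if kids  # leaves have no children and are skipped
    )


def is_tree_sibling(children):
    """True if every hybrid node has at least one sibling that is not a hybrid node."""
    parents = parents_of(children)
    hybrids = {node for node, ps in parents.items() if len(ps) >= 2}
    for hybrid in hybrids:
        siblings = {s for p in parents[hybrid] for s in children[p] if s != hybrid}
        if not any(s not in hybrids for s in siblings):
            return False
    return True


# Toy network with a single hybrid node h: satisfies both conditions.
one_hybrid = {
    "r": ["a", "b"],
    "a": ["x", "h"],
    "b": ["h", "y"],
    "h": ["z"],
    "x": [], "y": [], "z": [],
}

# Toy network in which node c has only hybrid children (h1 and h2):
# not tree child, but each hybrid still has a non-hybrid sibling.
two_hybrids = {
    "r": ["a", "s"],
    "s": ["c", "b"],
    "a": ["x", "h1"],
    "c": ["h1", "h2"],
    "b": ["h2", "y"],
    "h1": ["z1"],
    "h2": ["z2"],
    "x": [], "y": [], "z1": [], "z2": [],
}

for name, net in [("one_hybrid", one_hybrid), ("two_hybrids", two_hybrids)]:
    print(name, "tree child:", is_tree_child(net), "tree sibling:", is_tree_sibling(net))
```

The second toy network shows how a single additional reticulation can move a network out of the tree-child class, in line with the observation that networks leave the standard classes as recombination becomes more frequent.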