Co-Evolution is an extension of a Mutation-Selection codon model, which considers pairs of codons.
The model adds an amino-acid correlation matrix \(\delta\), which expresses how “linked” two amino acid states are within a pair.
For example, \(\delta_{A, K} > 1\) means that \(AK\) is a favoured pair, while \(\delta_{W, Y} < 1\) means \(WY\) is disfavoured. \(\delta = 1\) signifies lack of association.
The selection coefficient is calculated for a pair of sites, assuming additive fitness.
For a substitution from codon pair \(ij\) to codon pair \(kl\):
\[ s_{ij, kl} = (\psi_k + \psi_l)\delta_{kl} - (\psi_i + \psi_j)\delta_{ij} \\ Q_{ij, kl} = 2N_e \mu_{ik} \mu_{jl} \frac{1-e ^ {s_{ij, kl}}}{1-e^{-2 N_e s_{ij, kl}}} \] Only codons differing in one nucleotide positions are considered. Rates of transitions for multiple codon differences are \(0\).
Load model parameters:
# Load phylogeny
mammals <- Phylogeny("../data/mammals.newick")
# Load nucleotide equilibrium distribution
# Nucleotide order is T,C,A,G
nuc_pi <- c(.3, .2, .25, .25)
# Load amino acid fitness vector
# Amino acid order is:
# A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y
aa_psi <- as.matrix(read.csv("../data/amino_acid_fitness_N_1000.csv"))
# Create a "flat" delta matrix - all pairs are independent
delta <- matrix(rep(1, 20 * 20), nrow = 20)
Create model:
# Create nucleotide substitution model
hky <- HasegawaKishinoYano(nuc_pi)
# Create codon pair substitution model
co <- CoEvolution(
population_size = 1000,
mutation_rate = 1e-8,
nucleotide_mode = hky,
fitness_1 = aa_psi[1,],
fitness_2 = aa_psi[2,],
delta = delta)
Sample sequence from model equilibrium distribution to start at root:
s <- sample_sequence(model = co, length = 100)
Simulate:
sim <- simulate_over_phylogeny(
phylogeny = mammals,
model = co,
sequence = s)
Plot substitution history:
plot(sim)
Synonymous substiutions are shown in green, non-synonymous in red.
Check substitution history table:
head(sim$substitutions)
site | node | time | from | to | pair_from | pair_to | pair_amino_acid_from | pair_amino_acid_to | first_synonymous | second_synonymous | first_color | second_color |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 66 | 0.2115673 | 3288 | 2800 | GAT,GAA | GTT,GAA | D,E | V,E | FALSE | TRUE | #E84B2A | #16A35E |
3 | 45 | 0.0371289 | 1338 | 850 | CAT,GGT | CTT,GGT | H,G | L,G | FALSE | TRUE | #E84B2A | #16A35E |
4 | 37 | 0.0562539 | 3359 | 3372 | GAA,TCT | GAA,CCT | E,S | E,P | TRUE | FALSE | #16A35E | #E84B2A |
10 | 24 | 0.0287241 | 1193 | 1254 | CCA,ACC | CCG,ACC | P,T | P,T | TRUE | TRUE | #16A35E | #16A35E |
10 | 61 | 0.0048337 | 1193 | 3145 | CCA,ACC | GCA,ACC | P,T | A,T | FALSE | TRUE | #E84B2A | #16A35E |
10 | 66 | 0.0363491 | 1193 | 2169 | CCA,ACC | ACA,ACC | P,T | T,T | FALSE | TRUE | #E84B2A | #16A35E |
To export FASTA
sequence:
as.fasta(sim$alignment, file = "palantir_sim.fa")
Show alignment in a scrollable viewer:
plot(sim$alignment)
de Koning Lab. 2016-2017 lab.jasondk.io