Mutation Selection (MS) model (1, 2) is the core model implemented in PalantiR
. Mutation Selection is a codon substitution model that describes the interplay between mutation rate, selection strength, and population size.
For a substitution from codon \(i\) to codon \(j\): \[ s_{ij} = \psi_j - \psi_i \\ Q_{ij} = 2N_e \mu_{ij} \frac{1-e^{s_{ij}}}{1-e^{-2N_es_{ij}}} \] Only codons differing in one nucleotide positions are considered. Rates of transitions for multiple codon differences are \(0\).
The substitution rates are also scaled to assign particular meaning to branch length of a phylogeny. For more detail, see Rate Scaling.
Load model parameters:
# Load phylogeny
mammals <- Phylogeny("../data/mammals.newick")
# Load nucleotide equilibrium distribution
# Nucleotide order is T,C,A,G
nuc_pi <- c(.3, .2, .25, .25)
# Load amino acid fitness vector
# Amino acid order is:
# A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y
aa_psi <- as.matrix(read.csv("../data/amino_acid_fitness_N_1000.csv"))
Create model:
# Create nucleotide substitution model
hky <- HasegawaKishinoYano(nuc_pi)
# Create codon substitution model
ms <- MutationSelection(
population_size = 1000,
mutation_rate = 1e-8,
nucleotide_model = hky,
fitness = aa_psi[1,])
Sample sequence from model equilibrium distribution to start at root:
s <- sample_sequence(model = ms, length = 100)
Simulate:
sim <- simulate_over_phylogeny(
phylogeny = mammals,
model = ms,
sequence = s)
Plot substitution history:
plot(sim, sites = 0:10)
Synonymous substiutions are shown in green, non-synonymous in red.
Check substitution history table:
head(sim$substitutions)
site | node | time | from | to | codon_from | codon_to | amino_acid_from | amino_acid_to | synonymous | color |
---|---|---|---|---|---|---|---|---|---|---|
0 | 7 | 0.0077853 | 39 | 38 | AAA | AAC | K | N | FALSE | #E84B2A |
0 | 8 | 0.0408456 | 38 | 9 | AAC | TAC | N | Y | FALSE | #E84B2A |
0 | 23 | 0.1364336 | 38 | 22 | AAC | CAC | N | H | FALSE | #E84B2A |
0 | 25 | 0.0172276 | 39 | 38 | AAA | AAC | K | N | FALSE | #E84B2A |
0 | 25 | 0.0749027 | 38 | 42 | AAC | AGC | N | S | FALSE | #E84B2A |
0 | 25 | 0.0794302 | 42 | 43 | AGC | AGA | S | R | FALSE | #E84B2A |
To export FASTA
sequence:
as.fasta(sim$alignment, file = "palantir_sim.fa")
Show alignment in a scrollable viewer:
plot(sim$alignment)
de Koning Lab. 2016-2017 lab.jasondk.io