PAML demo website: http://myweb.dal.ca/js551958/PAML_lab/lab.html
Full details of the PAML demo are on the website above; start there before continuing.
For those interested, a more in-depth description of how to modify tree files for branch isolation (similar to exercise #3) can be found here.
Alternatively, Support Protocol 3 within UNIT 6.15 (Current Protocols in Bioinformatics) is devoted to labelling the foreground branch of a Newick tree. The entire unit (including sequence and tree files) can be obtained here: Bitbucket repository
For the lab
Log into the cluster
To run codeml:
codeml <name of control file>
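Once codeml has finished, you will usually want the log-likelihood from its main output file. The following is a minimal Python sketch of pulling that value out of the text; it is not part of the lab. It assumes the output contains a line of the form `lnL(ntime: 11  np: 13):  -1113.460610      +0.000000`. The function name `parse_lnl` and the sample text are hypothetical, and the numbers in the sample are made up.

```python
import re

# Pattern for the log-likelihood line in codeml's main output file.
# Assumption: the line looks like
#   lnL(ntime: 11  np: 13):  -1113.460610      +0.000000
LNL_LINE = re.compile(r"lnL\(ntime:\s*(\d+)\s+np:\s*(\d+)\):\s*(-?\d+\.\d+)")


def parse_lnl(text):
    """Return a list of (np, lnL) pairs, one per lnL line found in text."""
    return [(int(m.group(2)), float(m.group(3))) for m in LNL_LINE.finditer(text)]


# Hypothetical excerpt of an output file; the numbers are made up.
sample = "some header\nlnL(ntime: 11  np: 13):  -1113.460610      +0.000000\n"
print(parse_lnl(sample))
```

To use it on a real run, read the output file named in your control file into a string and pass it to `parse_lnl`. If the list comes back empty, check the file by eye, since the line format assumed here may not match your version.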
Plotting for Exercise 1
Here are two ways you might plot your results once you've collected them.
In a spreadsheet program such as Excel:
Column A --> omegas
Column B --> lnL values
Navigate to: Insert --> Charts --> Scatter (choose lines)
Right-click on the X-axis (omegas) --> Format Axis --> Logarithmic scale
In R:
x_vector <- c(your parameter values, separated by commas)
For example: x_vector <- c(.005,.006,.007,.008)
y_vector <- c(your likelihoods, separated by commas)
plot(x = x_vector, y = y_vector, log = "x", type = "b")
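Alongside the plot, it can help to tabulate which omega gave the highest likelihood and how far each of the others falls below it. The Python sketch below does only that. The function name `rank_by_likelihood` is hypothetical, and the lnL values in the example are made-up placeholders, not results from the lab data.

```python
def rank_by_likelihood(omegas, lnls):
    """Pair each omega with its lnL and return (best_omega, rows).

    rows is a list of (omega, lnL, lnL - max lnL) tuples sorted by omega,
    so the best-supported omega has a difference of exactly 0.0.
    """
    if len(omegas) != len(lnls):
        raise ValueError("need exactly one lnL per omega value")
    if not omegas:
        raise ValueError("no values supplied")
    best_lnl = max(lnls)
    rows = sorted((w, l, l - best_lnl) for w, l in zip(omegas, lnls))
    # On ties, max() keeps the first pair it encounters.
    best_omega = max(zip(omegas, lnls), key=lambda pair: pair[1])[0]
    return best_omega, rows


# Placeholder inputs: replace both lists with your own values.
omegas = [0.005, 0.006, 0.007, 0.008]
lnls = [-906.2, -905.9, -906.0, -906.4]  # made-up numbers for illustration
best, rows = rank_by_likelihood(omegas, lnls)
print("highest lnL at omega =", best)
for omega, lnl, diff in rows:
    print(omega, lnl, diff)
```

This only picks the best value among the omegas you actually tried. A finer grid around that value, or letting codeml estimate omega itself, gives a more precise answer.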
The following is not part of the exercises, but you will need it to run these kinds of analyses on your own datasets.
To fit codon models, you need a codon alignment (i.e. a nucleotide alignment that, when translated, corresponds perfectly to a good amino acid alignment). You can't just align nucleotides because most alignment programs don't look for codons, so the program may put gap characters in the middle of codons, throwing the sequences out of frame. You have to align the amino acids, and then force your unaligned nucleotides to fit that alignment.
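To make the frame problem concrete, here is a small Python sketch that checks whether an aligned nucleotide sequence keeps its gaps in whole-codon units. It assumes `-` is the gap character and that the alignment starts in frame at position 1. The function name `out_of_frame_codons` is hypothetical.

```python
def out_of_frame_codons(aligned_seq, gap="-"):
    """Return 0-based codon indices whose triplet mixes gaps and bases.

    In a proper codon alignment every triplet is either three bases or
    three gap characters, so the returned list should be empty.
    """
    if len(aligned_seq) % 3 != 0:
        raise ValueError("alignment length is not a multiple of three")
    bad = []
    for i in range(0, len(aligned_seq), 3):
        n_gaps = aligned_seq[i:i + 3].count(gap)
        if n_gaps not in (0, 3):
            bad.append(i // 3)
    return bad


# A gap that spans one whole codon keeps the frame intact.
print(out_of_frame_codons("ATG---GCT"))
# A single-base gap breaks every codon downstream of it.
print(out_of_frame_codons("ATG-GCT--"))
```

An empty list only shows that the gaps respect codon boundaries. It does not show that the underlying amino acid alignment is a good one.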
We have shown you how to translate the nucleotides in SeaView and align amino acids using MAFFT or MUSCLE, but not how to get back to codons. The PAL2NAL server is good for this. You will upload your protein alignment and nucleotide sequences, and it will spit out the codon alignment. Please be aware that each raw nucleotide sequence must have a length that is a multiple of three (i.e. a full open reading frame) and must be free of stop codons.
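For intuition about what this step does, the Python sketch below threads one unaligned nucleotide sequence onto its aligned protein sequence, after checking the two requirements just mentioned. It illustrates the idea only and is not PAL2NAL's implementation. It assumes the standard genetic code for stop codons and `-` as the gap character, and the function name `thread_codons` is hypothetical.

```python
# Stop codons under the standard genetic code (an assumption; other
# genetic codes use different sets).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def thread_codons(aligned_protein, nucleotides, gap="-"):
    """Return the nucleotides laid out to match an aligned protein sequence.

    Each residue is replaced by the next codon in order, and each gap in
    the protein becomes three gap characters.
    """
    nucleotides = nucleotides.upper()
    if len(nucleotides) % 3 != 0:
        raise ValueError("nucleotide length is not a multiple of three")
    codons = [nucleotides[i:i + 3] for i in range(0, len(nucleotides), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("sequence contains a stop codon")
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if n_residues != len(codons):
        raise ValueError("protein and nucleotide sequences differ in length")
    remaining = iter(codons)
    return "".join(gap * 3 if aa == gap else next(remaining)
                   for aa in aligned_protein)


# The protein alignment has a gap between M and A.
print(thread_codons("M-A", "atggct"))
```

The sketch does not verify that each codon actually translates to the residue it is paired with, so it will not catch a mismatched protein and nucleotide pair of the same length.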