Clarification concerning MACSE alignment strategy
Vincent Ranwez (lecturer), Sebastien Harispe, Frederic Delsuc, Emmanuel Douzery
We would like to point out that, contrary to what is reported in the paper by Zhang and colleagues, our multiple sequence alignment program MACSE (Ranwez et al. 2011) does not use "a consecutive three-step approach: (i) translation of nucleotide sequences into protein sequences; (ii) alignment of protein sequences; and (iii) alignment of DNA sequences according to protein alignment". As emphasized in Ranwez et al. (2011) and by the authors of the present paper, "this three-step alignment approach is highly sensitive to frameshift and non-sense mutations". MACSE was developed precisely to provide the first multiple sequence alignment program able to align coding sequences based on their amino acid (AA) translations while accounting for frameshifts. To use Zhang et al.'s formulation, MACSE tackles this problem by "using a 'codon-based alignment' approach where DNA sequences are directly aligned while also considering codon information". This property makes MACSE a particularly useful tool for generating codon alignments that are subsequently used to estimate dN/dS or Ka/Ks, as recently exemplified in Drosophila genomes (Assis et al. 2012). We hope that this comment will help inform the authors and readers of Bioinformatics by clarifying the specificities of our program relative to alternative approaches.
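To illustrate why the three-step approach is sensitive to frameshifts, the short Python sketch below translates a toy coding sequence before and after a single-nucleotide deletion. The sequences, the function name `translate`, and the `!` marker are hypothetical choices made for this illustration; the position of the frameshift is supplied by hand, so this is not MACSE's algorithm, only a picture of the problem it addresses.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate seq in reading frame 0, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Toy coding sequence (hypothetical) and a copy that has lost one nucleotide
# (index 4, inside the second codon).
INTACT = "ATGGCTAAACGTGAATTCTGGCAT"
SHIFTED = INTACT[:4] + INTACT[5:]

# Step (i) of the three-step approach: blind translation in frame 0.
# Every residue downstream of the deletion changes, so the protein
# alignment of step (ii) no longer reflects the true homology.
print(translate(INTACT))
print(translate(SHIFTED))

# If the frameshift position is known, switching reading frame after the
# damaged codon recovers the downstream residues. "!" is an arbitrary
# marker standing in for the incomplete codon.
frame_aware = translate(SHIFTED[:3]) + "!" + translate(SHIFTED[5:])
print(frame_aware)
```

In this toy case only the first residue is shared between the two frame-0 translations, whereas the frame-aware reading recovers all six residues downstream of the deletion.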
References:
Assis, R. et al. (2012) Sex-biased transcriptome evolution in Drosophila. Genome Biol Evol, 4, 1189-200.
Ranwez, V. et al. (2011) MACSE: multiple alignment of coding SEquences accounting for frameshifts and stop codons. PLoS ONE, 6, e22594. (http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0022594)
Conflict of Interest: We are the authors of the paper describing the MACSE program.
Response to "Clarification concerning MACSE alignment strategy"
Chuanzhu Fan (Assistant Professor), Chengjun Zhang, Jun Wang, Manyuan Long
Wayne State University
Dear Vincent Ranwez,
Thank you for your comments on our recent publication and for pointing out the miscitation of your paper. We acknowledge that, in our view, MACSE is the first multiple sequence alignment program that can automatically handle frameshift mutations and premature stop codons using a codon-based approach. By "using the three reading frames alternatively switching from one to the other at each frameshift", MACSE avoids the pitfalls of the traditional three-step approach, namely (1) translation of nucleotides to amino acids, (2) alignment of amino acids, and (3) alignment of nucleotides according to the amino acid alignment. In our experience, MACSE is particularly useful for aligning gene families containing pseudogenes whose gene structure is clearly defined. Owing to the page limit of our article in Bioinformatics, we were not able to discuss MACSE in more detail, and we miscited your paper.
We would also like to point out other aspects of the gKaKs pipeline that differ from MACSE, and we conclude that gKaKs is a useful tool for genome-wide sequence alignment and subsequent substitution rate tests.
1. gKaKs can automatically identify genome-wide orthologous sequences from an un-annotated genome and align them against the coding sequences from a reference genome. gKaKs doesn't need to know the coding structure and annotation of the un-annotated genome sequences.
2. gKaKs can deal with the homologous genes that have only partial sequences originating from a common ancestor.
3. gKaKs can automatically compute Ka, Ks, Ka/Ks values based on the alignments after dealing with frame-shift mutations and premature stop codons.
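As a rough illustration of what a substitution rate computation starts from, the Python sketch below counts synonymous and nonsynonymous codon differences between two codon-aligned sequences. It is a simplified sketch under stated assumptions, not the method used by gKaKs: the function name `count_codon_differences` and the toy sequences are hypothetical, and actual Ka and Ks estimators additionally normalize by the number of synonymous and nonsynonymous sites and correct for multiple substitutions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both inputs are assumed to be codon-aligned, in frame, and of equal
    length. Codons containing gaps or ambiguous bases are skipped.
    """
    synonymous = nonsynonymous = 0
    length = min(len(seq_a), len(seq_b))
    for i in range(0, length - 2, 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base: no comparison possible
        if codon_a == codon_b:
            continue  # identical codons carry no substitution
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1  # codons differ, amino acid is unchanged
        else:
            nonsynonymous += 1  # codons differ, amino acid changes
    return synonymous, nonsynonymous


# Toy example (hypothetical sequences): GCT -> GCC leaves Ala unchanged,
# AAA -> AGA changes Lys to Arg.
print(count_codon_differences("ATGGCTAAACGT", "ATGGCCAGACGT"))
```

Such counts are only meaningful when the alignment preserves the reading frame, which is why frameshifts and premature stop codons have to be handled before this step.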
We hope this response emphasizes the contribution of MACSE to multiple sequence alignment in the presence of frameshift mutations and nonsense codons, and also distinguishes the usage of gKaKs from that of other similar programs.