Can this pipeline be used for genomes with a distant evolutionary relationship? Which of the analyzed genomes fall into this category?
We tried to answer this question by running the pipeline on data from four genomes.
|Species||Family||Order||blat result size, default option (Mb)||add blat_option = "-minIdentity=30"||add blat_option = "-minIdentity=30 -t=dnax -q=dnax"||#bl2seq_option = ""||#bl2seq_option = "-q -2" / "-r 2" and min_length=50 min_identity=0.6||Divergence time|
|D.mel (CDS)||Drosophilidae||Diptera||/||/||/||4578 (mRNA)||/||/|
|D.sim (v1.4)||Drosophilidae||Diptera||23M||/||/||3958 results (3957 are valuable*, use D.mel chr4)||4147 results (4094 are valuable, use D.mel chr4)||~3 MY|
|D.pse (v2.29)||Drosophilidae||Diptera||13M||27M||/||397 results (389 are valuable, use D.mel chr4)||655 results (493 are valuable, use D.mel chr4)||~26.5 MY|
|Apis mellifera (v2.0.15 not rm)||Apidae||Hymenoptera||7.5M||12M||82M||6 results (only one is valuable) with default option (use D.mel chr4)||239 (33 are valuable)||>>60 MY|
|Aedes aegypti (v1.15 not rm)||Culicidae||Diptera||4.9M||9.4M||92M||8 results with default option (use D.mel chr4)||169 (93 are valuable)||>>60 MY|
* "valuable" means that Ka and Ks are numbers, not "nan"
# these data are from the blat result obtained without any option
From this table, we conclude that this pipeline can work for species that diverged less than 30 MY ago, and that it should be well suited to genomes that diverged less than 10 MY ago.
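The trend behind this conclusion can be made explicit by computing the share of valuable results per species. The following is only a minimal sketch: the counts are copied from the relaxed bl2seq column of the table, while the names `RELAXED`, `is_valuable` and `valuable_fraction` are hypothetical helpers introduced for illustration and are not part of the pipeline.

```python
import math

# (total results, valuable results), copied from the relaxed bl2seq column
# of the table above. The dictionary itself is an illustrative assumption.
RELAXED = {
    "D.sim": (4147, 4094),
    "D.pse": (655, 493),
    "Apis mellifera": (239, 33),
    "Aedes aegypti": (169, 93),
}


def is_valuable(ka, ks):
    """Return True when both Ka and Ks are real numbers rather than nan."""
    return not (math.isnan(ka) or math.isnan(ks))


def valuable_fraction(total, valuable):
    """Share of results that are valuable; 0.0 when there are no results."""
    return valuable / total if total else 0.0


if __name__ == "__main__":
    for species, (total, valuable) in RELAXED.items():
        share = valuable_fraction(total, valuable)
        print(f"{species}: {valuable}/{total} valuable ({share:.1%})")
```

Since `float("nan")` parses the text "nan" in Python, `is_valuable(float(ka_text), float(ks_text))` could be applied to values read from a result file, assuming those fields are stored as plain text.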
If you want to try this pipeline on genomes with a distant evolutionary relationship, you should add the option blat_option = "-minIdentity=30 -t=dnax -q=dnax", add either bl2seq_option = "-q -2" or bl2seq_option = "-r 2", and lower the min_length and min_identity options.