diamond blastp produces longer alignments with lower identity than ncbi blast

biobiu

New member
Hi all,
I'm using diamond blastp as a (great) alternative to blastp, mainly to save time.
Although I tried to keep the parameters of the two runs as similar as possible, I find that in most cases NCBI blastp prefers a shorter alignment with higher identity, whereas diamond extends the alignment at the cost of many mismatches (see commands below).
Here is a representative example:
NCBI blastp aligns 20 aa with 16 identities (80%). Diamond extends this alignment on both sides, resulting in an alignment of 57 aa with 33 identities (57%) that includes several runs of consecutive mismatches. This does not seem to be specific to this one alignment, because the histograms of %identity and alignment length also look different between the two tools.
I'm pretty sure it is related to the -ungapped flag in NCBI blastp.
So, any idea whether there is an equivalent of the -ungapped flag in diamond? Is there a way to make it prefer fewer mismatches and shorter alignments?
Commands:

blastp -query prot.faa -db blastdb -out blast.txt -matrix PAM30 \
-ungapped -comp_based_stats F -window_size 0 -xdrop_ungap 1 \
-evalue 1e-3

diamond blastp --query prot.faa --db diamond_db.dmnd \
--out diamond.txt --matrix PAM30 --comp-based-stats 0 \
--window 0 --evalue 1e-3
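
For anyone who wants to reproduce the histogram comparison, here is a rough sketch of how the two outputs could be summarized. It assumes both tools were rerun with tabular output (`-outfmt 6` for blastp, `--outfmt 6` for diamond), which the commands above do not set, and that the default column order applies (%identity in column 3, alignment length in column 4). The function name and the bin width are only illustrative.

```python
from collections import Counter


def summarize_tabular(lines, bin_width=10):
    """Bin %identity and average the alignment length of tabular (outfmt 6) hits.

    Assumes the default 12-column layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    identity_bins = Counter()
    lengths = []
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        pident = float(fields[2])
        length = int(fields[3])
        # Lower edge of the identity bin; 100% is folded into the top bin.
        low = min(int(pident // bin_width) * bin_width, 100 - bin_width)
        identity_bins[low] += 1
        lengths.append(length)
    mean_length = sum(lengths) / len(lengths) if lengths else 0.0
    return dict(identity_bins), mean_length
```

Calling it once on the lines of each tabular output file and comparing the two results should show whether the shift towards longer, lower-identity alignments holds across the whole result set.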

Thank you!
 

Benjamin Buchfink

Administrator
Staff member
Diamond does not use an x-drop-like algorithm for extension, so you can't really make it prefer shorter alignments that way. You would have to tweak the scoring matrix and set higher penalties for mismatches. Support for custom scoring matrices and ungapped-only alignments is not officially available at the moment, but I will see what I can do about that.
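
To make the x-drop point concrete, here is a minimal sketch of what an x-drop criterion does during ungapped extension. It is a simplified illustration, not the actual BLAST or DIAMOND code: the function name is made up, the match/mismatch scores of +2/-3 are hypothetical stand-ins for a real matrix such as PAM30, and the threshold is in raw score units.

```python
def ungapped_xdrop_extend(query, subject, xdrop, match=2, mismatch=-3):
    """Extend an ungapped alignment rightwards from position 0 of both sequences.

    The extension stops as soon as the running score falls more than
    `xdrop` below the best score seen so far. Returns (length, score)
    of the best-scoring prefix.
    """
    score = 0
    best_score = 0
    best_len = 0
    for i, (q, s) in enumerate(zip(query, subject)):
        # Hypothetical flat scoring instead of a substitution matrix.
        score += match if q == s else mismatch
        if score > best_score:
            best_score = score
            best_len = i + 1
        elif best_score - score > xdrop:
            # Too far below the best score: give up extending.
            break
    return best_len, best_score


# Four matches, two mismatches, then four more matches.
print(ungapped_xdrop_extend("ACDEFGHIKL", "ACDEWWHIKL", xdrop=1))   # (4, 8)
print(ungapped_xdrop_extend("ACDEFGHIKL", "ACDEWWHIKL", xdrop=10))  # (10, 10)
```

With a small threshold the extension stops at the first mismatch and reports the short 100%-identity segment. With a large threshold it pushes through the mismatches and reports the longer alignment at 80% identity, which is the kind of difference described in the question.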