taxon mapping with alternate databases?


New member
I'm looking at alternate databases for DIAMOND. I saw in a previous thread on GitHub that you recommended UniRef50 as a potentially faster database. If using a database other than NCBI nr, is it possible to create alternate taxonmap and taxonnodes files? UniRef50 encodes some taxonomy information in the FASTA headers. Is it possible to get DIAMOND to use that information?
Thank you!

Benjamin Buchfink

Staff member
It is possible, but it requires some manual labour. You would have to create a mapping file from protein accessions to taxon ids in the same format as the NCBI file (prot.accession2taxid), only for UniRef50.
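As a rough sketch of that manual step, the Python below reads UniRef50 FASTA headers, which carry a `TaxID=` field, and writes a four-column tab-separated file laid out like NCBI's prot.accession2taxid (accession, accession.version, taxid, gi). The function names and file paths are hypothetical. It also assumes that DIAMOND matches the accession columns against the first word of each FASTA header (for example `UniRef50_A0A009`) and that a placeholder gi of 0 is acceptable; both assumptions should be checked against the DIAMOND version in use.

```python
import re
from typing import Iterable, Iterator, Tuple

# UniRef headers look like:
# >UniRef50_A0A009 Cluster: ... n=3 Tax=Acinetobacter TaxID=469 RepID=...
TAXID_RE = re.compile(r"\bTaxID=(\d+)")


def iter_accession_taxids(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (accession, taxid) for every FASTA header that has a TaxID field."""
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        parts = line[1:].split(None, 1)
        if not parts:
            continue  # bare ">" with no identifier
        match = TAXID_RE.search(line)
        if match:
            yield parts[0], match.group(1)


def format_taxonmap(pairs: Iterable[Tuple[str, str]]) -> Iterator[str]:
    """Yield rows in the prot.accession2taxid column layout."""
    yield "accession\taccession.version\ttaxid\tgi"
    for accession, taxid in pairs:
        # Assumption: the same id is reused for both accession columns,
        # and 0 stands in for the gi number, which UniRef does not have.
        yield f"{accession}\t{accession}\t{taxid}\t0"


def write_taxonmap(fasta_path: str, out_path: str) -> None:
    """Stream a FASTA file and write the mapping without loading it all into memory."""
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        for row in format_taxonmap(iter_accession_taxids(fasta)):
            out.write(row + "\n")


# Example call (hypothetical file names):
# write_taxonmap("uniref50.fasta", "uniref50.accession2taxid")
```

Headers without a `TaxID=` field are skipped, so those sequences would simply have no taxonomy assigned. The taxon ids in UniRef headers are NCBI taxonomy ids, so the standard NCBI nodes.dmp should be usable for the taxonnodes file without modification.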

I will add an integrated way of doing this to my to-do list, but it may take some time.