Search results

  1. B

    Human metagenomic data, two-stage or one-stage approach?

    Feel free to share your findings! I'm not an expert on this so I'll hold off on giving a recommendation.
  2. B

    sensitivity/accuracy of fast then sensitive search

    I haven't tried this approach and don't really have any numbers for it. It should work quite well when you only want the best hit for each query, though it also depends on the data. If, say, >80% of your queries have best hits with >60% identity, this should work well and improve performance...
  3. B

    Compiling DIAMOND from source for Windows?

    You can download an executable here: I compile them using MS Visual Studio.
  4. B

    species name output

    You need to use the latest commit for this. Clone the diamond repository like this: git clone Then compile from source as described in the manual.
  5. B

    species name output

    It has been added recently, please see here:
  6. B

    --taxonlist error

    --taxonlist takes a list of numerical values (taxon ids). So for searching all bacterial proteins, use --taxonlist 2. Btw, --taxonnodes and --taxonmap need only be added to makedb.
  7. B

    taxon mapping with alternate databases?

    It is possible but requires some manual labour. You would have to create a mapping file from protein accessions to taxon ids in the same format as the NCBI file, only for Uniref50. I will put it on my todo list to provide an integrated way for this but it may take some time.
  8. B

    Joining output blocks...seg fault error

  9. B

    Joining output blocks...seg fault error

    Send to
  10. B

    Joining output blocks...seg fault error

    No, certainly not, unless there's nothing in there that matches the database in any way. Could you upload your data file or a small part of it so I can check it myself?
  11. B

    Joining output blocks...seg fault error

    Can you tell me what version of diamond you are using and your OS?
  12. B

    qseq and sseq in -outfmt 6 are not aligned

    Sorry it took a bit longer. In the latest commit I have added the fields qseq_gapped and sseq_gapped.
  13. B

    Diamond makedb error: Error: Invalid nodes.dmp file format.

    You need to unpack the first. The parameter to the --taxonnodes option needs to be the nodes.dmp file contained within the
  14. B

    diamond Error: Inflate error

    You have to use the -b parameter when calling diamond blastp or diamond blastx, not diamond makedb.
  15. B

    diamond Error: Inflate error

    -c and -b are both options for diamond blastx. You should not change -c, just use a lower number for -b as recommended above.
  16. B

    diamond Error: Inflate error

    Try to use a lower number for the block size, like this: -b1
  17. B

    diamond Error: Inflate error

    This indicates an error in the input file. Run gzip -tv nr.gz to test the integrity of the file; if that shows an error, try to download the file again. Also, the parameter of the --taxonnodes option needs to be the nodes.dmp file contained in the file, not that file itself.
  18. B

    CPU time limit exceeded error

    This appears to be an error from your execution environment, not from Diamond. I assume you are running Diamond in a cluster environment? You need to check with your cluster manual and admins how to allocate more CPU time to the job.
  19. B

    Expected runtime?

    This may take a couple of hours against the NR as Diamond is not that efficient when the number of queries is small. When aligning contigs it would be useful to use the --long-reads option. If your server has a lot of memory, a good way to speed it up is to use a higher block size like this...
  20. B

    qseq and sseq in -outfmt 6 are not aligned

    Hi Romain, no problem, I will add that feature, but probably make it a new output field instead of changing the behaviour of the current ones, otherwise it will break the compatibility with previous Diamond versions. Best regards, Benjamin
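
The integrity check suggested in the "Inflate error" thread (gzip -tv nr.gz) can also be done from a script, for example before starting a long makedb job. The following is a minimal Python sketch of the same idea, not part of Diamond: the function name gzip_is_intact and the chunk size are assumptions made for illustration. It reads the whole compressed stream and reports whether decompression completes without error.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses completely.

    Hypothetical helper (not a Diamond function). It mirrors what
    `gzip -t` checks: the stream must decode to the end and pass the
    trailing CRC/length check.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data in chunks so that large files
            # such as nr.gz do not have to fit in memory.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers and CRC mismatches (BadGzipFile),
        # EOFError covers truncated downloads, zlib.error covers
        # corrupted deflate data.
        return False
    return True
```

If the function returns False for a downloaded database file, downloading the file again, as recommended in that thread, is the likely fix.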