variant calling


We’d like to process large metagenomics datasets (Illumina) against a custom DB made of selected bacterial genes (we have both the nucleotide sequences and their protein translations) to identify non-synonymous mutations at the amino acid level. Since DIAMOND can take FASTQ files as input and produce SAM output, we thought this would be easier than the classical nucleotide mapping pipeline (because we don’t have a GFF file to annotate the VCF).
However, we are unsure how to filter variants from the SAM file based on quality and coverage. Is there any downstream option for protein-level mutations, like the GATK/FreeBayes methods for nucleotide mapping?