For pVCF best practices, how do I reference the GRCh38 FASTA? The documentation gives: bcftools norm -m - -f <reference> -Oz -o <normVCF> <inputVCF> as the call to use, but I do not know how to specify the path to grc38.reference.fasta.
Comments
4 comments
Are you referring to this doc page? https://dnanexus.gitbook.io/uk-biobank-rap/science-corner/whole-exome-sequencing-oqfe-protocol/protocol-for-processing-ukb-whole-exome-sequencing-data-sets#conversion-of-pvcf-to-plink-and-bgen-files
Is the mentioned "grc38.reference.fasta" located in any of your UKB RAP projects? What have you already tried to load it? Or what is the particular command you would like to implement?
There are basically two options:
1) download "grc38.reference.fasta" onto a worker via "dx download"
2) read it via dxfuse, i.e. /mnt/project/...
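As a rough sketch of how the two options change the -f argument, the snippet below only prints the resulting bcftools call for each case rather than running it. The helper norm_cmd, the VCF file names, and the <path-in-project> placeholder are hypothetical and not from the documentation; the actual location of the FASTA in a project has to be filled in.

```shell
#!/bin/sh
# Sketch only: file names and project paths below are hypothetical placeholders.

# Print the normalization command from the question for a given reference path.
norm_cmd() {
    ref=$1; in_vcf=$2; out_vcf=$3
    printf 'bcftools norm -m - -f %s -Oz -o %s %s\n' "$ref" "$out_vcf" "$in_vcf"
}

# Option 1: copy the FASTA onto the worker first, then point -f at the local copy.
#   dx download "<path-in-project>/grc38.reference.fasta"
norm_cmd "grc38.reference.fasta" "input.vcf.gz" "norm.vcf.gz"

# Option 2: read the FASTA in place through the dxfuse mount under /mnt/project.
norm_cmd "/mnt/project/<path-in-project>/grc38.reference.fasta" "input.vcf.gz" "norm.vcf.gz"
```

One thing to keep in mind with option 2: bcftools needs a .fai index next to the FASTA and will try to create one if it is missing, which is likely to fail on a read-only mount, so an existing index alongside the file (or option 1) may be the safer choice.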
The command I wanted to implement was: bcftools norm -m - -f <reference> -Oz -o <normVCF> <inputVCF>
I managed to find the solution.
I did this by using the URL Fetcher app to download the reference FASTA suggested by UKB into my project, then included it as an input to SAK (Swiss Army Knife)!
Would you share the specific grc38.reference.fasta URL that was suggested for this data? I could not find the file in the RAP. Many thanks!
https://ftp.ncbi.nlm.nih.gov/1000genomes/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa
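For anyone landing here later, the solution described in this thread might look roughly like the following from the command line. This is a hedged sketch, not a tested recipe: the app names (url_fetcher, swiss-army-knife) and their inputs (url, in, cmd) are written from memory and should be checked with dx run <app> -h, and input.vcf.gz / norm.vcf.gz are hypothetical file names. The script only prints the dx commands instead of running them.

```shell
#!/bin/sh
# Sketch only: app names/inputs are assumptions; VCF file names are hypothetical.

REF_URL="https://ftp.ncbi.nlm.nih.gov/1000genomes/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa"
REF_NAME=$(basename "$REF_URL")   # file name the FASTA would have in the project

# Step 1: fetch the reference FASTA into the project (assumed app and input name).
echo "dx run url_fetcher -iurl=${REF_URL}"

# Step 2: pass the FASTA and the pVCF to Swiss Army Knife as inputs and
# refer to both by file name inside the command string.
SAK_CMD="bcftools norm -m - -f ${REF_NAME} -Oz -o norm.vcf.gz input.vcf.gz"
echo "dx run swiss-army-knife -iin=${REF_NAME} -iin=input.vcf.gz -icmd='${SAK_CMD}'"
```

The idea is that files given as inputs are placed in the job's working directory, so the bcftools call can use plain file names for -f and the input VCF.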