Reference genome for DRAGEN WGS 500k data?
Is this the correct reference genome to use when working with the CRAM files for the DRAGEN WGS 500k data:
GRCh38_full_analysis_set_plus_decoy_hla.fa
Thanks,
Comments
5 comments
Hi Robin, I am not a geneticist, and I do not understand exactly what is present in the file Bulk / Exome sequences / Exome OQFE CRAM files / helper_files / GRCh38_full_analysis_set_plus_decoy_hla.fa.
However, I can confirm that both the Exome sequence data and the 500k WGS DRAGEN data in the UKB use GRCh38.
If you need more precise information, please see https://www.ukbiobank.ac.uk/media/dovbae03/uk-biobank-final-whole-genome-sequencing-release-faqs_v1-0.pdf and https://www.medrxiv.org/content/10.1101/2023.12.06.23299426v1 , or post a more detailed question here.
I found out that the reference was this file, but how do I download it to my project?
/illumina/scratch/DRAGEN/data/vault/reference_genomes/Hsapiens/hg38/seq/hg38.fa
Hi Robin, one of our bioinformaticians has confirmed that you were correct in the first place:
Bulk / Exome sequences / Exome OQFE CRAM files / helper_files / GRCh38_full_analysis_set_plus_decoy_hla.fa is the correct reference genome file for both UKB WES and UKB WGS.
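For anyone reading later, here is a minimal sketch of how that reference might be passed to samtools when reading one of the CRAM files. It only builds the command line; it assumes samtools is installed, that the reference has been copied next to the CRAM, and that a .crai index exists if a region is given. The file name `sample.cram`, the region, and the helper `samtools_view_cmd` are hypothetical, not something from the UKB documentation.

```python
import shlex

# Reference file name as confirmed in this thread; the local location is an assumption.
REFERENCE = "GRCh38_full_analysis_set_plus_decoy_hla.fa"


def samtools_view_cmd(cram_path, region=None, reference=REFERENCE):
    """Build a `samtools view` command that decodes a CRAM against an explicit reference.

    -h keeps the header in the output, -T names the reference FASTA.
    A region query (e.g. "chr1:1-100000") needs a .crai index beside the CRAM.
    """
    cmd = ["samtools", "view", "-h", "-T", reference, cram_path]
    if region:
        cmd.append(region)
    return cmd


if __name__ == "__main__":
    # Hypothetical CRAM file name, for illustration only.
    cmd = samtools_view_cmd("sample.cram", region="chr1:1-100000")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the reference explicitly with -T avoids relying on the path recorded in the CRAM header, which may point to a location that does not exist on your machine.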
Brilliant. It would be nice to have this info in the documentation that accompanies the DRAGEN CRAM files, for future users.
Thanks!
Your suggestion has been noted, thanks.
By the way, if you do need to move files in and out of the RAP, the dx toolkit is useful; see https://documentation.dnanexus.com/downloads
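As a rough illustration of the dx toolkit route, the sketch below builds a `dx download` command for the reference file named in this thread. It assumes the toolkit is installed, that you have already run `dx login` and selected your project, and that the folder path shown above maps to the slash-separated path used here; that mapping and the helper `dx_download_cmd` are assumptions, so check the path in your own project first.

```python
import shlex

# Folder and file name taken from this thread; the exact slash-separated form is an assumption.
HELPER_DIR = "/Bulk/Exome sequences/Exome OQFE CRAM files/helper_files"
REFERENCE = "GRCh38_full_analysis_set_plus_decoy_hla.fa"


def dx_download_cmd(remote_path, output=None):
    """Build a `dx download` command; -o sets the local output file name."""
    cmd = ["dx", "download", remote_path]
    if output:
        cmd += ["-o", output]
    return cmd


if __name__ == "__main__":
    cmd = dx_download_cmd(f"{HELPER_DIR}/{REFERENCE}", output=REFERENCE)
    # shlex.join quotes the spaces in the folder names for safe pasting into a shell.
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```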