Is there a map for which regions are in each 500k Whole Exome Sequencing final release pVCF (23157) block and for the alternative gnomAD pVCF (24068)?
Is there a map for which regions are in each 500k final release WES pVCF (23157) block and for the alternative gnomAD pVCF (24068)?
There is a similar discussion regarding the WGS 200k release (https://community.dnanexus.com/s/question/0D5t0000048q6XmCAI/is-there-a-map-for-which-regions-are-in-each-200k-wgs-pvcf-block?t=1697617034843), but I can't confirm anywhere that the blocks are the same for WGS and for the main and gnomAD WES releases.
Alternatively, can someone suggest a simple way of identifying the blocks?
I could fairly simply extract the first called position from each VCF to create an index, but this isn't quite the same as knowing the blocks.
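The "first called position" index mentioned above could be sketched roughly as follows. This is a minimal illustration, not a UKB or DNAnexus tool: the function names and the demo file names are made up, and it assumes each pVCF is a plain or bgzip/gzip-compressed VCF that can be read locally. As noted, the result records where the calls in each file start, not the true block boundaries.

```python
import gzip
import os
import tempfile


def first_variant_position(vcf_path):
    """Return (chrom, pos) of the first data record in a VCF, or None if there is none."""
    # bgzip output is gzip-compatible, so gzip.open can read .vcf.gz files.
    opener = gzip.open if vcf_path.endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip meta-information, header, and blank lines
            chrom, pos = line.split("\t", 2)[:2]
            return chrom, int(pos)  # stop at the first record
    return None


def build_start_index(vcf_paths):
    """Return (chrom, first_pos, filename) rows, sorted by chromosome label then position."""
    rows = []
    for path in vcf_paths:
        first = first_variant_position(path)
        if first is not None:
            rows.append((first[0], first[1], os.path.basename(path)))
    return sorted(rows)  # note: chromosome labels sort as strings


# Tiny made-up demo files (hypothetical names, not real UKB block names).
demo_dir = tempfile.mkdtemp()
header = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
demo_files = {
    "demo_c1_b1.vcf.gz": "chr1\t950000\t.\tA\tG\t.\t.\t.\n",
    "demo_c1_b0.vcf.gz": "chr1\t69026\t.\tT\tC\t.\t.\t.\nchr1\t69081\t.\tG\tC\t.\t.\t.\n",
}
paths = []
for name, body in demo_files.items():
    path = os.path.join(demo_dir, name)
    with gzip.open(path, "wt") as out:
        out.write(header + body)
    paths.append(path)

index = build_start_index(paths)
for chrom, pos, name in index:
    print(chrom, pos, name, sep="\t")
```

Because the reader stops at the first data record, only the header and one record are read from each file. Chromosome labels are sorted as strings here, so a natural ordering (chr2 before chr10) would need an extra sort key.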
Admins - surely this is a resource that should be prepared/offered centrally - it's pretty essential for use of these files?
Comments
5 comments
@UK Biobank DA Team, thanks for helping with this question!
For fields 23157 and 24068, UKB does not currently have genomic ranges for each block such as those provided in Resource 837 for field 23156. We will endeavour to provide similar resources for the other genetic files in the future. In the meantime, there are files listing the genomic position of the first variant in each file for field 23157 and for field 24068. These are not currently on the RAP but can be requested by email from access@ukbiobank.ac.uk.
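For anyone wondering how a first-variant listing like this might be used, here is a rough sketch of a lookup. The layout of the UKB files is not described in this thread, so the (chromosome, first position, filename) rows, the file names, and both function names below are assumptions for illustration only. It also assumes the blocks within a chromosome are ordered and non-overlapping, which the listing alone cannot confirm.

```python
from bisect import bisect_right
from collections import defaultdict


def build_lookup(rows):
    """Group (chrom, first_pos, filename) rows per chromosome, sorted by first_pos."""
    by_chrom = defaultdict(list)
    for chrom, first_pos, filename in rows:
        by_chrom[chrom].append((first_pos, filename))
    lookup = {}
    for chrom, entries in by_chrom.items():
        entries.sort()
        starts = [pos for pos, _ in entries]
        names = [name for _, name in entries]
        lookup[chrom] = (starts, names)
    return lookup


def candidate_file(lookup, chrom, pos):
    """Return the file whose first variant is the closest one at or before pos, else None."""
    if chrom not in lookup:
        return None
    starts, names = lookup[chrom]
    i = bisect_right(starts, pos) - 1  # last file starting at or before pos
    return names[i] if i >= 0 else None


# Made-up rows standing in for the requested listing (hypothetical values).
rows = [
    ("chr1", 950000, "demo_c1_b1.vcf.gz"),
    ("chr1", 69026, "demo_c1_b0.vcf.gz"),
    ("chr2", 41000, "demo_c2_b0.vcf.gz"),
]
lookup = build_lookup(rows)
print(candidate_file(lookup, "chr1", 100000))
print(candidate_file(lookup, "chr1", 500))
```

The result is only a candidate: a position after a file's last variant but before the next file's first variant is still attributed to the earlier file, and the end of the last block on each chromosome is unknown.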
Rachael, I've requested these files but they forwarded my request to the Data Management team instead of sending it - did this happen with you? Just wondering if there was something I should have said in my email to streamline receiving it, or if it's not straightforward. I was hoping to be able to use it before they shut down through the new year. Thanks for any advice!
Hi Lydia, your email has reached the Data Analyst team, and one of my colleagues has dealt with it and passed it back to the access team (main point of contact), so I would expect access to respond to you within a day or two, probably less unless they have a crisis.
In general, most of the queries you have would be answered by the main Access team, or by the Data Analysts (which includes the BioInformaticians), but it could also be the Science Team, the Health Data Group or the Epidemiology Group. I'm not sure how you can avoid going round the houses, but the more information you can include in your email the better.
Thank you, that's great to know! Sorry, I didn't realize you were part of UKB, I misunderstood your previous answer. I really appreciate this, and will look forward to receiving the file when possible. (Will also keep my fingers crossed for no crises, for lots of reasons!)