I am trying to find the WGS VCF files. I see individual-level VCF files per sample. Where are the joint-called VCF files for WGS located? Please let me know how to access them and how to use them to extract the gene region I am interested in.
Comments
8 comments
Have you refreshed the project? It should be in /Bulk/Whole genome sequences/Population level WGS variants, pVCF format - interim 200k release/
Here are the instructions on how to refresh:
https://dnanexus.gitbook.io/uk-biobank-rap/getting-started/updating-dispensed-data?q=refresh
Thank you so much for the response. I don't see the folder that you mentioned. I have checked and am working on refreshing.
Thank you so much for the guide. Does this refresh apply only to folders, or also to the phenotype database?
It works for bulk files (which include the WGS data that you need), the database, and the dataset. You might need to rebase for cohorts and dashboards. The documentation link for rebasing is on the same page.
Thank you so much. I have got a new dataset, but the database date was not updated. Does that mean there are no new updates to the phenotypes? Also, I am looking for NMR metabolites, and only 150 of 249 are available to me. Can you help me find all the metabolites? Also, what QC steps were generally followed for NMR metabolites in UK Biobank? Do you recommend any QC protocol?
Depending on the version of your data before the refresh, there might not be a new phenotype data release. See https://dnanexus.gitbook.io/uk-biobank-rap/getting-started/data-release-versions
Could you post the new question as a separate thread? This is a web board, so people only read a post if they know something about the topic in its title. If I can't answer your question, no one else would know that you have an unanswered question embedded in something completely different.
Hello, thank you for helping me with data access and the folder for the WGS dataset. I would like to know how the VCF files were set up and how to process them. I see that each VCF file is split into a bunch of subsets. I tried to process a subset using Swiss Army Knife, and it took a long time even to copy the files to the cluster, so I thought I would reach out to you about any other way of processing these files.
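One way to avoid copying every subset is to first work out which pVCF blocks actually overlap the gene region and then process only those. Below is a minimal Python sketch of that selection step. It is an illustration under assumptions, not the official procedure: the `Block` type, the block coordinates, and the file names are all hypothetical, and the real block layout has to be taken from the release's own documentation or from the files themselves.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """One pVCF subset file and the interval it covers (hypothetical layout)."""
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    filename: str


def overlapping_blocks(blocks: List[Block], chrom: str, start: int, end: int) -> List[Block]:
    """Return the blocks whose interval overlaps chrom:start-end."""
    return [
        b for b in blocks
        if b.chrom == chrom and b.start <= end and b.end >= start
    ]


# Hypothetical block table; real coordinates and names will differ.
blocks = [
    Block("chr1", 1, 20000, "block_0.vcf.gz"),
    Block("chr1", 20001, 40000, "block_1.vcf.gz"),
    Block("chr1", 40001, 60000, "block_2.vcf.gz"),
    Block("chr2", 1, 20000, "block_3.vcf.gz"),
]

# Hypothetical gene region chr1:35000-45000.
hits = overlapping_blocks(blocks, "chr1", 35000, 45000)
print([b.filename for b in hits])
```

Only the files returned in `hits` would then need to be passed to the downstream tool for region extraction, which keeps the amount of data copied to the worker small.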