extracting allele combinations at specific SNPs
Hello,
The instructions provided via https://github.com/UK-Biobank/SNP-filtering do not help. I get the error message:
File Load Error for SNP-extraction_test.ipynb
Unreadable Notebook: C:\Users\tabassimofrads\Desktop\SNP-extraction_test.ipynb NotJSONError("Notebook does not appear to be JSON: '\n\n\n\n\n\n\n<html\n l...").
Any help would be appreciated.
Kind regards,
Simeen
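The `<html` fragment in the error message suggests the saved `.ipynb` file contains a web page rather than notebook JSON. This can happen when the GitHub page is saved instead of the raw file, though that is an assumption about how the file was obtained. A minimal Python sketch for checking what a file actually contains (the function name `diagnose_notebook` is hypothetical):

```python
import json
from pathlib import Path


def diagnose_notebook(path):
    """Return 'notebook', 'html', or 'unknown' for the file at `path`."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON: check whether it looks like a saved web page instead.
        head = text.lstrip()[:200].lower()
        if head.startswith("<!doctype html") or head.startswith("<html"):
            return "html"
        return "unknown"
    # Jupyter notebooks are JSON objects with "cells" and "nbformat" keys.
    if isinstance(data, dict) and "cells" in data and "nbformat" in data:
        return "notebook"
    return "unknown"
```

If this returns `"html"`, downloading the notebook again as a raw file (or cloning the repository) is worth trying before anything else.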
That is odd; it worked for me only a few weeks ago. I will check it again. Have you seen this related forum thread? https://community.ukbiobank.ac.uk/hc/en-gb/community/posts/18669657313437-How-do-I-extract-allele-combinations-at-specific-SNPs-using-Jupyterlab
I think you might need to copy the script into your main project storage area.
Thank you for your reply. The previous issue is now fixed. However, to install plink it is recommended to use one of these commands:
conda install bioconda::plink
conda install bioconda/label/cf201901::plink
Neither of them works, and I get this error message:
PS C:\Users\tabassimofrads\Desktop> conda install bioconda::plink
Channels:
- defaults
- bioconda
Platform: win-64
Collecting package metadata (repodata.json): done
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
- bioconda::plink
I have installed Anaconda on Windows.
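The `Platform: win-64` line is the likely cause: Bioconda publishes packages for Linux and macOS only, so conda finds no plink build for native Windows. A commonly suggested workaround, not confirmed in this thread, is to run conda inside the Windows Subsystem for Linux. The Python sketch below shows the check; the helper names are hypothetical and the platform set is taken from Bioconda's documentation as I understand it.

```python
import platform

# Platforms Bioconda builds packages for; note that no "win-*" entry exists.
BIOCONDA_SUBDIRS = {"linux-64", "linux-aarch64", "osx-64", "osx-arm64"}


def conda_subdir(system=None, machine=None):
    """Map an OS name and CPU architecture to a conda platform string."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    os_name = {"windows": "win", "linux": "linux", "darwin": "osx"}.get(system, system)
    arch = {"amd64": "64", "x86_64": "64",
            "arm64": "arm64", "aarch64": "aarch64"}.get(machine, machine)
    return f"{os_name}-{arch}"


def bioconda_supports(subdir):
    """Return True if Bioconda is expected to carry packages for `subdir`."""
    return subdir in BIOCONDA_SUBDIRS


if __name__ == "__main__":
    here = conda_subdir()
    print(here, "supported by Bioconda:", bioconda_supports(here))
```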
I also need access to the genotype SNP data files. I have access to UK Biobank via DPUK, but it is not clear where the genotype SNP data are stored. An example of the data for one participant can be downloaded via Resource 1963 (ox.ac.uk), the tar file.
Please could you let me know where these data for all participants can be found?
Thank you.
Are you trying to use some ukb genetics data that you have previously downloaded in a basket (before the policy change in May 2024), or are you trying to use the Research Analysis Platform?
I have access to UK Biobank via the DPUK Data Portal (analysis platform). I am a collaborator on project 15697. Does this answer your question?
Genotype calls are in Field 22418, which is in RAP folder Bulk > Genotype results > Genotype calls
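Once the genotype call files are reachable, extracting specific SNPs with plink could look roughly like the Python sketch below. It only builds the command. The file prefix, output name, and function name are hypothetical, and it assumes the calls are in plink binary format (.bed/.bim/.fam). `--recode A` writes one allele-count column per SNP (0, 1, or 2 copies), which is an assumption about the output wanted here.

```python
from pathlib import Path


def build_plink_extract(bfile_prefix, rsids, out_prefix):
    """Write a SNP list file and return a plink command that extracts those SNPs."""
    rsids = [r.strip() for r in rsids if r.strip()]
    if not rsids:
        raise ValueError("at least one SNP ID is required")
    # plink's --extract flag takes a text file with one variant ID per line.
    snp_list = Path(f"{out_prefix}.snplist")
    snp_list.write_text("\n".join(rsids) + "\n")
    return [
        "plink",
        "--bfile", str(bfile_prefix),   # prefix of the .bed/.bim/.fam trio
        "--extract", str(snp_list),     # keep only the listed SNPs
        "--recode", "A",                # per-SNP allele counts in a .raw file
        "--out", str(out_prefix),
    ]
```

For example, `build_plink_extract("genotype_calls_chr19", ["rs429358", "rs7412"], "my_snps")` returns an argument list that can be passed to `subprocess.run(cmd, check=True)` where plink is installed. The prefix and SNP IDs here are placeholders only.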
Hello again, I am still confused. Before May 2024, your project downloaded several baskets of UKB data for use in your institute. There is a cloud platform where all new projects will be working, called the RAP. Old projects are either using downloaded data copies, or the RAP cloud platform. Which are you using?
As far as I can see, your project has neither downloaded the genotyping data nor extracted individual SNPs from it. There was some other genetic data (50k exome) which would contain the data you need, but only for 50k participants and not in an easy-to-use format.
Your project's first basket from 2018 might be of interest.
I suspect you would have noticed if you were using the RAP, which has its own login page.
If you already have the genetic data in your institute, then you can continue to use it, but the GitHub notebook will not be directly applicable, though you might want to refer to it.
If you do not already have the genetic data downloaded, then you will need to use it within the RAP (no download allowed).
If you have not already used the RAP, then you will need to complete initial RAP training courses before you can use it, see https://community.ukbiobank.ac.uk/hc/en-gb/articles/19997160485533-Mandatory-training
The DPUK would have facilitated your project's application to use UK Biobank data, but that data would have been downloaded via UKB baskets, and any new data will need to be used only within the RAP.
The no-download policy was sent to researchers by email, and can also be viewed at https://community.ukbiobank.ac.uk/hc/en-gb/community/posts/19996604847133-Changing-the-way-UK-Biobank-data-is-made-available-to-researchers-around-the-world . See also the pink banner on this forum page, which highlights the announcement and some FAQ articles.
Dear Rachael,
Many thanks for your replies.
I have already completed GDPR training at UCL. Would that be enough to access the RAP?
Hi Simeen, No, it would not. It has to be the MRC GDPR course.
Hi Rachael
Thank you for letting me know. I assume that is the only mandatory course?
The entries "Introduction to using UK Biobank data" and "Introduction to the UK Biobank Research Analysis Platform (UKB-RAP)" do not link to any courses.
There will be 3 mandatory courses, see https://community.ukbiobank.ac.uk/hc/en-gb/articles/19997160485533-Mandatory-training .
Course 2 will be quite short (about an hour). Course 3 will be slightly longer and more technical, and we hope you will find it saves time in the long run.
None of the new courses are listed in the current documentation, as they are still being finalised and the documentation has not been updated.
Thank you. If the courses are not finalised yet, please could you let me know how one can access the RAP now?
Researchers who have not previously accessed the RAP are not able to access the RAP now.
Have you checked what was downloaded in your project's first basket in 2018?
I see folders from 2019 onward but not from 2018.
So if the genotyping data were not downloaded and I have not previously accessed the RAP, how can I access the genotyping data that I need?
Can I not access the data directly via UK Biobank as a collaborator on project 15697?
Not immediately, no. You will need to wait for the mandatory RAP training courses.