How to match EIDs with rsIDs?
When I apply a cohort filter for one specific gene, I get a cohort and can see the rsIDs for that group of people. However, I want to know the SNP information for each individual participant, that is, the EID and rsID in the same row. How can I achieve that? Thank you!
Comments
I don't think that can be done entirely within the Cohort Browser GUI. One way to achieve it would be to save the cohort to a folder within your project storage, then open a JupyterLab instance from the Tools tab and read the list of EIDs in the saved cohort into a program in R or Python. You could then combine it with your saved phenotype data.
This video on using JupyterLab may be useful: https://dnanexus.gitbook.io/uk-biobank-rap/working-on-the-research-analysis-platform/using-jupyterlab-on-the-research-analysis-platform
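As a rough illustration of the "read the EIDs, then combine with phenotype data" step, here is a minimal Python sketch. The file layouts and column names (`eid`, `sex`, `age`) are assumptions made up for the example, and the two in-memory strings simply stand in for whatever files you export from your project. It is not the platform's own method, just one way to do the join.

```python
import csv
import io

# Stand-in for the exported cohort: one EID per row (hypothetical layout).
cohort_file = io.StringIO("eid\n1000001\n1000002\n1000003\n")

# Stand-in for previously saved phenotype data (hypothetical columns).
pheno_file = io.StringIO(
    "eid,sex,age\n"
    "1000001,F,54\n"
    "1000002,M,61\n"
    "1000004,F,47\n"
)

# Read the cohort EIDs in order, and index the phenotype rows by EID.
cohort_eids = [row["eid"] for row in csv.DictReader(cohort_file)]
pheno_by_eid = {row["eid"]: row for row in csv.DictReader(pheno_file)}

# Left join: keep every cohort member, with None where no phenotype row exists.
merged = []
for eid in cohort_eids:
    pheno = pheno_by_eid.get(eid, {})
    merged.append({"eid": eid, "sex": pheno.get("sex"), "age": pheno.get("age")})

for row in merged:
    print(row)
```

In a real session you would replace the `io.StringIO` objects with `open(...)` calls on your own files. EIDs are kept as strings here so that they match exactly between the two tables.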
If you wish to combine SNP data (such as genotypes) with phenotypic information, the pVCFs may be useful.
You can generate a list of the participants you are interested in, based on their phenotypic information, using the Cohort Browser or by following these GitHub notebooks: https://github.com/UK-Biobank/UKB-RAP-Notebooks/tree/main/NBs_Prelim
Then filter the pVCFs so that they contain only the participants and SNPs of interest, for example with bcftools, which can be run as a job using dx run swiss-army-knife (https://dnanexus.gitbook.io/uk-biobank-rap/working-on-the-research-analysis-platform/tools-library).
pVCFs (population VCFs) are divided into 20kb blocks per chromosome and are numbered sequentially.
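Once a pVCF has been filtered, the per-participant table asked for in the question (EID and rsID in the same row) could be built along the following lines. This is only a sketch under several assumptions: that the genotypes were first dumped to a tab-separated file with one sample/variant pair per line (for instance with a `bcftools query` format string like the one in the comment below), that the sample IDs in the pVCF are the EIDs, and that the variant ID column actually holds rsIDs. The sample data and the function name `carrier_rows` are invented for illustration.

```python
import csv
import io

# Assumed upstream step (not run here), producing one line per sample per variant:
#   bcftools query -f '[%SAMPLE\t%ID\t%GT\n]' filtered.vcf.gz > genotypes.tsv
# The text below is made-up example output in that three-column layout.
query_output = io.StringIO(
    "1000001\trs0000001\t0/1\n"
    "1000002\trs0000001\t0/0\n"
    "1000003\trs0000001\t./.\n"
    "1000001\trs0000002\t1|1\n"
    "1000002\trs0000002\t0/0\n"
    "1000003\trs0000002\t0|1\n"
)


def carrier_rows(handle):
    """Return (eid, rsid, genotype) for every call with a non-reference allele."""
    rows = []
    for eid, rsid, gt in csv.reader(handle, delimiter="\t"):
        # Treat phased (|) and unphased (/) genotypes the same way.
        alleles = gt.replace("|", "/").split("/")
        # Skip homozygous-reference (0) and missing (.) calls.
        if any(allele not in ("0", ".") for allele in alleles):
            rows.append((eid, rsid, gt))
    return rows


eid_rsid = carrier_rows(query_output)
for eid, rsid, gt in eid_rsid:
    print(eid, rsid, gt, sep="\t")
```

Each printed row pairs one participant with one variant they carry. If the ID column in your pVCF does not contain rsIDs, the variants would need to be annotated first, which is outside this sketch.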