Hi folks! Ask me anything about UKBRAP or bioinformatics in general! Starting in 90 minutes!

08 February 2022 19:32

Hi! Question: how can I retrieve all fields (from the phenotypic data) for a specific sample or a list of samples (provided in a file, for example)?

Former User of DNAx Community_85

08 February 2022 20:10

Next questions:

What would you suggest as the best way to install software from GitHub on the RAP and use it with the RAP data? Using a Spark Notebook? If yes, do we have to download the data from the RAP into the Spark Notebook environment every time?
What would you say is the most efficient way to handle tiny parts of multiple pVCFs for the same samples?

E.g.: I want to use specific genes to perform an analysis (and I need their variants' qualities as well), but they are split across ten different pVCFs. Is the easiest and most time- and cost-efficient way to download all 10 pVCFs to a Spark notebook and perform the analyses there, or to first make a smaller VCF with all the genes outside of a Spark notebook and then load this smaller VCF into the Spark notebook?

Ben Busby DNAnexus Team

09 February 2022 13:41

Just saw these!

You can use git from either a cloud workstation or a Jupyter notebook. Personally, I use git from both.
1. If you are talking about <10 GB of data, you can use the Create Snapshot function in JupyterLab so you don't have to move the data each time.
2. In a similar vein, you can use the dx-snapshot function in a cloud workstation and boot from there.

For the pVCF question, I'd use bedtools or vcftools in the swiss-army-knife app. [Shameless plug] I'm going to be in a webinar with Regeneron and NVIDIA on 2/17, and I'll go through how to subset data using bedtools.
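
To give a rough idea of what subsetting by region means, here is a minimal plain-Python sketch. It is not bedtools or vcftools, only an illustration of the same idea: keep the VCF header plus every record whose position falls inside a BED interval. The function names (`read_bed`, `subset_vcf`) and the toy coordinates are made up for this example, and it only checks each record's POS rather than the full span of the variant.

```python
import io

def read_bed(lines):
    """Parse BED lines into {chrom: [(start, end), ...]} (0-based, half-open)."""
    regions = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    return regions

def subset_vcf(vcf_lines, regions):
    """Yield header lines plus records whose POS lies inside any region."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep all header lines unchanged
            continue
        chrom, pos = line.split("\t", 2)[:2]
        pos0 = int(pos) - 1  # VCF POS is 1-based; BED is 0-based, half-open
        if any(start <= pos0 < end for start, end in regions.get(chrom, [])):
            yield line

# Toy inputs (hypothetical coordinates, for illustration only).
bed = "chr1\t100\t200\n"
vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t150\t.\tA\tG\t50\tPASS\t.\n"
    "chr1\t250\t.\tC\tT\t50\tPASS\t.\n"
    "chr2\t150\t.\tG\tA\t50\tPASS\t.\n"
)

regions = read_bed(io.StringIO(bed))
kept = list(subset_vcf(io.StringIO(vcf), regions))
print("".join(kept), end="")
```

On real pVCFs you would want an indexed tool rather than a linear scan like this, which is why I'd reach for bedtools or vcftools in practice.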

Ben Busby DNAnexus Team

09 February 2022 19:25

I'm back, in case folks have questions!

Former User of DNAx Community_61

09 February 2022 21:51

Hi, I just created a project with UKBB datasets. I am trying to use the cohort browser or JupyterLab to select samples, but I cannot find "Dataset" or "cohort browser", as shown in the documentation, anywhere in the interface. Am I missing anything obvious here?

Former User of DNAx Community_61

10 February 2022 05:58

I think I figured it out. There is a `.dataset` file under the project directory, which should be the one to use in JupyterLab (not tried yet). Then "explore" -> "Add filter" etc. should be the "cohort browser" that is shown in the documentation.

Ben Busby DNAnexus Team

10 February 2022 13:37

Hi Bo! If you select the dataset and then click on the graph icon in the upper right-hand corner, it will take you to the cohort browser. Please let me know if you have any issues!

Ben

Former User of DNAx Community_61

10 February 2022 14:26

Thanks Ben, I thought I got it, but maybe I did not. Right now, I have a dataset file. By either clicking the filename or clicking "Explore Data" from the three-dot menu to the right, I get an interface with "Dashboard Actions" in the top right corner and "Add filter" in the middle, which I can use to filter subjects. Isn't this the "cohort browser"?

Ben Busby DNAnexus Team

10 February 2022 14:36

Yep! Have you been able to add tiles yet? I find them helpful when I'm starting to scope a data problem.

Ben

Former User of DNAx Community_61

10 February 2022 15:49

I am learning the cohort browser by adding filters. The cohort browser (although I cannot find such a name anywhere) seems to be easier to use than the JupyterLab/Spark method I saw in another tutorial, but I suppose that method is more suitable for batch processing. I will continue to explore the system and ask if I have any questions.

Former User of DNAx Community_61

10 February 2022 17:53

@Ben Busby, this forum may not be the best place to ask MTA-related questions, but are you aware of any restrictions on adding our own data to a UKBB project and analyzing it in conjunction with UKBB data? I browsed through the MTA and did not find an answer (I did not read it word by word).

Ben Busby DNAnexus Team

10 February 2022 19:17

Awesome!

Ben Busby DNAnexus Team

11 February 2022 14:37

Great!

Former User of DNAx Community_61

11 February 2022 18:05

Hi @Ben Busby, what is the recommended way to annotate variants that I see in the genome section of the cohort browser? I see a number of variants that I am interested in, and I would like to know if they are pathogenic (using ClinVar or gnomAD) and then find their carriers (and check the carriers' phenotypes as the next step). I know that I can download the variants and run an annotator offline, but I am wondering what the best way to do that on DNAnexus is. Thanks.

Former User of DNAx Community_34

23 March 2023 09:51

Hi Ben!

I have a question.

Let's say I want to perform a linear or logistic regression as Phenotype ~ Genotype + Covariates.

I want to perform this task in RStudio/JupyterLab using R. I understand how to use the cohort browser to select a cohort, but I need data on unrelated individuals only. I do not understand how to create a cohort of unrelated individuals, or how to then add genotype data for the same set of individuals to perform my analysis.

Please help me understand how I can load phenotype, genotype, and covariate data into the RAP's RStudio/JupyterLab for my analysis. I have been stuck on this for too long.
