Hi folks! Ask me anything about UKBRAP or bioinformatics in general! Starting in 90 minutes!

Ben Busby DNAnexus Team

Comments

15 comments

  • Comment author
    Former User of DNAx Community_85

    Hi! Question: How can I retrieve all fields (from the phenotypic data) for a specific sample or a list of samples (provided in a file, for example)?

    0
  • Comment author
    Former User of DNAx Community_85

    Next questions:

     

    1. What would you suggest as the best way to install software from GitHub on the RAP and use it with the RAP data? Using a Spark notebook? If yes, do we have to download the data from the RAP into the Spark notebook environment every time?
    2. What would you say is the most efficient way to handle tiny parts of multiple pVCFs for the same samples?

    E.g., I want to use specific genes to perform an analysis (and I need their variants' qualities as well), but they are split across ten different pVCFs. Is the easiest and most time- and cost-efficient way to download all 10 pVCFs to a Spark notebook and perform the analyses there, or to first make a smaller VCF with all the genes outside of a Spark notebook and then load that smaller VCF into the Spark notebook?

    0
  • Comment author
    Ben Busby DNAnexus Team

    Just saw these!

     

    1. You can use git from either a cloud workstation or a jupyter notebook. Personally, I use git from both.
      1. If you are talking about <10 GB of data, you can use the Create Snapshot function for JupyterLab so you don't have to move the data each time.
      2. In a similar vein, you can use the dx-snapshot function in a cloud workstation and boot from there.
    2. I'd use bedtools or vcftools in the swiss-army-knife app. [Shameless plug] I'm going to be in a webinar with Regeneron and NVIDIA on 2/17, and I'll go through how to subset data using bedtools.
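
For anyone who wants to see the subsetting idea from point 2 spelled out, here is a rough plain-Python sketch. It is not bedtools or vcftools, and it is not how the swiss-army-knife app works internally; it only illustrates the interval filter those tools apply. The function names `parse_bed` and `subset_vcf` and the toy data are made up for the example, and it assumes uncompressed, tab-separated VCF text and checks each record's POS only (not the full span of the REF allele).

```python
from typing import Iterable, Iterator, List, Tuple

# (chrom, start, end) in BED convention: 0-based start, half-open end
Interval = Tuple[str, int, int]


def parse_bed(lines: Iterable[str]) -> List[Interval]:
    """Read BED lines into (chrom, start, end) tuples, skipping blanks and '#' comments."""
    intervals = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, start, end = line.split("\t")[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def subset_vcf(vcf_lines: Iterable[str], intervals: List[Interval]) -> Iterator[str]:
    """Yield header lines plus records whose POS falls inside any interval."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep the header so the output is still a VCF
            continue
        fields = line.split("\t")
        chrom, pos = fields[0], int(fields[1])  # VCF POS is 1-based
        # A 1-based POS lies in a 0-based half-open interval when start < POS <= end
        if any(c == chrom and s < pos <= e for c, s, e in intervals):
            yield line


# Toy example: two hypothetical gene regions and three variants
genes = parse_bed(["chr1\t100\t200", "chr2\t50\t60"])
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t150\t.\tA\tG\t50\tPASS\t.",
    "chr1\t250\t.\tC\tT\t50\tPASS\t.",
    "chr2\t60\t.\tG\tA\t50\tPASS\t.",
]
kept = list(subset_vcf(vcf, genes))
```

On real pVCFs you would want the indexed, compressed-file handling that bedtools/vcftools already give you; the linear scan over intervals here is only reasonable for a handful of genes.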
    0
  • Comment author
    Ben Busby DNAnexus Team

    I'm back, in case folks have questions!

    0
  • Comment author
    Former User of DNAx Community_61

    Hi, I just created a project with UKBB datasets. I am trying to use the cohort browser or JupyterLab to select samples, but I cannot find "Dataset" or "cohort browser" anywhere in the interface as shown in the documentation. Am I missing anything obvious here?

    0
  • Comment author
    Former User of DNAx Community_61

    I think I figured it out. There is a `.dataset` file under the project directory, which should be the one to be used in JupyterLab (not tried yet). Then, "explore" -> "Add filter" etc should be the "cohort browser" that is shown in the documentation.

    0
  • Comment author
    Ben Busby DNAnexus Team

    Hi Bo! If you select the dataset and then click on the graph icon in the upper right hand corner, it will take you to the cohort browser. Please let me know if you have any issues!

     

    Ben

    0
  • Comment author
    Former User of DNAx Community_61

    Thanks Ben, I thought I got it, but maybe I did not. Right now, I have a dataset file. By either clicking the filename or clicking "Explore Data" from the 'three-dot' menu to the right, I get to an interface with "Dashboard Actions" in the top right corner and "add filter" in the middle, which I can use to filter subjects. Isn't this the "cohort browser"?

    0
  • Comment author
    Ben Busby DNAnexus Team

    Yep! Have you been able to add tiles yet? I find them helpful when I'm starting to scope a data problem.

     

    Ben

    0
  • Comment author
    Former User of DNAx Community_61

    I am learning the cohort browser by adding filters. The cohort browser (although I cannot find such a name anywhere) seems to be easier to use than the JupyterLab/Spark method I saw in another tutorial, but I suppose that method is more suitable for batch processing. I will continue to explore the system and ask if I have any questions.

    0
  • Comment author
    Former User of DNAx Community_61

    @Ben Busby, this forum may not be the best place to ask MTA-related questions, but are you aware of any restrictions on adding our own data to a UKBB project and analyzing it in conjunction with UKBB data? I browsed through the MTA and did not find an answer (I did not read it word by word).

    0
  • Comment author
    Ben Busby DNAnexus Team

    Awesome!

    0
  • Comment author
    Ben Busby DNAnexus Team

    Great!

    0
  • Comment author
    Former User of DNAx Community_61

    Hi @Ben Busby, what is the recommended way to annotate variants that I see in the genome section of the cohort browser? I see a number of variants that I am interested in, and I would like to know if they are pathogenic (using ClinVar or gnomAD) and then find their carriers (and check the carriers' phenotypes as the next step). I know that I can download the variants and run an annotator offline, but I am wondering what the best way to do that on DNAnexus is. Thanks.

    0
  • Hi Ben!

    I have a question.

     

    Let's say I want to perform a linear or logistic regression as Phenotype ~ Genotype + Covariates.

     

    I want to perform this task in RStudio/JupyterLab using R. I understand how to use the cohort browser to select a cohort, but I need the data on unrelated individuals only. I do not understand how to create a cohort of unrelated individuals, or how to then add genotype data for the same set of individuals to perform my analysis.

     

    Please help me understand how I can load phenotype, genotype, and covariate data into the RAP's RStudio/JupyterLab for my analysis. I have been stuck on this for too long.

    0
