We want to run a regression analysis on a subpopulation of 10,000 UK Biobank participants for 12 imputed SNPs on the RAP. It would be very helpful if you could guide us on how to proceed with this analysis.

Comments

10 comments

  • Comment author
    Ondrej Klempir DNAnexus Team

    If you are interested in running a GWAS, some of the steps published in the following tutorial might be useful:

    https://dnanexus.gitbook.io/uk-biobank-rap/science-corner/gwas-ex

  • Comment author
    Former User of DNAx Community_93

    Thank you. This is helpful, but I only want to run a GWAS for 12 imputed SNPs, not for the entire imputation data across all chromosomes. Since this is a time-sensitive project, could you let me know the specific commands to extract specific SNPs for association analysis with the phenotype on your platform?

  • Comment author
    Chai Fungtammasan DNAnexus Team

    You can use the Swiss-army-knife tool on the platform, which has the most common bioinformatics tools installed (bcftools, plink, etc.), to filter the data, and then use the tutorial above to run the GWAS.

    See the video on how to use Swiss-army-knife here: https://youtu.be/8bcHeoEggBI?t=2110
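
As a rough sketch of what such a Swiss-army-knife job could look like from the command line, the helper below only assembles and prints a `dx run` call for review; it does not launch anything. The file paths, the SNP list, the output prefix and the function name `sak_extract_cmd` are placeholders rather than actual locations on the RAP, and the plink2 command assumes the inputs are staged into the job's working directory under their own file names.

```shell
#!/usr/bin/env bash
# Print (do not run) a dx command that extracts a SNP list from one imputed
# chromosome with Swiss-army-knife. All paths are hypothetical placeholders.
sak_extract_cmd() {
  local bgen=$1 sample=$2 snps=$3 out=$4
  # Inside the job, the staged inputs are referred to by their base names.
  local cmd="plink2 --bgen $(basename "$bgen") ref-first"
  cmd="$cmd --sample $(basename "$sample") --extract $(basename "$snps")"
  cmd="$cmd --make-pgen --out $out"
  # %q shell-quotes each argument so the printed line can be pasted back.
  printf '%q ' dx run swiss-army-knife \
    -iin="$bgen" -iin="$sample" -iin="$snps" \
    -icmd="$cmd" -y
  printf '\n'
}
```

Calling `sak_extract_cmd` with your own project paths prints one line that can be checked and then pasted into a terminal where the `dx` client is logged in.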

  • Comment author
    Former User of DNAx Community_93

    Thank you, it helped, and I was able to extract the specific SNPs. How do I link my phenotype EIDs to the genotype FID or IID?

  • Comment author
    Chai Fungtammasan DNAnexus Team

    FID and IID are both set to the EID when you follow this tutorial. See this answer:

    https://community.dnanexus.com/s/question/0D5t00000414p2DCAQ/can-you-match-plink-output-to-eids
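
If FID and IID both equal the EID, a phenotype table keyed by `eid` can be turned into a plink-style phenotype file by duplicating that column. This is a minimal sketch, assuming a simple unquoted comma-separated export whose first column is `eid`; the function name `make_plink_pheno` and the file layout are hypothetical.

```shell
#!/usr/bin/env bash
# Convert "eid,pheno1,..." CSV into a tab-delimited "FID IID pheno1 ..." table.
# Assumes no quoted fields and no commas inside values.
make_plink_pheno() {
  awk -F',' 'BEGIN { OFS = "\t" }
    NR == 1 { $1 = "FID" OFS "IID"; print; next }   # header: eid -> FID, IID
    { $1 = $1 OFS $1; print }                       # data: repeat eid as FID and IID
  ' "$1"
}
```

For example, `make_plink_pheno pheno.csv > pheno.tsv` writes a file that could then be passed to plink with `--pheno`.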

  • Comment author
    Former User of DNAx Community_93

    I used the bgenix tool to extract specific SNPs.

    This is the command that I used:

    bgenix -g ukb_imp_chr${CHR}_v3.bgen \
    -i ukb_imp_chr${CHR}_v3.bgen.bgi \
    -vcf -incl-rsids ${RSID} | \
    bcftools reheader \
    -h bgen_to_vcf/new_header.txt | \
    bcftools annotate \
    --rename-chrs bgen_to_vcf/rename_contigs.txt | \
    bgzip -c > new_file.vcf.gz && tabix -p vcf new_file.vcf.gz

     

    In the option -h bgen_to_vcf/new_header.txt, I replaced new_header.txt with my sample file.

    However, the sample IDs in the generated VCF file were named anonymous sample 1 and so forth, which do not match the EIDs or IDs present in the sample file.

    This work is on a deadline, so I cannot work with the entire imputed data. Is there some way to give my VCF file the same IDs as the EIDs for these specific filtered SNPs using the above commands?

  • Comment author
    Chai Fungtammasan DNAnexus Team

    This is a pure bioinformatics question. I can recommend some things to try, but I don't know all the details of these commands without going through the manuals.

    1) Figure out which command in the chain removes your sample IDs. Just break the pipe and inspect the output of each step. Once you know that, check that tool's manual to see whether you can modify your command to get a different result.

    2) Do a quick check of whether the sample order has been switched. If you are confident that the order is preserved, reheader or some other kind of ID mapping might help.

    3) You may post a new question in this community to see if other members can chime in. You could also try Biostar, which has a larger community of bioinformatics users.
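
One way to try the second suggestion is sketched below, under the strong assumption that the VCF keeps the samples in the same order as the Oxford `.sample` file (matching counts alone do not prove this). The helper names are hypothetical. `bcftools reheader -s` takes one new sample name per line, whereas `-h` expects a full VCF header, so the sample file itself is not a valid argument to `-h`.

```shell
#!/usr/bin/env bash
# Print the ID_1 column of an Oxford .sample file, skipping its two header lines.
sample_ids() {
  awk 'NR > 2 { print $1 }' "$1"
}

# Rename the VCF samples only when the counts agree (requires bcftools).
# Usage: rename_vcf_samples in.vcf.gz file.sample out.vcf.gz
rename_vcf_samples() {
  local vcf=$1 sample=$2 out=$3 ids n_vcf n_ids
  ids=$(mktemp)
  sample_ids "$sample" > "$ids"
  n_vcf=$(bcftools query -l "$vcf" | wc -l | tr -d ' ')
  n_ids=$(wc -l < "$ids" | tr -d ' ')
  if [ "$n_vcf" -ne "$n_ids" ]; then
    echo "sample count mismatch: VCF=$n_vcf, sample file=$n_ids" >&2
    rm -f "$ids"
    return 1
  fi
  bcftools reheader -s "$ids" -o "$out" "$vcf"
  rm -f "$ids"
}
```

After renaming, the output needs to be indexed again with `tabix -p vcf`.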

  • Comment author
    Former User of DNAx Community_93

    I introduced the sample IDs into the VCF file, but when I try to convert the VCF file to plink files, it gives the error "sample files not found".

    How can I resolve this issue?

  • Comment author
    Former User of DNAx Community_28

    1) You likely only need to run this on, at most, 12 chromosomes (12 SNPs; in the worst case, 1 SNP per chromosome).

    2) Since you are converting the VCF file to plink in the last stage, you should consider using plink for all steps in this extraction.

     

    You would need to do the following:

    a) take the bgen file and convert it to plink format.

    b) extract the snps you want

    c) list the output plink files into a filelist and merge the individual chromosome plink files into a single plink file

    d) delete the chromosome working files.

     

    This is off the top of my head, so I do not guarantee that it will work perfectly. It is for illustrative purposes only.

     

    " plink2 --bgen ukb_imp_chr${CHR}_v3.bgen ref-first --sample ukb48065_imp_chr1_v3_s487296.sample \
    --make-pgen --out ukbi_ch${CHR}_v3 ; \
    plink2 --pfile ukbi_ch${CHR}_v3 --extract your_12_snps.txt --make-pgen --out ukbi_ch${CHR}_12snps ; \
    ls *_12snps.pgen | sed -e 's/\.pgen$//' > files_to_merge.txt ; \
    plink2 --pmerge-list files_to_merge.txt pfile --make-bed --out ukb48065_12snps_merged ; \
    rm files_to_merge.txt ; rm ukbi_ch${CHR}_v3* "
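
The same steps could be wrapped in a loop over the chromosomes that actually carry the SNPs. The sketch below is illustrative only: the function names, the `DRY_RUN` switch and the output prefix `ukb_12snps_merged` are assumptions, and a single `.sample` file is passed for all chromosomes. With `DRY_RUN=1` the commands are printed instead of executed, which makes it easy to review them before spending compute.

```shell
#!/usr/bin/env bash
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: extract_and_merge SAMPLE_FILE SNP_LIST CHR [CHR ...]
extract_and_merge() {
  local sample=$1 snps=$2
  shift 2
  local list=files_to_merge.txt chr
  : > "$list"   # start with an empty merge list
  for chr in "$@"; do
    # a) convert the bgen file to plink2 format
    run plink2 --bgen "ukb_imp_chr${chr}_v3.bgen" ref-first --sample "$sample" \
        --make-pgen --out "ukbi_ch${chr}_v3" || return 1
    # b) keep only the SNPs of interest
    run plink2 --pfile "ukbi_ch${chr}_v3" --extract "$snps" \
        --make-pgen --out "ukbi_ch${chr}_12snps" || return 1
    # d) delete the per-chromosome working files
    run rm -f "ukbi_ch${chr}_v3.pgen" "ukbi_ch${chr}_v3.pvar" "ukbi_ch${chr}_v3.psam"
    echo "ukbi_ch${chr}_12snps" >> "$list"
  done
  # c) merge the per-chromosome filesets into one bed/bim/fam set
  run plink2 --pmerge-list "$list" pfile --make-bed --out ukb_12snps_merged
}
```

For example, `DRY_RUN=1 extract_and_merge my.sample your_12_snps.txt 1 2 6` prints the plink2 calls for chromosomes 1, 2 and 6 without running them.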

     

     

     

  • Comment author
    Former User of DNAx Community_93

    Thank you, this works.

