Genotyping: merging .bed, .bim, and .fam files from array data

Edited 07 July 2024 07:32
2 comments

Hi,

I am having trouble using the command-line interface to merge the binary files from the array data (Genotype calls). I used the following code:

#!/bin/sh

run_merge="cp /mnt/project/Bulk/Genotype\ Results/Genotype\ calls/ukb22418_c{1..22}_b0_v2* .; ls *.bed | sed -e 's/.bed//g' > files_tp_merge.txt; \
plink --merge-list files_to_merge.txt --make-bed --autosome-xy --out ukb22418_c1_c22_v2_merged; rm files_to_merge.txt;"

dx run swiss-army-knife -iin="/Data/main_wes_450k.phe" -icmd="${run_merge}" --tag="Step1" --instance-type "mem1_ssd1_v2_x16" --destination="/Data/" --brief --yes

When I run the code above in Git Bash, I always get the following ResolutionError:

dxpy.utils.resolver.ResolutionError: Could not find a project named “C”

I typed the code directly into Git Bash, and I also tried saving it as a shell script in a directory on my PATH and calling ‘sh partB-merge-files-dxfuse.sh’, but I get the same error message either way.

I am not sure where I made an error.

Comments


Rachael W UKB Community team Data Analyst
- 07 July 2024 14:51
Hi, I am not a geneticist, so this might not be relevant, but I notice a typo in the file name:
files_tp_merge.txt, which I think should be files_to_merge.txt.
If you run just the first part, can you see a resulting .txt file in your main project storage or in your instance storage?
If you can follow Python code, this page might help: https://github.com/dnanexus/dx-toolkit/blob/master/src/python/dxpy/utils/resolver.py
Could you try specifying the project path directly, instead of using /mnt/project/?
Are you following a particular tutorial?
Are you using a RAP instance JupyterLab terminal or commands on your own computer? UK Biobank has recently announced that all UKB data must be used within the RAP, and not downloaded. See https://community.ukbiobank.ac.uk/hc/en-gb/community/posts/19996604847133-Changing-the-way-UK-Biobank-data-is-made-available-to-researchers-around-the-world and https://community.ukbiobank.ac.uk/hc/en-gb/categories/19950747200413-UK-Biobank-Research-Analysis-Platform-UKB-RAP The genotyping data in 22418 used to be downloadable, but that may have changed recently. (The WES data has been RAP-only for a long time.)
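
For reference, the list-building step with a consistent file name could look roughly like the sketch below. It keeps the list name in a single variable so the two spellings cannot drift apart, and it strips only a trailing ".bed" (the unescaped dot in the original sed pattern matches any character). The names write_merge_list and merge_list are made up for illustration and are not part of plink or the dx toolkit.

```shell
#!/bin/sh
# Write one PLINK file prefix per line for every .bed file in a directory.
# $1 = directory holding the .bed/.bim/.fam files, $2 = output list file.
# (write_merge_list is a hypothetical helper, named here for illustration.)
write_merge_list() {
    for bed in "$1"/*.bed; do
        [ -e "$bed" ] || continue      # no .bed files: leave the list empty
        printf '%s\n' "${bed%.bed}"    # strip only the trailing .bed suffix
    done > "$2"
}

merge_list="files_to_merge.txt"        # defined once, reused everywhere
write_merge_list . "$merge_list"
cat "$merge_list"                      # check the prefixes before merging
```

The same "$merge_list" variable would then be passed to plink --merge-list and to rm, so a misspelt name cannot slip in at one of the three places.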

Tamrat Befekadu Abebe
- 08 July 2024 08:02
Dear Rachael W,
Thank you so much for the help. I noticed the typo in files_to_merge after I posted my query. Even after fixing it, I could not get the output I wanted. However, as you suggested, it worked once I specified the project path directly.
I used a JupyterLab terminal to build the phenotype data I needed for my study. I also use the command-line interface (on my own computer) to work on genotype data such as 22418, following the tutorials DNAnexus provides. For the time being, they still allow you to download phenotype data built with Spark JupyterLab to your local machine.
Thank you also for the heads up on data access conditions.
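
For anyone hitting the same error, one reading of "specifying the project path directly" is to prefix each platform path with the project, as in the sketch below. The value project-xxxx is a hypothetical placeholder, and run_merge is a stand-in for the command string from the original post. A possible explanation for the error, not confirmed in this thread, is that Git Bash rewrites values that look like absolute POSIX paths into Windows paths beginning with a drive letter such as C:, which dx would then read as a project name.

```shell
#!/bin/sh
# Hypothetical placeholder: replace with your own project ID or project name.
project="project-xxxx"

# Stand-in for the merge command string (run_merge) from the original post.
run_merge="echo placeholder"

# Qualify every platform path with "<project>:" so that no path value begins with "/".
in_file="${project}:/Data/main_wes_450k.phe"
dest="${project}:/Data/"

# Dry run: print the command for review. Drop the leading "echo" to submit the job,
# which assumes the dx toolkit is installed and you are logged in.
echo dx run swiss-army-knife \
    -iin="${in_file}" \
    -icmd="${run_merge}" \
    --tag="Step1" \
    --instance-type "mem1_ssd1_v2_x16" \
    --destination="${dest}" \
    --brief --yes
```

Git for Windows also honours MSYS_NO_PATHCONV=1 to switch off its path conversion, although that route was not tested in this thread.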

