Hello, 1) Are the PLINK files for the WGS data going to be available soon? 2) I need help running batch jobs on SAK.

I am running the Swiss Army Knife using the GUI. I need to convert a list of files from vcf to plink so I am running the analysis in batch. This is the command I am using:

for file in ukb23352_c21_*vcf.gz; do plink --vcf $file --make-bed --allow-no-vars --extract range /mnt/project/Bulk/Whole genome sequences/Whole genome GraphTyper joint call pVCF/Plink_files/ Markers_recode_Positions.txt.  --out ${file%_*}; done'

I get this error:

Error: --extract accepts at most 2 parameters.

For more information, try "plink --help <flag name>" or "plink --help | more".

 

I tried to point to the file in different ways, but I would get the following error:

Error: Failed to open Markers_recode_Positions.txt.

I need to be able to use that file to extract the SNPs from, but I don't seem to be able to add it as an input file when batch is on (I tried adding it together with the files I need to run the analysis on, but it still doesn't get found).

If I don't use the batch parameter, I am able to add the file, and it can be used in the for loop without adding the path, as I can include it as an input. However, if I don't run batch jobs, the whole analysis takes too long.

 

Thanks so much

Comments

4 comments

  • Comment author
    Rachael W, UKB Community team, Data Analyst

    Hi Marianna, I asked our bioinformatician about your SAK question, and he suggests:

     

    This path, /mnt/project/Bulk/Whole genome sequences/Whole genome GraphTyper joint call pVCF/Plink_files/ Markers_recode_Positions.txt, needs to be changed to " /mnt/project/Bulk/Whole genome sequences/Whole genome GraphTyper joint call pVCF/Plink_files/ Markers_recode_Positions.txt" (wrapped in double quotes, because it contains spaces).
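
    To illustrate why the quotes matter, here is a minimal sketch that only counts arguments and does not call plink. It assumes a bash-like shell; `count_args` is a hypothetical helper, and the path is written without a space before the file name, which is an assumption about the real location.

```shell
#!/bin/bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Assumed path (no space before the file name); adjust to the real location.
path="/mnt/project/Bulk/Whole genome sequences/Whole genome GraphTyper joint call pVCF/Plink_files/Markers_recode_Positions.txt"

# Unquoted: the shell splits the path at every space, so the program
# sees many separate words after --extract.
unquoted=$(count_args --extract range $path)

# Quoted: the whole path is passed as a single argument.
quoted=$(count_args --extract range "$path")

echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted arguments"
```

    With the unquoted form the program receives far more than the two parameters that --extract accepts, which is consistent with the first error message.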

     

    Hope that helps!

  • Thanks.

     

    I tried using the quotes, but I got the second error: Failed to open Markers_recode_Positions.txt.

    To be clear, is there a space between the opening " and /mnt? I tried without a space and it did not work.

    However, I will give it another go in case I missed something.

  • Comment author
    Chai Fungtammasan DNAnexus Team

    There should be no space. Did you try this in the UI or the CLI? It might be an escape-character issue. Trying this in the UI might be easier.

  • Yes, I have used the user interface.

    I need to run a batch job on each "ukb.....vcf.gz" file for a chromosome. I can easily upload these files and run a vcf-plink conversion on each of them, in batch.

    However, part of this analysis is to simultaneously extract only the SNPs of interest listed in Markers_recode_Positions.txt. So I do not need to run a batch job for Markers_recode_Positions.txt itself, but it needs to be accessible to the analyses on the other input files. I could not find a way to select it as an input file to be used this way...
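
    For reference, this is a sketch of the loop I am trying to get working, with the marker file passed as one quoted path and checked before any conversion starts. It is untested on the platform: `check_markers` and `convert_all` are hypothetical names, and whether the /mnt/project path is readable from inside a batch run is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: fail early with a clear message if the marker
# file cannot be read, instead of failing inside every plink call.
check_markers() {
    if [ ! -r "$1" ]; then
        echo "cannot read marker file: $1" >&2
        return 1
    fi
}

# Hypothetical wrapper around the original loop. $1 is the marker file.
convert_all() {
    check_markers "$1" || return 1
    for file in ukb23352_c21_*vcf.gz; do
        # Skip the literal pattern when no file matches.
        [ -e "$file" ] || continue
        # Quote every expansion so paths with spaces stay one argument.
        plink --vcf "$file" --make-bed --allow-no-vars \
              --extract range "$1" --out "${file%_*}"
    done
}
```

    It would be called as `convert_all "/path/to/Markers_recode_Positions.txt"`, with the full path in a single pair of double quotes and no space after the opening quote.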
