WDL workflow and bash/R outputs in DNAnexus?

Permanently deleted user

I have a WDL workflow with a task that generates Manhattan plots using an R script in a Docker container.

```
# QQ and Manhattan plots
task Plots {
  input {
    Array[String] phenotype_names # Format: "<phenotype_name>.regenie". Names of the files produced in join_Output.
    Array[Int] chr_list
    Array[File] file_input
    String folder_name

    # Runtime
    String docker
  }

  Float regenie_files_size = size(file_input, "GiB")

  # Plots are produced for each phenotype.
  # For each phenotype, a file containing all of the hits from Step 2 is output.
  # For each phenotype, a file containing a subset of all of the hits where "-LOG10P > 1.3" from Step 2 is output.
  command <<<
    set -euo pipefail
    for file in ~{sep=' ' file_input}; do \
      awk '$12 > 1.3' $file >> ${file%.regenie}_subset.regenie; \
      chmod 777 * \
      mv ${file%.regenie}_subset.regenie .; \
      mv $file .; \
    done
    R --no-save --args ~{folder_name} ~{sep=' ' file_input} <<RSCRIPT
    library('data.table')
    library('qqman')
    args <- commandArgs(trailingOnly = TRUE)
    # This indicates where the plots are going to be stored
    output_dir <- args[1]
    print(output_dir)
    # Now we provide the filenames
    file_paths <- args[2:(length(args))]
    print(file_paths)
    for (file in file_paths) {
      print(file)
      regenie_output <- fread(file)
      regenie_ADD_subset <-subset.data.frame(regenie_output, TEST=="ADD")
      regenie_ADD_subset[,"CHROM"] <-as.numeric(unlist(regenie_ADD_subset[,"CHROM"]))
      regenie_ADD_subset[,"LOG10P"] <-as.numeric(unlist(regenie_ADD_subset[,"LOG10P"]))
      regenie_ADD_subset[,"GENPOS"] <-as.numeric(unlist(regenie_ADD_subset[,"GENPOS"]))
      qq_plot = substr(file,1,nchar(file)-8)
      qq_plot = paste0(output_dir,"/", qq_plot,"_", "qqplot.png")
      print(qq_plot)
      png(qq_plot, width = 6, height = 4, unit='in', res=300)
      p = 10 ^ (-1 * (as.numeric(unlist(regenie_ADD_subset[,"LOG10P"]))))
      print(qq(p))
      dev.off()
      manhattan_plot = substr(file,1,nchar(file)-8)
      manhattan_plot = paste0(output_dir,"/",manhattan_plot,"_", "manhattan.png")
      print(manhattan_plot)
      png(manhattan_plot, width = 6, height = 4, unit='in', res=300)
      print(manhattan(regenie_ADD_subset, chr="CHROM", bp="GENPOS", snp="ID", p="LOG10P", logp=FALSE, annotatePval = 5E-8))
      dev.off()
    }
    RSCRIPT
  >>>

  output {
    Array[File] output_plots = glob("*.png")
    Array[File] output_regenie = glob("*.regenie")
  }

  runtime {
    dx_instance_type: dx_instance_type
    docker: docker
    dx_timeout: '24H'
  }

  parameter_meta {
    file_input: {
      help: "The join files from the previous task",
      patterns: [".regenie"],
      stream: true,
      localization_optional:true
    }
  }
}
```

 

However, I'm getting an error when I run it on the RAP, stating that the files from this step

    set -euo pipefail
    for file in ~{sep=' ' file_input}; do \
      awk '$12 > 1.3' $file >> ${file%.regenie}_subset.regenie; \
      mv ${file%.regenie}_subset.regenie .; \
      mv $file .; \
    done

are read-only:

    [error] failure executing Task action 'run'
    java.lang.Exception: job script function run_command exited with permanent fail code 1
    /home/dnanexus/meta/commandScript: line 6: /home/dnanexus/mnt/inputbe9e832f-bbef-4034-adf7-84c0371da0b0/filename_subset.regenie: Read-only file system

 

How can I give DNAnexus access to this file so that the workflow runs?

Any help is appreciated.

Thank you


Comments

2 comments

  • Rachael W (UKB Community team, Data Analyst)

    Would it work to copy the files instead of moving them?

  • Ondrej Klempir (DNAnexus Team)

    My idea:

    I see "stream: true" in the parameter_meta spec for the input files. Mounting/streaming files is typically done via dxfuse, and that is indeed a read-only file system. In the part with the mv etc. operations, I would avoid writing to paths with a prefix like "/home/dnanexus/mnt/input......." and instead try some other hardcoded prefix, then test whether that works.
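
The two suggestions above can be combined in a minimal sketch. It assumes, as the error message suggests, that each input path points into a read-only mount, so `${file%.regenie}_subset.regenie` resolves to a location that cannot be written. Deriving the name with `basename` puts the output in the task's working directory instead, and `cp` replaces `mv` because a file cannot be removed from a read-only mount. The helper name `subset_hits` is hypothetical and not part of the original workflow, and the sketch has not been run on the RAP.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write the filtered subset and a copy of the input
# into the current (writable) working directory, not next to the input file.
subset_hits() {
  local file="$1"
  local name
  # Strip the directory part and the .regenie suffix from the input path.
  name="$(basename "$file" .regenie)"

  # Same filter as in the question: keep rows whose 12th column exceeds 1.3.
  awk '$12 > 1.3' "$file" > "./${name}_subset.regenie"

  # Copy instead of move: the source directory may be read-only.
  cp "$file" "./${name}.regenie"
}
```

Inside the task's command block, the loop would then become `for file in ~{sep=' ' file_input}; do subset_hits "$file"; done`. The R script would need the same treatment, since it also builds its output paths from the full input path.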
