File paths in WDL JSON files
Hi, I am trying to run a liftover of array genotypes according to
https://github.com/dnanexus/UKB_RAP/tree/main/end_to_end_gwas_phewas/liftover_plink_beds_tmp
In the input parameter JSON file I have to make an array of all 22 bed, bim and fam files. I could use file IDs like "dx://project-xxxx:file-xxxx", but copying each of the 66 IDs from the platform and pasting it into the JSON file is tedious and error-prone. Is there a way to access these file IDs programmatically?
My current approach is to use file paths, e.g.
"dx://project-xxxx:/Bulk/Genotype Results/Genotype calls/ukb22418_c2_b0_v2.bed"
but dxCompiler gives me an error:
java.lang.Exception: URI contains invalid character: dx://project-GBpqxg0JXPfF3kKG4j14k75k:/Bulk/Genotype Results/Genotype calls/ukb22418_c1_b0_v2.bed
I speculated that this might have something to do with the whitespace in the directory names (!) and tried escaping the spaces with '\', but to no avail. This time it says:
[error] Error translating inputs
spray.json.JsonParser$ParsingException: Unexpected character ' ' at input index 171 (line 5, position 56), expected JSON escape sequence:
?"dx://project-xxxx:/Bulk/Genotype\ Results/Genotype\ calls/ukb22418_c2_b0_v2.bed"
Any idea how to solve this?
Thanks!
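A note on the second error: it appears to come from the JSON parser rather than from any path lookup, because backslash-space is not a valid JSON escape sequence. The short Python sketch below illustrates this with the standard json module (an assumption here is that spray-json follows the same JSON grammar, which the error message suggests). It only explains the second error; it does not address the "invalid character" error raised for the unescaped path.

```python
import json

# Backslash followed by a space is not a JSON escape sequence, so a
# conforming parser rejects this text before the path is ever examined.
escaped = r'{"f": "dx://project-xxxx:/Bulk/Genotype\ Results/Genotype\ calls/ukb22418_c2_b0_v2.bed"}'
try:
    json.loads(escaped)
    escaped_ok = True
except json.JSONDecodeError:
    escaped_ok = False

# A plain space needs no escaping inside a JSON string, so this parses.
plain = '{"f": "dx://project-xxxx:/Bulk/Genotype Results/Genotype calls/ukb22418_c2_b0_v2.bed"}'
plain_ok = json.loads(plain)["f"].endswith("Genotype calls/ukb22418_c2_b0_v2.bed")

print("escaped parses:", escaped_ok)
print("plain parses:", plain_ok)
```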
Comments
I have what I think is the same question[1]. No answer so far, but since multiple DNAnexus blobfiles can resolve to the same human-readable name by design[2], my guess is that what you and I are requesting is not possible within their current system.
1 = https://community.dnanexus.com/s/question/0D582000000LGlWCAW/specify-a-normal-filename-filepath-for-input-json-using-wdl
2 = https://community.dnanexus.com/s/question/0D582000000Lpb6CAC/how-to-overwrite-preexisting-file-with-the-same-name-and-path-when-using-dx-upload
Thanks for the answer. It does work, however, in this example:
https://github.com/dnanexus/dxCompiler/tree/develop/contrib/beginner_example
I have a bam file in one of my directories, and
{
"bam_chrom_counter.bam": "dx://project-xxxx:/path/to/NA12878.bam"
}
does work.
Although maybe "dx://project-xxxx:/path/to/NA12878.bam" can be found only because it is in a directory I created myself?
I haven't found a solution to my problem, so maybe, James, your conjecture is right and it cannot be done on the system. As a workaround, to avoid the error-prone process of clicking on 66 links on the web page and copying and pasting the file IDs, I used the dx-toolkit, specifically
dx describe /path/*.bed | grep ID > bed-ids.txt
to get the IDs in text format.
Cheers,
Georg
For this and similar problems, I usually use 'dx find data --name "*.xyz" --brief' to get properly formatted file IDs, and then feed this list of IDs to a custom bash one-liner/script (a JSON builder) that formats it into the desired JSON.
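As a rough illustration of such a JSON builder, here is a minimal Python sketch rather than a bash one-liner. It assumes each input line contains a file ID, either as "project-xxxx:file-xxxx" (the brief output mentioned above) or as a bare "file-xxxx" (for example a line kept by the grep on dx describe output). The function names, the input key "my_workflow.bed_files" and the placeholder IDs are hypothetical and should be replaced with the names your workflow and project actually use.

```python
import json
import re

# Assumption: IDs look like "file-" / "project-" followed by alphanumerics.
FILE_ID = re.compile(r"file-[0-9A-Za-z]+")
PROJECT_ID = re.compile(r"project-[0-9A-Za-z]+")


def to_dx_uris(lines, default_project=None):
    """Turn lines containing file IDs into dx://project:file URIs."""
    uris = []
    for line in lines:
        file_match = FILE_ID.search(line)
        if file_match is None:
            continue  # skip blank lines and lines without a file ID
        project_match = PROJECT_ID.search(line)
        project = project_match.group(0) if project_match else default_project
        if project is None:
            raise ValueError(f"no project ID for line: {line!r}")
        uris.append(f"dx://{project}:{file_match.group(0)}")
    return uris


def build_inputs(key, lines, default_project=None):
    """Return a JSON document mapping one (hypothetical) input key to the URIs."""
    return json.dumps({key: to_dx_uris(lines, default_project)}, indent=2)


# Placeholder IDs for illustration only; real ones come from the dx commands.
example_lines = ["project-AAAA:file-BBBB", "project-AAAA:file-CCCC"]
print(build_inputs("my_workflow.bed_files", example_lines))
```

Note that the brief output carries no file names, so the order of the IDs is whatever the listing returned. If the bed, bim and fam arrays must line up by chromosome, the order needs to be checked or fixed separately.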