File paths in WDL JSON files

23 March 2023 00:00
5 comments

Hi, I am trying to run a liftover of array genotypes according to

https://github.com/dnanexus/UKB_RAP/tree/main/end_to_end_gwas_phewas/liftover_plink_beds_tmp

In the input parameter JSON file I have to make an array of all 22 bed, bim and fam files. I could use the file IDs like "dx://project-xxxx:file-xxxx", but copying each of the 66 IDs from the platform and pasting it into the JSON file is tedious and error-prone. Is there a way to access these file IDs programmatically?

My current approach is to use file paths, e.g.

"dx://project-xxxx:/Bulk/Genotype Results/Genotype calls/ukb22418_c2_b0_v2.bed"

but dxCompiler gives me an error:

java.lang.Exception: URI contains invalid character: dx://project-GBpqxg0JXPfF3kKG4j14k75k:/Bulk/Genotype Results/Genotype calls/ukb22418_c1_b0_v2.bed

I speculated that this might have something to do with the whitespace in the directory names (!) and tried escaping the spaces with '\', but to no avail. This time it says:

[error] Error translating inputs

spray.json.JsonParser$ParsingException: Unexpected character ' ' at input index 171 (line 5, position 56), expected JSON escape sequence:

?"dx://project-xxxx:/Bulk/Genotype\ Results/Genotype\ calls/ukb22418_c2_b0_v2.bed"
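The second error comes from the JSON layer rather than from the path lookup: in JSON, a backslash may only be followed by one of a fixed set of escape characters, and a space is not one of them. A minimal Python sketch of the same behaviour, using the standard `json` module instead of the spray-json parser named in the error (the helper `parses` is made up for illustration):

```python
import json

# Backslash followed by a space is not a valid JSON escape sequence.
escaped = r'"dx://project-xxxx:/Bulk/Genotype\ Results/Genotype\ calls/ukb22418_c2_b0_v2.bed"'
# A plain space inside a JSON string is valid JSON.
plain = '"dx://project-xxxx:/Bulk/Genotype Results/Genotype calls/ukb22418_c2_b0_v2.bed"'


def parses(text):
    """Return True if text is valid JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    print("escaped:", parses(escaped))
    print("plain:  ", parses(plain))
```

This only shows that the unescaped form is the valid JSON one. It says nothing about the separate "URI contains invalid character" error, which is raised after the JSON has been parsed.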

Any idea how to solve this?

Thanks!

Comments

Former User of DNAx Community_55
- 23 March 2023 20:02
I have what I think is the same question[1]. No answer so far, but since multiple DNAnexus files can resolve to the same human-readable name by design[2], my guess is that what you and I are requesting is not possible within their current system.

1 = https://community.dnanexus.com/s/question/0D582000000LGlWCAW/specify-a-normal-filename-filepath-for-input-json-using-wdl

2 = https://community.dnanexus.com/s/question/0D582000000Lpb6CAC/how-to-overwrite-preexisting-file-with-the-same-name-and-path-when-using-dx-upload

Former User of DNAx Community_62
- 23 March 2023 20:07
Thanks for the answer. It does work, however, in this example:

https://github.com/dnanexus/dxCompiler/tree/develop/contrib/beginner_example

I have a bam file in one of my directories, and

{
"bam_chrom_counter.bam": "dx://project-xxxx:/path/to/NA12878.bam"
}

does work.

Former User of DNAx Community_62
- 23 March 2023 20:12
Although maybe "dx://project-xxxx:/path/to/NA12878.bam" can be found because it is in a directory I created myself?

Former User of DNAx Community_62
- 24 March 2023 13:25
I haven't found a solution to my problem, so maybe, James, your conjecture is right and it cannot be done on the system. As a workaround, to avoid the error-prone process of clicking on 66 links on the web page and copying and pasting the file IDs, I used the dx-toolkit, specifically
dx describe /path/*.bed | grep ID > bed-ids.txt
to get the IDs in text format.
Cheers,
Georg

Ondrej Klempir DNAnexus Team
- 27 March 2023 11:52
For this and similar problems, I usually use `dx find data --name "*.xyz" --brief` to get properly formatted file IDs, and then feed this list of IDs into a custom bash one-liner or script (a JSON builder) that formats it into the desired JSON.
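A JSON builder of this kind could look roughly like the following Python sketch. It assumes you already have, for each file, its name, project ID and file ID (how you collect them is left open), and that the chromosome number can be read from the `_c<N>_` part of names like `ukb22418_c2_b0_v2.bed`. The function names, the `my_workflow` prefix and the `bed_files`/`bim_files`/`fam_files` input names are placeholders, not the real input names of the liftover workflow.

```python
import json
import re


def chrom_number(name):
    """Extract the chromosome number from a name like ukb22418_c2_b0_v2.bed."""
    match = re.search(r"_c(\d+)_", name)
    if match is None:
        raise ValueError(f"no chromosome number found in {name!r}")
    return int(match.group(1))


def build_inputs(records, key_prefix="my_workflow"):
    """Build a WDL inputs dict from (file_name, project_id, file_id) tuples.

    Files are grouped by extension and sorted numerically by chromosome,
    so that the bed, bim and fam arrays line up with each other.
    The input key names are hypothetical placeholders.
    """
    by_ext = {"bed": [], "bim": [], "fam": []}
    for name, project_id, file_id in records:
        ext = name.rsplit(".", 1)[-1]
        if ext in by_ext:
            uri = f"dx://{project_id}:{file_id}"
            by_ext[ext].append((chrom_number(name), uri))
    return {
        f"{key_prefix}.{ext}_files": [uri for _, uri in sorted(pairs)]
        for ext, pairs in by_ext.items()
    }


if __name__ == "__main__":
    # Placeholder records; replace with the real names and IDs.
    example = [
        ("ukb22418_c2_b0_v2.bed", "project-xxxx", "file-xxxx"),
        ("ukb22418_c2_b0_v2.bim", "project-xxxx", "file-yyyy"),
        ("ukb22418_c2_b0_v2.fam", "project-xxxx", "file-zzzz"),
    ]
    print(json.dumps(build_inputs(example), indent=2))
```

Sorting on the parsed chromosome number rather than on the file name avoids the usual trap where `c10` sorts before `c2` as text.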
