WDL: what is the correct way to specify a Docker container? (The entity container-1234 could not be found)

Former User of DNAx Community_46

16 February 2023 00:00
10 comments

I'm trying to extract variants from a VCF file. My WDL task looks like this:

```
task SEARCH_VARIANTS {
  input {
    File vcf
    File bed
  }
  command <<<
    set -e
    set -x
    # Print one tab-separated line per sample for the regions listed in the BED file
    bcftools query --regions-file '~{bed}' -f '[%CHROM\t%POS\t%END\t%INFO/SVTYPE\t%INFO/SVLEN\t%FILTER\t%SAMPLE\t%GT\n]' '~{vcf}' > output.tsv
  >>>
  output {
    File variants = "output.tsv"
  }
  runtime {
    cpu: 1
    memory: "1GB"
    docker: "quay.io/biocontainers/bcftools:1.16--hfe4b78e_1"
  }
}
```

But every time the workflow enters the following scatter loop:

```
scatter (vcf in SPLIT_FILE.each_vcf) {
  call SEARCH_VARIANTS {
    input:
      vcf = vcf,
      bed = bed
  }
}
```

I get the following error. I think it's related to the Docker container?

> An input or bundled dependency could not be cloned into the project: ResourceNotFound: Error while cloning objects [file-GPXPz48JG7JFb0V0q10kYzXz]: The entity container-GPfyjJQJ2JpYxGYFb84Jvfg5 could not be found

Project: UKBB (project-1234), Executable: workflow FindSV (workflow-1234), Analysis: FINDSV (analysis-1234)

Error propagated from:

Analysis: FINDSV (analysis-1234), Stage: scatter (vcf in SPLIT_FILE.each_vcf) (stage-3), Executable: applet FindSV_frag_stage-2 (applet-1234), Job: scatter (vcf in SPLIT_FILE.each_vcf) (job-1234), Error type: InputError

What does it mean? How can I fix this?

Comments


Ondrej Klempir DNAnexus Team
- 16 February 2023 10:50
For Docker, I prefer to save the image to a tar file first and then use the following option:
https://github.com/dnanexus/dxCompiler/blob/develop/doc/ExpertOptions.md#storing-a-docker-image-as-a-file

In this specific case, I am wondering whether the issue is the Docker spec... Try running "dx describe" on the file file-GPXPz48JG7JFb0V0q10kYzXz. Besides bcftools, can you run other commands on the files, such as head?

Former User of DNAx Community_46
- 16 February 2023 12:11
@Ondrej Klempir Thanks, I created the Docker image. bcftools works on my local computer:

```
$ docker run f0f39c498d87 bcftools --version
bcftools 1.13
Using htslib 1.13+ds
```

I saved the Docker image as a tar file, uploaded it, and used the file ID with the following syntax:

```
runtime {
  cpu: 1
  memory: "1GB"
  docker: "dx://file-12345"
}
```

But I still get the following error:

```
An input or bundled dependency could not be cloned into the project: ResourceNotFound: Error while cloning objects [file-GPXPz48JG7JFb0V0q10kYzXz]: The entity container-GPfyjJQJ2JpYxGYFb84Jvfg5 could not be found
```

There is no log for the task itself ("Analyses have not started, no logs present.").

Ondrej Klempir DNAnexus Team
- 16 February 2023 12:44
OK, so I think the issue is related to the file file-GPXPz48JG7JFb0V0q10kYzXz itself. It seems to me that the file cannot be downloaded or accessed. Is this file included in the exported VCFs list?

Former User of DNAx Community_46
- 16 February 2023 13:07
@Ondrej Klempir So is it an error on my side? How can I debug this?

FYI, I put my workflow in a gist, with the JSON params masked: https://gist.github.com/lindenb/be95478c8fdc8ca7c74339d413a484be

Former User of DNAx Community_46
- 16 February 2023 13:34
I changed my params.json. This is the diff:
```
"stage-common.bed":{
       "$dnanexus_link":{
               "project":"project-1234",
-              "_path":"Pierre/456.bed",
               "id":"file-7890"
               }
       },
```

And now the error has changed! Why would an extra key in a JSON object affect anything?

```
job script function run_command exited with permanent fail code 255
aaaaaaaaa.list
+ read F
+ bcftools query --regions-file home dnanexus inputs input8250745273061749280 20230209.bed -f [%CHROM t%POS t%END t%INFO SVTYPE t%INFO SVLEN t%FILTER t%SAMPLE t%GT n] project-GP3pv5jJG7J84QX723JbjPF9:file-G2bzJZ8JkF6PX01z7gxkFk2F
[E::hts_open_format] Failed to open file project-GP3pv5jJG7J84QX723JbjPF9:file-G2bzJZ8JkF6PX01z7gxkFk2F : No such file or directory
Failed to read from project-GP3pv5jJG7J84QX723JbjPF9:file-G2bzJZ8JkF6PX01z7gxkFk2F: No such file or directory
```

Now it looks like the input BED file is not being used correctly (?):

```
--regions-file home dnanexus inputs input8250745273061749280 20230209.bed
```

And it seems that a list of full paths saved in a file cannot be used (?)
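The trace suggests that bcftools received the literal string `project-...:file-...` read from the list file, which is a platform identifier rather than a path on the worker. One possible way around this, assuming the VCFs are declared as `File` (or `Array[File]`) inputs so that the platform localizes them, is to turn each list entry into a file link in the input JSON. A minimal Python sketch follows; `links_from_list` and the input key `stage-common.vcfs` are hypothetical names, not taken from the workflow:

```python
import json


def links_from_list(text: str) -> list:
    """Convert lines of 'project-xxxx:file-yyyy' into $dnanexus_link mappings."""
    links = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        project_id, sep, file_id = line.partition(":")
        if not sep or not project_id.startswith("project-") or not file_id.startswith("file-"):
            raise ValueError(f"unexpected entry: {line!r}")
        links.append({"$dnanexus_link": {"project": project_id, "id": file_id}})
    return links


# The single identifier visible in the error log of this comment.
listing = "project-GP3pv5jJG7J84QX723JbjPF9:file-G2bzJZ8JkF6PX01z7gxkFk2F\n"

# "stage-common.vcfs" is a hypothetical key for an Array[File] input.
print(json.dumps({"stage-common.vcfs": links_from_list(listing)}, indent=2))
```

Entries that are not of the form `project-...:file-...` raise a `ValueError`, so a stray local path in the list is caught before the job is launched.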

Ondrej Klempir DNAnexus Team
- 16 February 2023 14:00
Instead of "$dnanexus_link":

A) What if you specify it as in this example: https://github.com/dnanexus/dxCompiler/blob/develop/contrib/beginner_example/bam_chrom_counter_input.json ?

https://github.com/dnanexus/dxCompiler/blob/develop/contrib/beginner_example/bam_chrom_counter.wdl is the WDL code corresponding to the JSON above. Could you review the syntax in <<<CODE SECTION>>> and compare?

B) What if you hardcode the input instead of providing it in the JSON?

C) Alternatively, what happens if you run the compiled workflow via the GUI and specify the BED input via the graphical input field?

At this stage of your development, it's probably best to send a request to ukbiobank-support@dnanexus.com, since the support team can inspect your project and log files.

Former User of DNAx Community_46
- 16 February 2023 14:17
Changing the JSON object to a plain string produces the following error (dx version is dx v0.338.1):

```
dxpy.exceptions.InvalidInput: i/o value bed needs to be given using DNAnexus links, code 422. Request Time=1676558620.165655, Request ID=1676556929825-975130
Details: {
   "field": "bed",
   "reason": "malformedLink",
   "expected": "not a mapping"
}
```
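As far as I understand it, the inputs file handed to dxCompiler at compile time (as in the linked beginner example) uses plain `dx://project-xxxx:file-yyyy` strings, whereas a JSON passed directly to `dx run` needs the `$dnanexus_link` mapping, which would explain why a plain string is rejected here. A minimal Python sketch of converting one form into the other; `to_dnanexus_link` is a hypothetical helper, not part of dxpy, and the IDs are the masked placeholders from this thread:

```python
import json
import re

# Matches "dx://project-xxxx:file-yyyy"; the ID alphabet is kept loose on purpose
# so that masked placeholders such as "file-7890" are accepted too.
_DX_URI = re.compile(r"^dx://(project-[A-Za-z0-9]+):(file-[A-Za-z0-9]+)$")


def to_dnanexus_link(uri: str) -> dict:
    """Turn a dx:// URI into a {"$dnanexus_link": ...} mapping (hypothetical helper)."""
    match = _DX_URI.match(uri)
    if match is None:
        raise ValueError(f"not a dx://project-...:file-... URI: {uri!r}")
    project_id, file_id = match.groups()
    return {"$dnanexus_link": {"project": project_id, "id": file_id}}


# Key name taken from the diff earlier in the thread; IDs are masked placeholders.
inputs = {"stage-common.bed": to_dnanexus_link("dx://project-1234:file-7890")}
print(json.dumps(inputs, indent=2))
```

The resulting mapping has the same shape as the params.json entry shown in the diff above, without the extra `_path` key.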

Former User of DNAx Community_46
- 16 February 2023 14:24
I switched the version back from 1.1 to 1.0 and got the same error: dxpy.exceptions.InvalidInput: i/o value bed needs to be given using DNAnexus links.

Ondrej Klempir DNAnexus Team
- 16 February 2023 14:30
Good to know, thanks. So if the BED file is the problem, could you run bcftools without it, as a test run?

Former User of DNAx Community_46
- 22 February 2023 08:22
The following code worked:

```
runtime {
  cpu: 1
  memory: "1GB"
  docker: "quay.io/biocontainers/bcftools:1.16--hfe4b78e_1"
}
```

(Thanks to B Slavik / UKBB support.)
