WDL "restartOn" executionPolicy not being used

I'm still having the issue described in https://community.dnanexus.com/s/question/0D5t0000048tMbUCAU/how-to-automatically-restart-jobs-upon-spotinstanceinterruption.

I am running WDL workflows and getting errors such as "The machine running the job was terminated by the cloud provider". I'm fairly confident I have the correct execution policy to restart jobs on any error, but it doesn't seem to have any effect:

    dx describe job-GP3ybZQJg8Jkpj3z5GQyf91v    # job that gives the error
    executionPolicy    {"maxRestarts": 5, "restartOn": {"*": 2}}    # has the following execution policy

Any help appreciated, many thanks,
Barney
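For readers who want to check the same thing programmatically, here is a minimal sketch. It assumes that `dx describe <job-id> --json` returns a JSON object with `executionPolicy` and `failureCounts` keys matching the fields in the plain-text output; the helper name `restart_summary` is hypothetical, and the sample values are copied from this thread.

```python
import json


def restart_summary(desc):
    """Summarise restart settings and recorded failures from a job description dict."""
    policy = desc.get("executionPolicy") or {}
    restart_on = policy.get("restartOn") or {}
    failures = desc.get("failureCounts") or {}
    return {
        # True when at least one failure reason is mapped to a restart count
        "restart_configured": bool(restart_on),
        "max_restarts": policy.get("maxRestarts"),
        # Total number of failures recorded across all failure reasons
        "failures_recorded": sum(failures.values()),
    }


# Assumed shape of the describe output, using the values quoted in this thread
sample = json.loads(
    '{"executionPolicy": {"maxRestarts": 5, "restartOn": {"*": 2}}, "failureCounts": {}}'
)
print(restart_summary(sample))
```

A result with `restart_configured` set to `True` and `failures_recorded` equal to 0 matches the situation in this question: a policy is attached to the job, but no failure has been counted against it.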

Comments


  • Ondrej Klempir, DNAnexus Team

    If you do dx describe job-GP3ybZQJg8Jkpj3z5GQyf91v, how many failureCounts do you see?

  • I'm replying for Barney because he's locked out of his account:

    $ dx describe job-GP3ybZQJg8Jkpj3z5GQyf91v | grep "failureCounts"

    failureCounts {}

  • Ondrej Klempir, DNAnexus Team

    Thanks. OK, so the job has not been restarted. Here are some more ideas.

    According to the dxCompiler ExpertOptions documentation (https://github.com/dnanexus/dxCompiler/blob/develop/doc/ExpertOptions.md#setting-dnanexus-specific-attributes-in-extrasjson), the dxCompiler equivalent of setting runtime options through the dxapp.json file of a DNAnexus applet is the extras file, which is specified with the -extras command-line option.

    You can study some examples in this section: https://github.com/dnanexus/dxCompiler/blob/develop/doc/ExpertOptions.md#default-and-per-task-attributes

  • Thanks, that second link was useful. I noticed "restartableEntryPoints", which I've set to "all" so that all entry points are restartable. Unfortunately the same error persists, with no restarts. Would I be able to get in contact with any DNAnexus staff familiar with scaling WDL workflows to >1K jobs? We're evaluating using large WDL workflows more broadly, but so far we have been unable to get them to work.

    Many thanks, Barney

    Current extras.json:

    {
      "defaultTaskDxAttributes": {
        "runSpec": {
          "restartableEntryPoints": "all",
          "timeoutPolicy": {
            "*": {
              "hours": 1
            }
          },
          "executionPolicy": {
            "restartOn": {
              "*": 3
            }
          },
          "systemRequirements": {
            "access": {
              "project": "CONTRIBUTE",
              "network": [
                "*"
              ]
            }
          }
        }
      }
    }

  • Chai Fungtammasan, DNAnexus Team

    You can contact our support team at ukbiobank-support@dnanexus.com. They can look into your project and see what is preventing the job from restarting.
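The extras.json posted in this thread places the restart policy at `defaultTaskDxAttributes.runSpec.executionPolicy.restartOn`. As a minimal sketch of a sanity check to run before compiling, the snippet below reads that path from a parsed extras file. The helper name `default_restart_on` is hypothetical, the key path is taken from the file in this thread rather than from a schema, and the check only confirms what the file contains, not that the platform honours it.

```python
import json


def default_restart_on(extras):
    """Return the restartOn map set for all tasks, or None if it is absent."""
    run_spec = extras.get("defaultTaskDxAttributes", {}).get("runSpec", {})
    return run_spec.get("executionPolicy", {}).get("restartOn")


# Trimmed copy of the extras.json from this thread
extras = json.loads("""
{
  "defaultTaskDxAttributes": {
    "runSpec": {
      "executionPolicy": {"restartOn": {"*": 3}}
    }
  }
}
""")
print(default_restart_on(extras))
```

To check a real file, load it with `json.load(open("extras.json"))` instead of the inline string. A return value of `None` would mean the policy is missing or nested at a different level from the one shown in this thread.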

