When connected to a job via SSH, is there any way to manually send a "Done" signal instead of terminating it with dx terminate?

I'm facing a strange issue where some of my jobs tend to get stuck on a worker without reporting any error. I've already asked support about this, but while they are helping me figure it out, I still need a way to get my results as soon as possible. So I tried generating the missing results manually inside the worker, directly over an SSH connection. Now I'm in the worker, with all the necessary outputs generated and ready to be uploaded. Once the upload is done, is there a way to tell the worker that the job is done WITHOUT using dx terminate? Thanks for your help!

Comments

5 comments

  • Comment author
    Ondrej Klempir DNAnexus Team

    Can you automate this process by submitting the upload commands plus dx terminate as a bash script/subprocess, so that dx terminate waits until the upload part has finished?

    For example, I tried the following sequence of commands as a one-liner:

    dx upload file1.txt; dx upload file2.txt; dx terminate $DX_JOB_ID

    And it worked as I expected.
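
A slightly more defensive variant of the one-liner above, offered as a sketch rather than a recommended procedure: with `;` the terminate step runs even if an upload fails, so the version below stops at the first failed upload. The function name `upload_then_terminate` is hypothetical; `DX_JOB_ID` is taken from the one-liner, and the `--wait` flag is assumed to make `dx upload` block until the file has finished closing (worth checking against `dx upload --help`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: upload every file given as an argument, then
# terminate the current job only if all uploads succeeded.
upload_then_terminate() {
    local f
    for f in "$@"; do
        # Assumption: --wait blocks until the uploaded file is closed.
        dx upload --wait "$f" || { echo "upload failed: $f" >&2; return 1; }
    done
    # Reached only when every upload above returned success.
    dx terminate "$DX_JOB_ID"
}

# Example call from inside the worker:
#   upload_then_terminate file1.txt file2.txt
```

Note that this still ends the job through dx terminate, so the job is reported as terminated rather than done.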

  • Hello,

    The issue is not about uploading the outputs, it's more about the "terminate" signal itself.

    When I terminate a job, the parent job is automatically marked as failed, and I would like to avoid this.

  • Comment author
    Chai Fungtammasan DNAnexus Team

    I think this issue would be better resolved by the support team, because they can see the structure of your analysis and the log files. We have the same access level as other community members, so it would take us quite some time to figure out what you are trying to do.

     

  • I was afraid so, yes. I'm quite stuck with my analysis, and the support team has been aware of my issue for a while, so I was merely trying to find an alternative way of keeping things going in the meantime!

    Thank you anyway, I'll just be patient!

  • Comment author
    Chai Fungtammasan DNAnexus Team

    I can share some thoughts if you haven't got a response yet. From what I know, I'm inclined to say that this is either not possible or against best practice, but you can check whether the support team knows more about this.

     

    The success or failure status of a job comes from running the src/code.sh or src/code.py script (in the src folder) that dxapp.json points to. I guess that if you run a job with a sleep command, SSH into it, and generate the expected output, then when the sleep command finishes you should get a 'done' status for the job. However, if an error has already occurred and a non-zero exit status has been passed to the job (as when you run with --debug-on), I don't think it's possible to change the status from "failed/terminated" to "done".
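
The sleep-based idea above could be sketched as follows. This is a variation, not something confirmed in the thread: instead of a fixed sleep, the entry-point script polls for a marker file that you create over SSH once the outputs are ready, then returns success so the script can exit with status 0. The helper name `wait_for_marker`, the marker path, and the timeout values are all hypothetical and not part of dx-toolkit.

```shell
#!/usr/bin/env bash
# Hypothetical helper for an applet entry point such as src/code.sh.
# Waits until a marker file exists, or gives up after a timeout.
#   $1 = path to the marker file
#   $2 = timeout in seconds
#   $3 = polling interval in seconds (default 5)
wait_for_marker() {
    local marker="$1" timeout="$2" interval="${3:-5}" waited=0
    while [ ! -e "$marker" ]; do
        # Give up with a non-zero status once the timeout is reached.
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Example use inside the script (path and timeout are placeholders):
#   wait_for_marker /tmp/outputs_ready 3600 || exit 1
```

From the SSH session, creating the marker (for example with `touch /tmp/outputs_ready`) would let the script finish normally. Whether the resulting job reaches 'done' with the outputs you expect is something to confirm with support.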

     

    The main question is what use case would require this kind of operation. Interactive analysis should be done with an interactive job. If you just want the output of a non-interactive job, you can SSH into it in --debug-on mode and upload the output to the project. It won't matter whether the job status is terminated, failed, or done. I would say the most likely use case is recovering from some kind of failure within a workflow so that you can keep things going. Even then, it's still not a good flow. You can instead edit the workflow to start from where you want to start.

