I submitted a batch of 500 jobs to UKB/RAP. The number of running jobs is always 100, and the rest are waiting. Is this limit defined by UKB/RAP, or can it be changed?
I noticed that dx run has an option "--instance-count" for Spark clusters. Would this also allow a larger number of jobs to run concurrently?
That is a default limit on how many workers each user can use at a given time. It protects users from accidentally spending a lot of money. You can contact ukbiobank-support@dnanexus.com to increase your worker limit.
Note that for some types of operations, namely those that access the same data or generate a lot of API load, running too many at once will slow performance. I recommend that you scale up gradually. You may find this guide useful.
https://dnanexus.gitbook.io/uk-biobank-rap/science-corner/guide-to-analyzing-large-sample-sets
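To illustrate what scaling up gradually could look like, here is a minimal Python sketch that splits a list of job inputs into batches of increasing size. It only computes the batches; submitting each batch (for example with dx run) and waiting for it to finish before sending the next one is left out. The function name `ramp_up_batches`, the starting size of 10, and the doubling factor are assumptions made for illustration, not anything prescribed by DNAnexus. The cap of 100 simply mirrors the default worker limit mentioned above.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def ramp_up_batches(
    items: Sequence[T], start: int = 10, factor: int = 2, cap: int = 100
) -> Iterator[List[T]]:
    """Yield consecutive batches whose size grows by `factor` up to `cap`.

    Hypothetical helper: `start`, `factor` and `cap` are illustrative values.
    """
    if start < 1 or factor < 1 or cap < 1:
        raise ValueError("start, factor and cap must all be at least 1")
    size = min(start, cap)
    position = 0
    while position < len(items):
        # Take the next `size` items (the last batch may be smaller).
        yield list(items[position:position + size])
        position += size
        # Grow the batch size, but never beyond the worker limit.
        size = min(size * factor, cap)


# Placeholder inputs standing in for the 500 jobs in the question.
job_inputs = [f"sample_{n}" for n in range(500)]
batch_sizes = [len(batch) for batch in ramp_up_batches(job_inputs)]
print(batch_sizes)
```

Starting small makes it easier to spot slowdowns from shared data access or API load before the full set of jobs is running.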
Note that our webinar also includes a tutorial on how to handle a large number of jobs. I set the video to start at this topic.
https://youtu.be/U8QZAGwnUm0?t=1753