Parallelising on UKB‑RAP: How to balance speed and cost

When you run analyses on UKB-RAP, you can choose how your compute is set up. Two settings can significantly impact time and cost: how many jobs you run in parallel, and whether you choose spot or on-demand instances.

This guide explains how to think about both, so you can make informed decisions that fit the structure of your analysis.

For an introduction to instances, read our guide: Understanding cloud computing, virtual machines and instances.

Parallelisation strategy: why it matters

Parallelisation shapes how reliably your workflow runs and how much it costs.

If you bundle many tasks into a single job, they succeed or fail together. One failed task can stop the whole job, and you may need to rerun the entire workflow. If you instead split those tasks across multiple jobs, a failure affects only a subset. The remaining jobs can complete and produce usable outputs.

Each virtual machine (VM) on UKB-RAP is billed according to three factors: the instance type, the length of time it is used, and the price stated on the Rate Card. The total cost therefore depends not only on how long your analysis runs, but also on how many jobs you launch.

Running more jobs in parallel usually reduces elapsed time. However, it can increase overall cost: each job incurs its own setup period, so total compute time tends to rise with the number of jobs. Setup includes time for the VM to boot and load tools (often around three minutes), followed by data staging and indexing, which varies with the workflow.

Parallelisation visualised

The graph below illustrates the practical trade-off between speed and cost as you increase the number of parallel jobs on UKB-RAP.

Job completion time (minutes) and cost (£) against the number of parallel tasks, assuming:

100 participants (P)
Setup = 5 min per task instance
Run = 2 min per participant
Cost rate (from the Rate Card): mem1_ssd2_v2_x4 at £0.1616/hour on demand ⇒ about £0.0027 per minute

As the number of jobs rises, completion time (blue line) falls sharply at first, then levels off. In other words, the biggest gains in turnaround time tend to come from moving away from a single job to a modest degree of parallelisation. Beyond that point, additional jobs deliver progressively smaller improvements.

At the same time, total cost increases roughly linearly (red and green lines), because each job runs on its own virtual machine and incurs its own setup and billing time. The gap between on-demand and spot pricing shows how instance choice can affect overall spend.

Modelling cost vs speed

To illustrate the trade-off, consider a simple workload:

P = 100 participants
S = 5 minutes setup per job
t = 2 minutes per participant

If you run the workload with N parallel jobs, two things change:

T = Elapsed time (time to completion)
C = Total compute time billed (sum of minutes across all virtual machines)

Elapsed time is one setup period plus the run time for each job's share of the participants, assuming participants are split evenly across jobs:

T = S + (P / N) × t

In this example:

T = 5 + (100 / N) × 2 = 5 + 200 / N minutes

Total compute time billed is the elapsed time of one job multiplied by the number of jobs:

C = N × T = (N × S) + (P × t)

In this example:

C = 5N + 200 minutes

Cost is the total compute time, converted to hours, multiplied by the hourly instance price defined in the Rate Card.

For example, using the virtual machine instance type mem1_ssd2_v2_x4:

On-demand: £0.1616 per hour
Spot: £0.0848 per hour

To summarise, the instance type determines the hourly rate; the number of jobs determines how many machines are running; and the degree of parallelisation determines how elapsed time and total billed time diverge.
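The model above can be sketched in a few lines of Python. This is a minimal illustration, not an official UKB-RAP calculator: the function and constant names are made up for this example, the run time of 2 minutes per participant is the value listed under the graph assumptions, and participants are assumed to split evenly across jobs.

```python
# Illustrative values from the example in this guide (names are hypothetical).
PARTICIPANTS = 100             # P
SETUP_MIN = 5                  # S: setup minutes per job
RUN_MIN_PER_PARTICIPANT = 2    # t: run minutes per participant
ON_DEMAND_PER_HOUR = 0.1616    # mem1_ssd2_v2_x4, on demand (GBP)
SPOT_PER_HOUR = 0.0848         # mem1_ssd2_v2_x4, spot (GBP)


def elapsed_minutes(n_jobs):
    """T: one setup period plus the run time for one job's share of participants."""
    return SETUP_MIN + (PARTICIPANTS / n_jobs) * RUN_MIN_PER_PARTICIPANT


def compute_minutes(n_jobs):
    """C: minutes billed across all VMs (every job runs for T minutes)."""
    return n_jobs * elapsed_minutes(n_jobs)


def cost_gbp(n_jobs, price_per_hour):
    """Billed minutes converted to hours, multiplied by the hourly rate."""
    return compute_minutes(n_jobs) / 60 * price_per_hour


for n in (1, 5, 20):
    print(
        f"{n:>3} jobs: {elapsed_minutes(n):.0f} min elapsed, "
        f"{compute_minutes(n):.0f} compute min, "
        f"£{cost_gbp(n, ON_DEMAND_PER_HOUR):.2f} on demand, "
        f"£{cost_gbp(n, SPOT_PER_HOUR):.2f} spot"
    )
```

For 20 jobs this gives 15 minutes elapsed and 300 compute minutes, which costs about £0.81 on demand or £0.42 on spot.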

Spot instances: the best way to cut compute costs

The UKB-RAP Rate Card lists spot prices for most instance types, typically 40–60% cheaper than on demand.

Spot instances provide the same hardware as on-demand instances, but can be reclaimed (terminated) by the underlying cloud provider.

Why use them?

Lower hourly cost: for example, mem1_ssd2_v2_x4 is £0.0848/hour on spot compared with £0.1616/hour on demand. Over many jobs, that difference compounds.

Well suited to restartable tasks: tools such as Swiss-Army-Knife are often used to run discrete shell commands that can be re-executed without difficulty. If a spot instance is reclaimed, the job can simply be rerun.

Project storage is not overwritten by default: UKB-RAP does not overwrite files in project space unless explicitly instructed. This makes most workflows naturally restartable, as completed outputs remain in place.
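One way to take advantage of this restartability is to work out which tasks still lack an output before resubmitting anything. The sketch below is only an illustration of that idea: the task names and file names are hypothetical, and it assumes you have already obtained a list of the files present in your project by whatever means you normally use.

```python
def tasks_still_to_run(expected_outputs, existing_files):
    """Return the task names whose expected output file is not yet present.

    expected_outputs: dict mapping task name -> output file name
    existing_files:   iterable of file names already in project storage
    """
    existing = set(existing_files)
    return [task for task, output in expected_outputs.items() if output not in existing]


# Hypothetical example: three chunks, one of which was interrupted.
expected = {
    "chunk_01": "chunk_01.results.tsv",
    "chunk_02": "chunk_02.results.tsv",
    "chunk_03": "chunk_03.results.tsv",
}
already_in_project = ["chunk_01.results.tsv", "chunk_03.results.tsv"]

print(tasks_still_to_run(expected, already_in_project))
```

Only the tasks returned here would need to be resubmitted after a spot reclamation; the completed outputs stay where they are.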

Drawbacks & mitigations

Jobs can be reclaimed: the probability of reclamation increases with runtime, so short tasks carry a much lower practical risk of interruption. Where possible, aim for each individual job to finish within about 15 minutes.

Cost–speed comparison (on demand vs spot)

N jobs	Elapsed Time (min)	Compute Minutes	Cost (On‑Demand £)	Cost (Spot £)
1	205	205	0.55	0.29
2	105	210	0.57	0.30
5	45	225	0.61	0.32
10	25	250	0.67	0.35
20	15	300	0.81	0.42
50	9	450	1.21	0.64
100	7	700	1.89	0.99

Spot saves roughly 48% at every job count, consistent with the Rate Card differential (£0.0848 vs £0.1616 per hour).

Notice that in this example:

N = 20 is the lowest job count in the table that meets the ≤15-minute target for spot safety.
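The same check can be made for other workloads by searching for the smallest job count whose elapsed time fits a target. The following is a rough sketch under the assumptions of this guide's simple model (fixed setup time, fixed run time per participant); the function name and parameters are illustrative, and each job is assumed to take a whole number of participants, so the per-job share is rounded up.

```python
import math


def smallest_job_count(participants, setup_min, run_min, target_min):
    """Return the smallest N whose elapsed time is within target_min, or None.

    Elapsed time for N jobs is setup_min plus run_min for each participant
    in the largest job, which holds ceil(participants / N) participants.
    """
    for n_jobs in range(1, participants + 1):
        per_job = math.ceil(participants / n_jobs)
        if setup_min + per_job * run_min <= target_min:
            return n_jobs
    return None  # even one participant per job cannot meet the target


# Values from the example: 100 participants, 5 min setup, 2 min per participant.
print(smallest_job_count(100, 5, 2, target_min=15))
```

With the example values and a 15-minute target, the search returns 20. A result of None means the target is shorter than setup plus a single participant's run time, so no amount of parallelisation would reach it.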