Understanding cloud computing, virtual machines and instances

If you're new to UK Biobank and cloud computing, you may encounter unfamiliar terms such as instances, virtual machines, and spot pricing. This guide explains what they mean and how to choose the right setup for your research on UK Biobank's Research Analysis Platform (UKB-RAP).


1. What is cloud computing?

Cloud computing lets you use computing resources (such as servers, storage, and software) over the internet instead of owning and running your own hardware.

On UKB-RAP, all data storage and analysis take place in the cloud on Amazon Web Services (AWS). You can “spin up” powerful computers (called instances) to run your research without needing your own high-performance machines.


2. What is an instance?

An instance is a virtual server: a remote computer in the cloud that you can use for analysis. Instances come in different sizes and types depending on the kind of work you need to do (e.g. memory-heavy, GPU-accelerated, or standard CPU processing).

When you launch an instance, you’re effectively starting up a virtual machine (VM): a full operating system environment that behaves like a physical computer.


3. Virtual machines in simple terms

A virtual machine (VM) is a software-based version of a physical computer. It has its own memory, CPU, disk, and operating system, but it runs on a larger physical machine in an AWS data centre.

When you run a job or launch RStudio on UKB-RAP, you're using a VM that exists inside a UK Biobank project space.

The benefits of VMs are:

  • Scalability – Scale up or down depending on your workload
  • Cost-efficiency – Only pay for what you use
  • Flexibility – Use different tools or environments without setup hassle
  • Reliability – Backed by AWS, so uptime and backup are built in

4. Instance types on UKB-RAP: what do the names mean?

On UKB-RAP, instances have names like mem1_ssd1_v2_x16. Here’s how to break that down:

Name part | What it refers to
mem1, mem2, mem3 | Memory tier – higher numbers = more memory
ssd1, ssd2 | Storage type – all use fast Solid-State Drives (SSDs)
gpu, fpga | Special hardware included (e.g. Graphics Processing Units (GPUs))
x16, x48, x64 | How many virtual CPUs (vCPUs) the instance has

For example:
mem2_ssd1_gpu_x32 = medium memory, SSD storage, GPU included, 32 vCPUs.
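
The breakdown above can be sketched in Python. This is only an illustration of the naming pattern described in the table, not a UKB-RAP or DNAnexus tool: the function name `parse_instance_name` and the dictionary keys are hypothetical, and reading `v2` as a version number is an assumption, since the table does not explain that part of the name.

```python
import re
from typing import Dict, List, Optional, Union

InstanceInfo = Dict[str, Union[Optional[int], List[str]]]


def parse_instance_name(name: str) -> InstanceInfo:
    """Split a name such as 'mem1_ssd1_v2_x16' into labelled parts."""
    info: InstanceInfo = {
        "memory_tier": None,
        "storage_tier": None,
        "version": None,  # assumption: 'v2' is a version number
        "vcpus": None,
        "hardware": [],
    }
    # Name parts made of a known prefix followed by a number.
    numeric = {"mem": "memory_tier", "ssd": "storage_tier",
               "v": "version", "x": "vcpus"}
    for token in name.split("_"):
        match = re.fullmatch(r"([a-z]+)(\d+)", token)
        if match and match.group(1) in numeric:
            info[numeric[match.group(1)]] = int(match.group(2))
        elif token in ("gpu", "fpga"):
            # Special hardware markers carry no number in this sketch.
            info["hardware"].append(token)
        else:
            raise ValueError(f"Unrecognised name part: {token!r}")
    return info


print(parse_instance_name("mem2_ssd1_gpu_x32"))
```

Any name part outside the patterns listed in the table raises a `ValueError`, so the sketch fails loudly rather than guessing.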


5. How to choose the right instance type

First, think about the type of analysis you’ll be doing:

Your workload involves... | Choose...
Machine learning / deep learning | GPU instances (e.g. mem2_ssd1_gpu_x32)
Large datasets / memory-heavy jobs | High-memory instances (e.g. mem3_ssd1_v2_x64)
Standard statistical analysis | General-purpose instances (e.g. mem1_ssd1_v2_x16)
Many small parallel jobs | Multiple smaller instances or batch runs
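
If you script your analyses, the table above can be kept as a simple lookup. The following is a minimal sketch only: the workload labels (such as `"memory_heavy"`) and the function `recommend` are hypothetical names chosen for illustration, and the values simply restate the table.

```python
from typing import Dict, Optional, Tuple

# Hypothetical workload labels mapped to (category, example instance).
# The example instance names are the ones given in the table above.
RECOMMENDATIONS: Dict[str, Tuple[str, Optional[str]]] = {
    "deep_learning": ("GPU instance", "mem2_ssd1_gpu_x32"),
    "memory_heavy": ("High-memory instance", "mem3_ssd1_v2_x64"),
    "standard_statistics": ("General-purpose instance", "mem1_ssd1_v2_x16"),
    "many_small_jobs": ("Multiple smaller instances or batch runs", None),
}


def recommend(workload: str) -> Tuple[str, Optional[str]]:
    """Return the suggested category and an example instance name, if any."""
    try:
        return RECOMMENDATIONS[workload]
    except KeyError:
        options = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(
            f"Unknown workload {workload!r}; expected one of: {options}"
        ) from None


category, example = recommend("memory_heavy")
print(category, example)
```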

6. Spot vs. On-Demand instances: what’s the difference?

 | Spot | On-Demand
Cost | Cheaper | More expensive
Availability | Can be interrupted if capacity is needed elsewhere | Always available
Use case | Great for large, repeatable, non-urgent jobs | Best for long-running, critical, or interactive jobs
Examples | Batch processing, testing pipelines | Clinical pipelines, interactive RStudio sessions

UKB-RAP lets you set a priority level for your job to indicate whether cost savings (spot) or speed/reliability (on-demand) is more important.
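
One way to reason about this trade-off is to compare expected costs. The sketch below is a simplified model under stated assumptions, not how UKB-RAP schedules or bills jobs: the hourly rates are made-up placeholders, and `rework_fraction` is a hypothetical parameter standing for the share of a spot job that has to be repeated after interruptions.

```python
def expected_cost(hours: float, hourly_rate: float,
                  rework_fraction: float = 0.0) -> float:
    """Cost of a job, inflated by the fraction of work that must be redone."""
    return hours * (1.0 + rework_fraction) * hourly_rate


def pick_pricing(hours: float, spot_rate: float, on_demand_rate: float,
                 rework_fraction: float, interactive: bool = False) -> str:
    """Return 'spot' or 'on-demand' under this simplified model."""
    # Interactive sessions should not be interrupted, so skip the comparison.
    if interactive:
        return "on-demand"
    spot = expected_cost(hours, spot_rate, rework_fraction)
    on_demand = expected_cost(hours, on_demand_rate)
    return "spot" if spot < on_demand else "on-demand"


# Placeholder rates: a 10-hour batch job where half the work is repeated.
print(pick_pricing(10, spot_rate=0.30, on_demand_rate=1.00,
                   rework_fraction=0.5))
```

In this model, spot remains the cheaper choice only while the saving on the hourly rate outweighs the cost of repeated work.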


7. What about storage and data transfer costs?

Type of cost | Details
Compute | Based on the instance type and duration of use
Data storage | No charge for UK Biobank data; charges apply only to your own uploaded data
Data egress | Transferring data out of the platform (e.g. downloading) incurs a charge

Always check the most up-to-date pricing on the UK Biobank or DNAnexus documentation pages.
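
To see how the three cost types combine, here is a rough estimator. It is a sketch under assumptions: every rate in the example is a placeholder rather than a real UKB-RAP price, the function name `estimate_cost` is hypothetical, and charging storage per GB per month is an assumed billing unit.

```python
from typing import Dict


def estimate_cost(instance_hours: float, hourly_rate: float,
                  own_storage_gb_months: float, storage_rate: float,
                  egress_gb: float, egress_rate: float) -> Dict[str, float]:
    """Add up compute, storage, and egress charges in a single currency."""
    compute = instance_hours * hourly_rate
    # Only your own uploaded data counts; UK Biobank data is not charged.
    storage = own_storage_gb_months * storage_rate
    egress = egress_gb * egress_rate
    return {
        "compute": round(compute, 2),
        "storage": round(storage, 2),
        "egress": round(egress, 2),
        "total": round(compute + storage + egress, 2),
    }


# Placeholder figures: 20 instance-hours, 100 GB stored for one month,
# and 50 GB downloaded.
print(estimate_cost(instance_hours=20, hourly_rate=0.50,
                    own_storage_gb_months=100, storage_rate=0.02,
                    egress_gb=50, egress_rate=0.10))
```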


8. Financial support available

To support researchers:

  • All users receive £40 credit when they join UKB-RAP
  • Early career researchers and researchers from low- and middle-income countries are eligible for funding
  • Additional support is available through a variety of funding programmes that can be used toward compute, storage, and data egress costs

Summary

Before you select an instance:

  • Understand your analysis needs (e.g. GPU? memory-heavy?)
  • Match your needs to an instance type
  • Decide on priority level: cost-saving vs reliability
  • Be aware of storage and egress costs
  • Use credits and funding schemes to manage costs

If you’re still unsure which instance to choose, or you’re running into issues launching one, don’t hesitate to contact our team or leave a comment below.
