If you're new to UK Biobank and cloud computing, you may be encountering unfamiliar terms like instances, virtual machines, and spot pricing. This guide explains what they mean, and how to choose the right setup for your research on UK Biobank's Research Analysis Platform (UKB-RAP).
1. What is cloud computing?
Cloud computing lets you use computing power (like servers, storage, and software) over the internet instead of owning and running your own hardware.
On UKB-RAP, all data and analysis happen in the cloud using Amazon Web Services (AWS). You can “spin up” powerful computers (called instances) to run your research without needing your own high-performance machines.
2. What is an instance?
An instance is a virtual server - a remote computer in the cloud that you can use for analysis. Instances come in different sizes and types depending on what kind of work you need to do (e.g. memory-heavy, GPU-accelerated, or standard CPU processing).
When you launch an instance, you’re effectively starting up a virtual machine (VM) - a full operating system environment that behaves like a physical computer.
3. Virtual machines in simple terms
A virtual machine (VM) is a software-based version of a physical computer. It has its own memory, CPU, disk, and operating system, but it lives inside a bigger machine at an AWS data centre.
When you run a job or launch RStudio on UKB-RAP, you're using a VM that exists inside a UK Biobank project space.
The benefits of VMs are:
- Scalability – Scale up or down depending on your workload
- Cost-efficiency – Only pay for what you use
- Flexibility – Use different tools or environments without setup hassle
- Reliability – Backed by AWS, so uptime and backup are built in
4. Instance types on UKB-RAP: what do the names mean?
On UKB-RAP, instances have names like mem1_ssd1_v2_x16. Here’s how to break that down:
| Name part | What it refers to |
|---|---|
| mem1, mem2, mem3 | Memory tier – higher numbers = more memory |
| ssd1, ssd2 | Storage type – all use fast Solid-State Drives (SSDs) |
| gpu, fpga | Special hardware included (e.g. Graphics Processing Units (GPUs)) |
| x16, x48, x64 | How many virtual CPUs (vCPUs) the instance has |
For example: mem2_ssd1_gpu_x32 = medium memory, SSD storage, GPU included, 32 vCPUs.
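If it helps to see the naming scheme spelled out, here is a minimal Python sketch that splits an instance name into the parts listed in the table. The `InstanceSpec` class and `parse_instance_type` function are hypothetical names made up for this illustration; they are not part of UKB-RAP or any DNAnexus tool. The sketch only recognises the name parts described above, and skips anything else (such as `v2`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstanceSpec:
    """Hypothetical container for the parts of an instance name."""
    memory_tier: int                 # from "mem1", "mem2", "mem3"
    storage_tier: int                # from "ssd1", "ssd2"
    special_hardware: Optional[str]  # "gpu", "fpga", or None
    vcpus: int                       # from "x16", "x32", ...


def parse_instance_type(name: str) -> InstanceSpec:
    """Split a name such as 'mem2_ssd1_gpu_x32' into its labelled parts."""
    memory_tier = storage_tier = vcpus = None
    special_hardware = None
    for part in name.split("_"):
        if part.startswith("mem") and part[3:].isdigit():
            memory_tier = int(part[3:])
        elif part.startswith("ssd") and part[3:].isdigit():
            storage_tier = int(part[3:])
        elif part in ("gpu", "fpga"):
            special_hardware = part
        elif part.startswith("x") and part[1:].isdigit():
            vcpus = int(part[1:])
        # Any other part (e.g. "v2") is not covered by the table, so it is skipped.
    if memory_tier is None or storage_tier is None or vcpus is None:
        raise ValueError(f"Unrecognised instance type name: {name!r}")
    return InstanceSpec(memory_tier, storage_tier, special_hardware, vcpus)


if __name__ == "__main__":
    print(parse_instance_type("mem2_ssd1_gpu_x32"))
    print(parse_instance_type("mem1_ssd1_v2_x16"))
```

Real instance names may contain parts this sketch does not handle, so treat it as a reading aid rather than a validator.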
5. How to choose the right instance type
First, think about the type of analysis you’ll be doing:
| Your workload involves... | Choose... |
|---|---|
| Machine learning / deep learning | GPU instances (e.g. mem2_ssd1_gpu_x32) |
| Large datasets / memory-heavy jobs | High-memory instances (e.g. mem3_ssd1_v2_x64) |
| Standard statistical analysis | General-purpose (e.g. mem1_ssd1_v2_x16) |
| Many small parallel jobs | Multiple smaller instances or batch runs |
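The table above can also be written as a simple lookup, which may be handy if you script your job submissions. This is only a sketch: the workload keys and the `suggest_instance` function are hypothetical names chosen for this example, and the suggestions simply repeat the table rather than reflecting any official UKB-RAP recommendation engine.

```python
# Hypothetical workload labels mapped to the suggestions from the table.
SUGGESTIONS = {
    "machine_learning": "GPU instance (e.g. mem2_ssd1_gpu_x32)",
    "memory_heavy": "High-memory instance (e.g. mem3_ssd1_v2_x64)",
    "standard_statistics": "General-purpose instance (e.g. mem1_ssd1_v2_x16)",
    "many_small_jobs": "Multiple smaller instances or batch runs",
}


def suggest_instance(workload: str) -> str:
    """Return the table's suggestion for a workload label."""
    if workload not in SUGGESTIONS:
        known = ", ".join(sorted(SUGGESTIONS))
        raise ValueError(f"Unknown workload {workload!r}; expected one of: {known}")
    return SUGGESTIONS[workload]


if __name__ == "__main__":
    print(suggest_instance("memory_heavy"))
```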
6. Spot vs. On-Demand instances: what’s the difference?
| | Spot | On-Demand |
|---|---|---|
| Cost | Cheaper | More expensive |
| Availability | Can be interrupted if capacity is needed elsewhere | Always available |
| Use case | Great for large, repeatable, non-urgent jobs | Best for long-running, critical, or interactive jobs |
| Examples | Batch processing, testing pipelines | Clinical pipelines, interactive RStudio sessions |
UKB-RAP lets you set a priority level for your job to indicate whether cost savings (spot) or speed/reliability (on-demand) is more important.
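As a rough rule of thumb, the comparison above boils down to a single question: would an interruption hurt? The sketch below encodes that rule in Python. The function name `choose_pricing` and its three yes/no inputs are assumptions made for illustration; the actual priority options are whatever UKB-RAP offers when you launch a job.

```python
def choose_pricing(interactive: bool, critical: bool, long_running: bool) -> str:
    """Suggest 'on-demand' when an interruption would be costly, else 'spot'.

    Hypothetical helper that mirrors the use-case row of the comparison table.
    """
    if interactive or critical or long_running:
        # Interactive sessions and critical or long-running jobs suffer most
        # from being interrupted, so reliability is worth the higher price.
        return "on-demand"
    # Repeatable, non-urgent work can simply be rerun if it is interrupted.
    return "spot"


if __name__ == "__main__":
    # An interactive RStudio session
    print(choose_pricing(interactive=True, critical=False, long_running=False))
    # A repeatable batch-processing job
    print(choose_pricing(interactive=False, critical=False, long_running=False))
```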
7. What about storage and data transfer costs?
| Type of cost | Details |
|---|---|
| Compute | Based on the instance type and duration of use |
| Data storage | No charge for UK Biobank data; charges apply only to your own uploaded data |
| Data egress | Transferring data out of the platform (e.g. downloading) incurs a charge |
Always check the most up-to-date pricing on the UK Biobank or DNAnexus documentation pages.
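To get a feel for how these three cost types add up, here is a small Python sketch of a back-of-the-envelope estimate. The function `estimate_cost` and all of its parameters are hypothetical, and the rates in the usage example are placeholders, not real UKB-RAP prices. Substitute the current figures from the official pricing pages before relying on the result.

```python
def estimate_cost(
    compute_hours: float,
    hourly_rate: float,
    own_storage_gb_months: float = 0.0,
    storage_rate_per_gb_month: float = 0.0,
    egress_gb: float = 0.0,
    egress_rate_per_gb: float = 0.0,
) -> dict:
    """Return a cost breakdown using caller-supplied (placeholder) rates.

    Only your own uploaded data counts towards storage; UK Biobank data
    itself is not charged for.
    """
    compute = compute_hours * hourly_rate
    storage = own_storage_gb_months * storage_rate_per_gb_month
    egress = egress_gb * egress_rate_per_gb
    return {
        "compute": compute,
        "storage": storage,
        "egress": egress,
        "total": compute + storage + egress,
    }


if __name__ == "__main__":
    # All rates below are made-up placeholders for illustration only.
    breakdown = estimate_cost(
        compute_hours=10,
        hourly_rate=0.50,
        own_storage_gb_months=100,
        storage_rate_per_gb_month=0.01,
        egress_gb=20,
        egress_rate_per_gb=0.05,
    )
    for item, cost in breakdown.items():
        print(f"{item}: {cost:.2f}")
```

Keeping the rates as inputs rather than hard-coding them means the same function still works when prices change.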
8. Financial support available
To support researchers:
- All users receive £40 credit when they join UKB-RAP
- Early career researchers and researchers from low- and middle-income countries are eligible for funding
- Additional support is available through a variety of funding programmes that can be used toward compute, storage, and data egress costs
Summary
Before you select an instance, you will want to:
- Understand your analysis needs (e.g. GPU? memory-heavy?)
- Match your needs to an instance type
- Decide on priority level: cost-saving vs reliability
- Be aware of storage and egress costs
- Use credits and funding schemes to manage costs
If you’re still unsure which instance to choose, or you’re running into issues launching one, don’t hesitate to contact our team or leave a comment below.