How to download NMR metabolomics data?

28 June 2023 00:00
6 comments

Hello, I am having trouble downloading the NMR metabolomics data. From what I can tell, I need the list of metabolite column names. I tried downloading them from the "Data Explore" app, but I can only download 30 at a time, which is tedious. Does anyone have the list of metabolomic column names? It would be greatly appreciated!

Comments


Rachael W UKB Community team Data Analyst
- 29 June 2023 12:16
There is a Showcase Resource 3543 which has the field numbers and names for all 249 metabolites.

To use it on the RAP, you would need to add the prefix p to the field number and the suffixes _i0 and _i1.

For example, field 23474 3-Hydroxybutyrate on Showcase corresponds to p23474_i0 and p23474_i1 in the RAP.
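As a minimal sketch of that naming rule, the snippet below builds the RAP column names from a Showcase field number. The helper name `rap_column_names` is made up for illustration, and the default of instances 0 and 1 is taken from the example above; it is not an official UK Biobank or DNAnexus function.

```python
def rap_column_names(field_id, instances=(0, 1)):
    """Build RAP column names (e.g. p23474_i0) from a Showcase field number.

    Hypothetical helper: prefixes the field number with "p" and appends
    an "_i<instance>" suffix for each requested instance.
    """
    return [f"p{field_id}_i{instance}" for instance in instances]


# Field 23474 (3-Hydroxybutyrate) from the example above
print(rap_column_names(23474))  # ['p23474_i0', 'p23474_i1']
```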

2
Rachael W UKB Community team Data Analyst
- 29 June 2023 12:21
Note that a project needs Tier 2 or Tier 3 access to download the NMR metabolomics fields from the RAP (or via a Showcase basket), though projects with Tier 1 access can use the fields online. See the Cost Tier for, e.g., field 23474, which reads "d2 o1 s2".

0
Rachael W UKB Community team Data Analyst
- 29 June 2023 12:28
Have you seen the Table Exporter tool https://dnanexus.gitbook.io/uk-biobank-rap/working-on-the-research-analysis-platform/tools-library ?

See also https://community.dnanexus.com/s/question/0D5t000004SBm0eCAD/query-of-the-week-1-export-phenotypic-data-to-a-file

0
Alexandra Lee DNAnexus Team
- 29 June 2023 21:30
Just to add to Rachael's already very helpful comments.

You can probably follow the steps for extracting proteomics data seen here: https://github.com/dnanexus/UKB_RAP/blob/main/proteomics/0_extract_phenotype_protein_data.ipynb

But you'll need to replace a few lines of code in the "Get field names" section with the following:

```
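# Get all field names for metabolomics data
# (data_dict_df is the data dictionary DataFrame defined earlier in the linked notebook)
field_names = list(
    data_dict_df.loc[
        data_dict_df["folder_path"] == "Biological samples > Blood assays > NMR metabolomics",
        "name",
    ].values
)
print(len(field_names))

# Keep only instance 0 fields; assign the result so the filter takes effect
field_names = [f for f in field_names if f.split("_")[-1] == "i0"]

# Prefix each field with the entity name and join into one comma-separated query string
field_names_str = [f"participant.{f}" for f in field_names]
field_names_query = ",".join(field_names_str)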

```
Then you can export `field_names` and use the list in the Table Exporter app that Rachael mentioned, or you can use `dx extract_dataset`, following the code in the rest of the notebook.
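A rough sketch of the export step, in case it helps: this writes the field names to a plain text file, one per line. It assumes Table Exporter can take such a file as its list of field names (please check the app's input options). The helper `write_field_names`, the file name `metabolomics_fields.txt`, and the example values are made up for illustration; in the notebook, `field_names` would come from `data_dict_df` as above.

```python
from pathlib import Path


def write_field_names(field_names, path):
    """Write one field name per line and return how many were written."""
    # Drop duplicates while keeping the original order
    unique_names = list(dict.fromkeys(field_names))
    Path(path).write_text("\n".join(unique_names) + "\n")
    return len(unique_names)


# Example values only, not the full metabolomics list
example_fields = ["eid", "p23474_i0", "p23474_i1"]
n_written = write_field_names(example_fields, "metabolomics_fields.txt")
print(n_written)
```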

0
Former User of DNAx Community_44
- 29 June 2023 21:51
Hello Rachael, thank you.

For future reference I have made a list of field titles for all metabolites. Note that I replaced internal commas within field titles with \, to avoid errors. The complete list can be found here: https://github.com/ibishof/Omics_pipeline/blob/main/ukbiobank/metabolite_field_names

These field titles can be used with Table Exporter to query the data.
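For anyone building a similar list, here is a small sketch of the comma escaping described above: each internal comma in a title becomes `\,` before the titles are joined with plain commas. The function names and the two example titles are placeholders made up for illustration, not real UK Biobank field titles.

```python
def escape_commas(title):
    """Replace each comma inside a field title with a backslash-escaped comma."""
    return title.replace(",", "\\,")


def join_titles(titles):
    """Escape internal commas, then join the titles into one comma-separated string."""
    return ",".join(escape_commas(title) for title in titles)


# Placeholder titles for illustration only
example_titles = ["Example title, part one", "Plain title"]
joined = join_titles(example_titles)
print(joined)
```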

1
Weiming Liang
- 01 April 2025 13:31
Thanks to Alexandra Lee for the valuable comment.
The "# Get only instance 0 samples" step does not work with
`[f for f in field_names if f.split("_")[-1] == "i0"]`
so I modified it to the code below:
`[f for f in field_names if f.endswith('_i0')]`
The remaining work needs to be done using Table Exporter.

1
