Hello,
I am having trouble downloading the NMR metabolomics data. From what I can tell, I need the list of metabolite column names. I tried downloading them from the "Data Explore" app, but I can only download 30 at a time, which is tedious. Does anyone have the list of metabolomic column names? It would be greatly appreciated!
Comments
6 comments
There is a Showcase Resource 3543 which has the field numbers and names for all 249 metabolites.
To use it on the RAP, you would need to add the prefix p to the field number, and the suffixes _i0 and _i1.
For example, field 23474 3-Hydroxybutyrate on Showcase corresponds to p23474_i0 and p23474_i1 in the RAP.
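The mapping described above can be sketched in a few lines of Python. This is only an illustration: `rap_column_names` is a made-up helper name, and the input list would need to be filled with the field numbers copied from Resource 3543.

```python
def rap_column_names(field_ids, instances=(0, 1)):
    """Build RAP column names such as p23474_i0 from Showcase field numbers."""
    return [
        f"p{field_id}_i{instance}"
        for field_id in field_ids
        for instance in instances
    ]


# 23474 is the 3-Hydroxybutyrate field used as the example above; in practice
# the list would hold every field number taken from Resource 3543.
columns = rap_column_names([23474])
print(columns)
```

Passing `instances=(0,)` instead would keep only the instance 0 columns.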
Note that a project needs to be Tier 2 or Tier 3 to Download the NMR metabolomics fields from the RAP (or via a Showcase basket), though the fields can be used Online by projects with Tier 1 access. See the Cost Tier for, e.g., field 23474, which says d2 o1 s2.
Have you seen the Table Exporter tool https://dnanexus.gitbook.io/uk-biobank-rap/working-on-the-research-analysis-platform/tools-library ?
See also https://community.dnanexus.com/s/question/0D5t000004SBm0eCAD/query-of-the-week-1-export-phenotypic-data-to-a-file
Just to add to Rachael's already very helpful comments.
You can probably follow the steps for extracting proteomics data seen here: https://github.com/dnanexus/UKB_RAP/blob/main/proteomics/0_extract_phenotype_protein_data.ipynb
But you'll need to replace a few lines of code in the "Get field names" section with the following:
```
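# Get all field names for metabolomics data
# (data_dict_df is the data dictionary DataFrame loaded earlier in the notebook)
field_names = list(
    data_dict_df.loc[
        data_dict_df["folder_path"] == "Biological samples > Blood assays > NMR metabolomics",
        "name",
    ].values
)
print(len(field_names))
# Get only instance 0 samples (assign the result, otherwise the filter is discarded)
field_names = [f for f in field_names if f.split("_")[-1] == "i0"]
# Prefix each field with its entity and join into a comma-separated query string
field_names_str = [f"participant.{f}" for f in field_names]
field_names_query = ",".join(field_names_str)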
```
Then you can export the `field_names` and use this in the Table Exporter app that Rachael mentioned, or you can use `dx extract_dataset` following the code in the rest of the notebook.
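As a rough sketch of the export step, the list can be written to a plain text file with one field name per line. The file name and the two field names below are placeholders, and whether a given tool accepts this exact layout is an assumption, so please check the Table Exporter documentation before relying on it.

```python
from pathlib import Path

# Placeholder values; in the notebook this list comes from the data dictionary.
field_names = ["p23474_i0", "p23400_i0"]

# Write one field name per line.
out_path = Path("metabolomics_field_names.txt")  # hypothetical file name
out_path.write_text("\n".join(field_names) + "\n")

# Comma-separated, entity-prefixed form, built the same way as in the snippet above.
field_names_query = ",".join(f"participant.{f}" for f in field_names)
print(field_names_query)
```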
Hello Rachael, thank you.
For future reference I have made a list of field titles for all metabolites. Note that I replaced internal commas within field titles with \, to avoid errors. The complete list can be found here: https://github.com/ibishof/Omics_pipeline/blob/main/ukbiobank/metabolite_field_names
These field titles can be used with Table Exporter to query the data.
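For anyone building a similar list themselves, the comma escaping mentioned above might look like the sketch below. `escape_commas` is just an illustrative helper name, and the second title is invented to show a title with an internal comma.

```python
def escape_commas(title):
    """Replace each comma inside a field title with a backslash-escaped comma."""
    return title.replace(",", "\\,")


# The second title is made up purely to demonstrate the escaping.
titles = ["3-Hydroxybutyrate", "Example title, with a comma"]
escaped = [escape_commas(t) for t in titles]

# Joined with plain commas, only the separators remain unescaped.
line = ",".join(escaped)
print(line)
```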
Thanks to Lee for his valuable comment.
The "# Get only instance 0 samples" step did not work for me with
[f for f in field_names if f.split("_")[-1] == "i0"]
I modified it to the code below.
field_names = [f for f in field_names if f.endswith('_i0')]
The remaining work needs to be done using Table Exporter.
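A quick way to check the filter is to run it on a toy list first. The names below are examples typed in by hand, not pulled from the data dictionary. The point to watch is that a list comprehension on its own line builds a new list and throws it away, so the result has to be assigned back for the filter to take effect.

```python
# Toy values standing in for the real field names.
field_names = ["p23474_i0", "p23474_i1", "p23400_i0", "p23400_i1"]

# A bare comprehension leaves the original list untouched.
[f for f in field_names if f.endswith("_i0")]
unchanged_length = len(field_names)  # still 4

# Assigning the result back keeps only the instance 0 columns.
field_names = [f for f in field_names if f.endswith("_i0")]
print(field_names)
```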