How to download NMR metabolomics data?

Hello, I am having trouble downloading the NMR metabolomics data. From what I can tell, I need the list of metabolite column names. I tried downloading them from the "Data Explore" app, but I can only download 30 at a time, which is tedious. Does anyone have the list of metabolomics column names? It would be greatly appreciated!

Comments

6 comments

  • Rachael W (UKB Community team, Data Analyst)

    There is a Showcase Resource 3543 which has the field numbers and names for all 249 metabolites.

     

    To use these fields on the RAP, prefix each field number with p and add the instance suffixes _i0 and _i1.

     

    For example, field 23474 3-Hydroxybutyrate on Showcase corresponds to p23474_i0 and p23474_i1 in the RAP.
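
As a minimal sketch of the naming rule above, the following Python turns Showcase field numbers into RAP column names. The helper name `rap_column_names` and its default instance list are illustrative assumptions, not part of any UKB or DNAnexus tool; only the `p<field>_i<instance>` pattern comes from the comment above.

```python
def rap_column_names(field_ids, instances=(0, 1)):
    """Build RAP column names such as p23474_i0 from Showcase field numbers."""
    return [
        f"p{field_id}_i{instance}"
        for field_id in field_ids
        for instance in instances
    ]


# Field 23474 (3-Hydroxybutyrate) is the example given above
columns = rap_column_names([23474])
print(columns)  # ['p23474_i0', 'p23474_i1']
```

Passing the full list of field numbers from Resource 3543 would give the complete set of column names in one go.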

  • Rachael W (UKB Community team, Data Analyst)

    Note that a project needs Tier 2 or Tier 3 access to download the NMR metabolomics fields from the RAP (or via a Showcase basket), though projects with Tier 1 access can still use the fields online. See the Cost Tier for, e.g., field 23474, which says d2 o1 s2.

  • Rachael W (UKB Community team, Data Analyst)

    Have you seen the Table Exporter tool https://dnanexus.gitbook.io/uk-biobank-rap/working-on-the-research-analysis-platform/tools-library ?

     

    See also https://community.dnanexus.com/s/question/0D5t000004SBm0eCAD/query-of-the-week-1-export-phenotypic-data-to-a-file

  • Alexandra Lee (DNAnexus Team)

    Just to add to Rachael's already very helpful comments.

     

    You can probably follow the steps for extracting proteomics data seen here: https://github.com/dnanexus/UKB_RAP/blob/main/proteomics/0_extract_phenotype_protein_data.ipynb

     

    But you'll need to replace a few lines of code in the "Get field names" section with the following:

     

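    ```
    # data_dict_df is the data dictionary DataFrame loaded earlier in the notebook

    # Get all field names for metabolomics data
    field_names = list(
        data_dict_df.loc[
            data_dict_df["folder_path"] == "Biological samples > Blood assays > NMR metabolomics",
            "name",
        ].values
    )
    print(len(field_names))

    # Get only instance 0 samples (assign the result, otherwise the filtered list is discarded)
    field_names = [f for f in field_names if f.split("_")[-1] == "i0"]

    # Prefix each field with its entity name and join into one comma-separated query string
    field_names_str = [f"participant.{f}" for f in field_names]
    field_names_query = ",".join(field_names_str)
    ```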

    Then you can export `field_names` and use the list in the Table Exporter app that Rachael mentioned, or you can use `dx extract_dataset`, following the code in the rest of the notebook.
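
A rough sketch of that export step, assuming `field_names` holds the list built above. It writes the names to a plain-text file, one per line, which as far as I know is a form Table Exporter can take as input, and it assembles (without running) a `dx extract_dataset` command. The dataset ID, the file names and both helper names are placeholders invented for this example; `--fields` and `--output` are the options I would expect to need, but check `dx extract_dataset --help` on your installation.

```python
from pathlib import Path


def write_field_list(field_names, path):
    """Write one field name per line and return the path written."""
    path = Path(path)
    path.write_text("\n".join(field_names) + "\n")
    return path


def build_extract_command(dataset, field_names, output):
    """Assemble a dx extract_dataset call; eid is added so rows can be matched to participants."""
    fields = ",".join(f"participant.{name}" for name in ["eid", *field_names])
    return ["dx", "extract_dataset", dataset, "--fields", fields, "--output", output]


# Example names standing in for the list built above
field_names = ["p23474_i0", "p23475_i0"]

write_field_list(field_names, "nmr_field_names.txt")

# "project-xxxx:record-yyyy" is a placeholder, not a real dataset ID
cmd = build_extract_command("project-xxxx:record-yyyy", field_names, "nmr_metabolomics.csv")
print(" ".join(cmd))
```

The command is built as a list so it could be handed to `subprocess.run(cmd, check=True)` once the placeholder dataset ID is replaced with a real one.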

     

     

     

     

     

  • Hello Rachael, thank you.

     

    For future reference I have made a list of field titles for all metabolites. Note that I replaced internal commas within field titles with \, to avoid errors. The complete list can be found here: https://github.com/ibishof/Omics_pipeline/blob/main/ukbiobank/metabolite_field_names

     

    These field titles can be used with Table Exporter to query the data.
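
For illustration, a small sketch of the comma escaping described above. The field title is invented for the example and the helper name is hypothetical; the real titles are in the linked list.

```python
def escape_commas(title):
    """Put a backslash in front of every comma in a field title."""
    return title.replace(",", "\\,")


# Invented title, used only to show the effect
print(escape_commas("Example metabolite, total"))  # Example metabolite\, total
```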

     

     

  • Weiming Liang

    Thanks to Alexandra Lee for the valuable comment.

    The "# Get only instance 0 samples" step did not work for me with

    [f for f in field_names if f.split("_")[-1] == "i0"]

    I changed it to the code below:

    [f for f in field_names if f.endswith('_i0')]

    The rest of the job needs to be done using Table Exporter.
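
A small self-contained sketch of the filter, with a few example column names standing in for the full list. One point worth noting: a list comprehension returns a new list and leaves `field_names` unchanged, so the result has to be assigned to a variable before it can be used later. The variable name `field_names_i0` is my own choice.

```python
# Example names standing in for the full NMR metabolomics list
field_names = ["p23474_i0", "p23474_i1", "p23475_i0", "p23475_i1"]

# Keep only instance 0 and store the result under a new name
field_names_i0 = [f for f in field_names if f.endswith("_i0")]

print(field_names_i0)    # ['p23474_i0', 'p23475_i0']
print(len(field_names))  # 4, the original list is untouched
```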

