Help avoiding memory issues with table exporter.

06 July 2023 00:00
2 comments

I want to download EHR data from UK Biobank. I have extracted the field names as described here: https://github.com/dnanexus/OpenBio/blob/master/dxdata/getting_started_with_dxdata.ipynb

After completing the "Explore Entity Object" code block, I added this code:

participant_entity = dataset["participant"]

field_titles = [field.title for field in participant_entity.fields]

field_titles_str = ";;;; ".join(field_titles)

print(field_titles_str)

This produces a list of all field titles in the "participant" entity, which I reformatted for Table Exporter. This totaled around 28k field titles. I tried to set up a Table Exporter query as described here: https://community.dnanexus.com/s/question/0D5t000004SBm0eCAD/query-of-the-week-1-export-phenotypic-data-to-a-file

I received an error about running out of memory. I then tried a more modest search of just the field titles that contained "LDL", "HDL", "Cholest", "heart", or "Lipoprotein". This resulted in about 1000 field titles, but I still received memory errors. Do I need to buy more memory? Does Table Exporter not support large queries? Am I going about this in a fundamentally wrong way? Thank you for the help! It will be greatly appreciated!
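For reference, the keyword search described above can be sketched roughly as follows. This is a minimal illustration, not the exact code that was run: the helper name `filter_titles` and the sample titles are hypothetical, and in practice the input would be the `field_titles` list built earlier.

```python
def filter_titles(titles, keywords):
    """Return the titles that contain any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [t for t in titles if any(k in t.lower() for k in lowered)]


keywords = ["LDL", "HDL", "Cholest", "heart", "Lipoprotein"]

# Hypothetical sample titles standing in for the real field_titles list.
sample_titles = [
    "LDL direct | Instance 0",
    "Standing height | Instance 0",
    "Cholesterol | Instance 1",
]

selected = filter_titles(sample_titles, keywords)
print(len(selected), "of", len(sample_titles), "titles selected")
```

Matching is done on lowercased strings so that "heart" also catches titles such as "Heart rate".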

Comments

Ondrej Klempir DNAnexus Team
- 07 July 2023 04:59
Hello @Isaac Bishof, have you tried changing the instance type of your job to get more memory? You can select from the list of instance types. The rate card is here: https://20779781.fs1.hubspotusercontent-na1.net/hubfs/20779781/Product%20Team%20Folder/Rate%20Cards/BiobankResearchAnalysisPlatform_Rate%20Card_Current.pdf
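If the job is launched from the command line rather than the web interface, the instance type can be requested with the `--instance-type` option of `dx run`. The sketch below only assembles and prints such a command and does not run anything. Everything in it is an assumption to check against the app's documentation: the app name `table-exporter`, the input names, the placeholder dataset ID `record-XXXX`, the file name `field_names.txt`, and the instance type `mem2_ssd1_v2_x16` are illustrative, and the helper `build_table_exporter_command` is hypothetical.

```python
import shlex


def build_table_exporter_command(dataset, fields_file, output_name, instance_type):
    """Assemble a `dx run` argument list (hypothetical helper, nothing is executed)."""
    return [
        "dx", "run", "table-exporter",
        # Input names below are assumed; verify them with `dx run table-exporter -h`.
        f"-idataset_or_cohort_or_dashboard={dataset}",
        f"-ifield_names_file_txt={fields_file}",
        "-ientity=participant",
        f"-ioutput={output_name}",
        # Request a larger-memory instance for the job.
        "--instance-type", instance_type,
    ]


cmd = build_table_exporter_command(
    dataset="record-XXXX",            # placeholder dataset ID
    fields_file="field_names.txt",    # placeholder file of field names
    output_name="lipid_fields",       # placeholder output name
    instance_type="mem2_ssd1_v2_x16", # example instance type, check the rate card
)
print(shlex.join(cmd))
```

Larger instance types cost more per hour, so it is worth checking the rate card before picking one.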

Former User of DNAx Community_44
- 07 July 2023 23:09
Thank you, that worked like a charm!
