Help avoiding memory issues with Table Exporter
I want to download EHR data from UK Biobank. I have extracted the field names as described here: https://github.com/dnanexus/OpenBio/blob/master/dxdata/getting_started_with_dxdata.ipynb
I added the following code after completing the "Explore Entity Object" code block:
participant_entity = dataset["participant"]
field_titles = [field.title for field in participant_entity.fields]
field_titles_str = ";;;; ".join(field_titles)
print(field_titles_str)
This produces a list of all field titles in the "participant" entity, which I reformatted for Table Exporter. The list totaled around 28k field titles. I then tried to set up a Table Exporter query as described here: https://community.dnanexus.com/s/question/0D5t000004SBm0eCAD/query-of-the-week-1-export-phenotypic-data-to-a-file
I received an out-of-memory error. I then tried a more modest search of just the field titles that contained "LDL", "HDL", "Cholest", "heart", or "Lipoprotein". This resulted in about 1000 field titles, but I still received memory errors. Do I need to buy more memory? Does Table Exporter not support large queries? Am I going about this in a fundamentally wrong way? Thank you for the help! It will be greatly appreciated!
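For reference, the keyword filter described above can be sketched roughly as follows. This is only an illustration, not the exact code that was run: the helper name filter_titles and the sample titles are hypothetical placeholders, and in the notebook the input would be the field_titles list built earlier. The match is a case-insensitive substring test.

```python
def filter_titles(titles, keywords):
    """Return the titles that contain any keyword (case-insensitive substring match)."""
    lowered = [k.lower() for k in keywords]
    return [t for t in titles if any(k in t.lower() for k in lowered)]


# Keywords taken from the post above.
keywords = ["LDL", "HDL", "Cholest", "heart", "Lipoprotein"]

# Hypothetical placeholder titles for illustration only;
# in the notebook, pass the real field_titles list instead.
sample_titles = [
    "LDL example field",
    "Unrelated example field",
    "Cholesterol example field",
    "Another unrelated field",
]

matched = filter_titles(sample_titles, keywords)

# Join with the same separator used in the earlier snippet.
print(";;;; ".join(matched))
```

Filtering this way only shortens the field list; as the rest of the thread shows, a query of about 1000 fields still ran out of memory, so it is not a substitute for a larger instance.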
Comments
Hello @Isaac Bishof, have you tried changing the instance type of your job to get more memory? You can select from the list of instance types. The rate card is here: https://20779781.fs1.hubspotusercontent-na1.net/hubfs/20779781/Product%20Team%20Folder/Rate%20Cards/BiobankResearchAnalysisPlatform_Rate%20Card_Current.pdf
Thank you, that worked like a charm!