Dictionary no longer contains showcase category
Previously, the UK Biobank provided a data dictionary at https://biobank.ndph.ox.ac.uk/~bbdatan/Data_Dictionary_Showcase.csv (now a defunct link).
Now, the method to obtain the dictionary and coding lookups is to run:
dx extract_dataset project-xxxx:record-yyyy -ddd --delimiter ","
However, the dictionary provided by this command is missing a key piece of data that used to be provided:
- It does not contain the numerical Category (e.g., Category 100071 for the Verbal Interview). This is an important way to group similar field_id values that is currently missing.
- A more minor issue: it does not cleanly map one field_id to one coding_file_id, because a field with multiple instances may appear on multiple rows. This can be worked around programmatically.
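As a rough illustration of the programmatic workaround for the second point, the rows can be collapsed to one entry per field. This is only a sketch: the column names field_id and coding_file_id follow the wording used in this post and are assumptions (the headers in the exported dictionary may differ), and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict


def codings_by_field(rows):
    """Map each field_id to a sorted list of its distinct, non-empty coding_file_id values."""
    mapping = defaultdict(set)
    for row in rows:
        codings = mapping[row["field_id"]]  # creates the entry even if no coding is set
        coding = (row.get("coding_file_id") or "").strip()
        if coding:
            codings.add(coding)
    return {field_id: sorted(codings) for field_id, codings in mapping.items()}


# Invented sample: field 1 appears once per instance, field 2 has no coding.
SAMPLE_DICTIONARY = """field_id,instance,coding_file_id
1,0,c_a
1,1,c_a
2,0,
3,0,c_b
"""

dictionary_rows = list(csv.DictReader(io.StringIO(SAMPLE_DICTIONARY)))
field_to_codings = codings_by_field(dictionary_rows)
print(field_to_codings)
```

Any field that ends up with more than one coding in its list would need a closer look, since that would mean its instances are coded differently.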
Is it possible for the UK Biobank to go back to including the showcase Category identifier in the data dictionary?
Comments
Hi James,
In a RAP project, the ‘Showcase Metadata’ folder contains valuable information. For this example, I recommend using the field.tsv file and joining it with the catbrowse.tsv file, matching the ‘main_category’ column of field.tsv to the ‘child_id’ column of catbrowse.tsv. This will allow you to filter for Category 100071 using the parent_id column and identify the related field_ids.
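The join described above could be sketched roughly as follows. It uses only the column names mentioned in this reply (field_id, main_category, child_id, parent_id); the sample rows, including the child category numbers and field_ids, are invented for illustration, and in practice the two files would be read from the ‘Showcase Metadata’ folder instead.

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-separated file into a list of dict rows (all values are strings)."""
    return list(csv.DictReader(handle, delimiter="\t"))


def fields_under_category(field_rows, catbrowse_rows, parent_id):
    """Return the field.tsv rows whose main_category is a direct child of parent_id."""
    parent_id = str(parent_id)
    child_ids = {
        row["child_id"] for row in catbrowse_rows if row["parent_id"] == parent_id
    }
    return [row for row in field_rows if row["main_category"] in child_ids]


# Invented sample data standing in for field.tsv and catbrowse.tsv.
FIELD_TSV = (
    "field_id\ttitle\tmain_category\n"
    "1\tExample field A\t900001\n"
    "2\tExample field B\t900002\n"
    "3\tExample field C\t900003\n"
)
CATBROWSE_TSV = (
    "parent_id\tchild_id\n"
    "100071\t900001\n"
    "100071\t900002\n"
    "555555\t900003\n"
)

field_rows = read_tsv(io.StringIO(FIELD_TSV))
catbrowse_rows = read_tsv(io.StringIO(CATBROWSE_TSV))
matched = fields_under_category(field_rows, catbrowse_rows, 100071)
print([row["field_id"] for row in matched])
```

Note that this sketch only picks up fields in categories that are direct children of the chosen category. Fields attached to the category itself, or to categories nested more deeply, would need an extra step.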
I am unclear on the minor issue you have raised. Could you please expand on this?
I hope this helps. Thank you for using the community forum.