Collecting data
Information about health outcomes in UK Biobank comes from various places, including:
1. Self-reported data collected at the UK Biobank assessment centre visits
2. Linkage to electronic health records:
a. primary care records
b. inpatient hospital records
c. cancer registrations
d. death registrations
3. Self-reported data collected from online questionnaires for conditions that are often not captured in health records (e.g. mental health, gastrointestinal conditions and pain)
Primary care data
Primary care data for ~230,000 UK Biobank participants (up to 2016 or 2017, depending on the data supplier) were made available in 2019. The dataset comes from the GP system suppliers and contains coded clinical events (including consultations, diagnoses, procedures and laboratory tests), prescribed medications (including prescription date, drug code and, where available, drug name and quantity) and a range of administrative codes (e.g. referrals to specialist hospital clinics). The data are coded using READ2, CTV-3, BNF and DM+D.
Following the withdrawal of the UK Government's Control of Patient Information (COPI) regulation on 1 July 2022, the additional primary care data that had been made available for COVID-19 research are no longer available.
Hospital inpatient data
Hospital inpatient data are available for the full cohort. They provide information on each participant's hospital admissions, including the date of admission, diagnoses (and underlying conditions) during admission, procedures and discharge information. Diagnoses are coded using ICD-9 and ICD-10, and procedures using OPCS-3 and OPCS-4. Please refer to resource 138483 for more information on the inpatient data.
For more details on how the data were collected, mapped and validated, on recent changes to the data structure, and on how to access the hospital inpatient data, please refer to the Linked Health Data sections of our Essential Information page on Showcase.
First occurrences of medical conditions
A set of ‘first occurrence’ data-fields have been generated that map the clinical codes from primary care, hospital inpatient admissions, death records and self-reported medical conditions to 3-character ICD-10 codes and provide, for each participant, the date that code first occurred in any source. For more information please see:
Death data
Linkage to national death registries provides notifications of participant deaths (if in the UK), containing data on date and cause(s) of death. Causes of death are coded using ICD-10. Further information can be found in resource 115559.
Cancer data
Linkage to national cancer registries provides notifications of cancer registrations and includes data on the cancer diagnosis (coded using ICD-9 and ICD-10) and the cancer histology code. Further information can be found in resource 115558.
Information on the most common cancers by age, time period and sex can be found in Showcase. The number of prevalent (i.e. occurring before recruitment) and incident (i.e. occurring after recruitment) cancer diagnoses by type of cancer can be found in category 100092 of Showcase and on the Essential Information page.
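As an illustration of the prevalent/incident distinction, the following minimal Python sketch labels each cancer diagnosis by comparing its date with the participant's recruitment date. The function name, the participant IDs, the codes and the dates are all invented for this example and are not UK Biobank field names or data. The handling of a diagnosis dated on the recruitment day itself is an assumption, since the text above does not specify it.

```python
from datetime import date


def classify_cancer_diagnosis(diagnosis_date: date, recruitment_date: date) -> str:
    """Label a diagnosis as 'prevalent' (before recruitment) or 'incident'.

    Assumption: a diagnosis dated on the recruitment day is treated as
    incident; the source text does not say how this boundary is handled.
    """
    return "prevalent" if diagnosis_date < recruitment_date else "incident"


# Hypothetical recruitment dates keyed by an invented participant ID.
recruited = {1001: date(2008, 5, 14), 1002: date(2009, 11, 2)}

# Hypothetical registry rows: (participant ID, ICD-10 code, diagnosis date).
diagnoses = [
    (1001, "C50", date(2005, 3, 1)),
    (1001, "C44", date(2012, 7, 19)),
    (1002, "C61", date(2015, 1, 8)),
]

labels = [
    (pid, code, classify_cancer_diagnosis(diagnosed, recruited[pid]))
    for pid, code, diagnosed in diagnoses
]

for row in labels:
    print(row)
```

In a real analysis the recruitment date and diagnosis dates would come from the relevant Showcase data-fields, and the boundary rule should be checked against the documentation for category 100092.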
Current censor dates for hospital inpatient data, death registry and cancer registry data can also be found in Showcase.
Algorithmically-defined health outcomes
To aid researchers, UK Biobank has generated algorithmically-defined health outcomes from the self-reported health information, hospital inpatient data and death data. These provide, for each participant, information on the first diagnosis of a small number of health conditions. For more information please see:
- Category 42 – algorithmically-defined health outcomes
- Resource 460 – information on algorithm development
- Resource 594 – code lists for currently available outcomes
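The general idea of deriving a first diagnosis from several sources can be sketched in Python as below. This is only a simplified illustration, not UK Biobank's actual algorithms, which are documented in Resource 460. It assumes the records for one health condition have already been identified in each source, and it simply keeps the earliest date per participant along with the source that reported it. The record layout, source labels, participant IDs and dates are all invented for this example.

```python
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical record layout: (participant ID, source label, event date).
Record = Tuple[int, str, date]


def first_diagnosis(records: List[Record]) -> Dict[int, Tuple[date, str]]:
    """Return, per participant, the earliest event date and its source.

    If two sources report the same earliest date, the record seen first
    in the input list is kept (an arbitrary choice made for this sketch).
    """
    earliest: Dict[int, Tuple[date, str]] = {}
    for pid, source, event_date in records:
        current = earliest.get(pid)
        if current is None or event_date < current[0]:
            earliest[pid] = (event_date, source)
    return earliest


# Invented records for a single condition, pooled from three sources.
records = [
    (1001, "self_report", date(2007, 6, 1)),
    (1001, "hospital_inpatient", date(2010, 2, 15)),
    (1002, "hospital_inpatient", date(2013, 9, 30)),
    (1002, "death", date(2014, 1, 4)),
]

result = first_diagnosis(records)
for pid, (event_date, source) in sorted(result.items()):
    print(pid, event_date.isoformat(), source)
```

Keeping the source alongside the date makes it possible to see afterwards which type of record established the first diagnosis for each participant.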