Records in HES inpatient main dataset and ICD10 Diagnoses
Hello everyone!!
I'm trying to find a way to retrieve all episodes from the HES inpatient main dataset along with their corresponding ICD10 diagnoses. I extracted the eid, epistart, epiend, and epidur fields from the HES main using Spark SQL, and used pheno.find_field for the “Diagnoses - main ICD10” and “Diagnoses - secondary ICD10” fields to identify the reasons for hospitalization.
To verify that everything is correct, I checked the number of entries in each list and noticed that they differ, whereas I expected them to match. How is this possible? Am I misunderstanding the data fields?
Below you can see the differences between each field:
1st col: Diagnoses - main ICD10
2nd col: Diagnoses - secondary ICD10
3rd col: Number of epistart dates

Comments
Hi Emmanouil,
The example you have provided sounds expected to us.
Every hospital visit has one main diagnosis, but it can have multiple secondary diagnoses as well as multiple epistart dates.
Most hospital visits have only one episode, but occasionally there can be more than one episode in a single admission to hospital.
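As a rough illustration of why the three counts need not match, here is a minimal sketch in plain Python. The records, the ICD10 codes, and the `count_per_participant` helper are all made up for the example and do not reflect the actual HES table layout; it only assumes, as described above, that each visit carries one main diagnosis, any number of secondary diagnoses, and one or more epistart dates.

```python
from collections import defaultdict

# Hypothetical toy data: one dict per hospital visit (not real HES records).
admissions = [
    {"eid": 1, "main_icd10": "I21", "secondary_icd10": ["E11", "I10"],
     "epistart": ["2010-01-05", "2010-01-07"]},  # two episodes in one visit
    {"eid": 1, "main_icd10": "K35", "secondary_icd10": [],
     "epistart": ["2014-06-20"]},
    {"eid": 2, "main_icd10": "J18", "secondary_icd10": ["J44"],
     "epistart": ["2012-03-01"]},
]


def count_per_participant(admissions):
    """Count main diagnoses, secondary diagnoses and epistart dates per eid."""
    counts = defaultdict(lambda: {"main": 0, "secondary": 0, "epistart": 0})
    for adm in admissions:
        c = counts[adm["eid"]]
        c["main"] += 1                                # one main diagnosis per visit
        c["secondary"] += len(adm["secondary_icd10"])  # zero or more per visit
        c["epistart"] += len(adm["epistart"])          # one or more per visit
    return dict(counts)


counts = count_per_participant(admissions)
for eid, c in sorted(counts.items()):
    print(eid, c["main"], c["secondary"], c["epistart"])
```

In this toy case participant 1 ends up with 2 main diagnoses, 2 secondary diagnoses and 3 epistart dates, so the three columns differ even though nothing is missing.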
I hope this provides some context, thank you for using the forum!