Records in HES inpatient main dataset and ICD10 Diagnoses

Hello everyone!!

I'm trying to find a way to retrieve all episodes from the HES inpatient main dataset along with their corresponding ICD10 diagnoses. I extracted the eid, epistart, epiend, and epidur fields from the HES main dataset using Spark SQL, and used pheno.find_field for the “Diagnoses - main ICD10” and “Diagnoses - secondary ICD10” fields to identify the reasons for hospitalization.

To verify that everything is correct, I checked the number of entries in each list and noticed that they differ, whereas I expected them to match. How is this possible? Am I misunderstanding the data fields?

Below you can see the differences between the fields:

1st col: Diagnoses - main ICD10
2nd col: Diagnoses - secondary ICD10
3rd col: Number of epistart dates

Comments

  • Bethan (Data Analyst)

    Hi Emmanouil,

    The result you describe is what we would expect.

    Every hospital visit has one main diagnosis, but it can have multiple secondary diagnoses, as well as multiple epistart dates.

    Most hospital visits have only one episode, but occasionally there can be more than one episode in a single admission to hospital. 

    I hope this provides some context. Thank you for using the forum!
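
To make the point above concrete, here is a minimal sketch in plain Python using made-up episode records. The record layout, the key names (`main`, `secondary`) and the ICD10 codes are hypothetical and are not the actual HES schema. It also assumes that the participant-level diagnosis lists hold each distinct code once, which the thread does not confirm. Under those assumptions, the three counts for a participant have no reason to match.

```python
from collections import defaultdict

# Hypothetical toy data: one record per episode (not the real HES layout).
episodes = [
    {"eid": 1, "epistart": "2010-03-01", "main": "I21", "secondary": ["E11", "I10"]},
    {"eid": 1, "epistart": "2010-03-04", "main": "I21", "secondary": ["E11"]},
    {"eid": 1, "epistart": "2015-09-20", "main": "I50", "secondary": ["N18"]},
    {"eid": 2, "epistart": "2012-07-15", "main": "J18", "secondary": []},
]


def count_per_participant(rows):
    """Return {eid: (distinct main codes, distinct secondary codes, epistart dates)}."""
    main_codes = defaultdict(set)
    secondary_codes = defaultdict(set)
    epistarts = defaultdict(list)
    for row in rows:
        eid = row["eid"]
        main_codes[eid].add(row["main"])                 # one main code per episode
        secondary_codes[eid].update(row["secondary"])    # zero or more per episode
        epistarts[eid].append(row["epistart"])           # one start date per episode
    return {
        eid: (len(main_codes[eid]), len(secondary_codes[eid]), len(epistarts[eid]))
        for eid in sorted(epistarts)
    }


counts = count_per_participant(episodes)
for eid, (n_main, n_secondary, n_epistart) in counts.items():
    print(f"eid={eid} main={n_main} secondary={n_secondary} epistart={n_epistart}")
```

In this toy example, participant 1 has three episodes but only two distinct main codes and three distinct secondary codes, so comparing list lengths across the three fields is not a reliable consistency check.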
