How to access patient diagnoses among subset for which OLINK proteomics data exists?

I'm not sure if this is the kind of thing you can help me with, and apologies if it's naive, but I'll ask anyway.

 

Given that the group of UKB participants for whom OLINK plasma proteomic data exists is a subset of the total number of participants (around 10%: ~55,000 out of ~500,000), I am particularly interested in knowing how many of that subset have a particular diagnosis, for example unipolar depression. Judging by a conservative population estimate of depression prevalence of around 10%, I might expect that ~5,000 of those patients have been diagnosed with depression or have had a depressive episode at some point in their life (if the subset is representative of the entire cohort). The exact number with a particular diagnosis will determine whether it is worth it for my research group to pay for access to this data. Does that make sense? Is there any way I can look up this kind of thing from the existing data without applying and paying Tier 2 pricing to get full access to the proteomics data?

 

Thank you!

Comments

  • Rachael W (UKB Community team, Data Analyst)

    It is not possible to discover the cross-tab(s) until there is an active project. However, UKB data analysts could provide an estimate for you. Please note that we would not release numbers where a group has fewer than a minimum number of participants, to avoid releasing possibly-identifiable information into the wider community.

    Previously, researchers requiring cross-tabs have contacted the AMS Access team, but if you post your question here we could reply to your AMS-registered email.

    UKB data comes from several different sources, including online questionnaires, linked NHS hospital records, and an interview at initial assessment. There are also some summary fields that attempt to combine the different sources. It is important that you spend significant time understanding the data, as described in Showcase (https://biobank.ndph.ox.ac.uk/showcase/index.cgi). In particular, the definition of "depression" is not simple. A search for "depression" in Showcase Search brings up 50 fields in several categories. If you would like to provide the exact SQL query you need, we could run it for you. Otherwise, please ensure that you have specified which fields and what dates you are interested in.

    The OLINK data is more straightforward, but you might still want to think about whether you only want participants with valid results for all proteins.

  • Rachael W (UKB Community team, Data Analyst)

    Showcase Resource 4657 specifies that 6500 Olink participants were selected non-randomly by the consortium, and that some of batch 7 were non-random Covid-19-study samples. I believe this means that more than 42000 of the Olink participants were a randomly-selected subset of the UKB cohort.

  • Rachael W (UKB Community team, Data Analyst)

    .... but the UKB cohort is not a randomly-selected subset of the UK population.

  • Former User of DNAx Community_6

    Thank you for your detailed answer!

    If it's not too much trouble, I would love to know the cross-tabs for Data-Field 20126, Description: Bipolar and major depression status (https://biobank.ndph.ox.ac.uk/showcase/field.cgi?id=20126).

    Specifically, how many OLINK participants fall into each of the possible answer categories (provided the number is not too low to be identifying)?

    I realize that there are other data fields that might give a more complete answer to my question, but this would be a good starting point.

     

  • Rachael W (UKB Community team, Data Analyst)

    Hi Elias, sorry about the delay: I've been ill. I see that you sent an email to Access, and one of my colleagues has answered your query.

  • Former User of DNAx Community_6

    hope you're feeling better! yes my question has been answered! thank you

  • Rachael W (UKB Community team, Data Analyst)

    See also the Nature paper published today, which says that 46595 of the Olink cohort were randomly selected.

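
For readers wondering what a cross-tab with small-group suppression (as described in the first reply) might look like in practice, here is a minimal Python sketch. It is not UKB's actual procedure or query: the record layout, the `has_olink` and `category` keys, the placeholder category labels "A" and "B", and the `min_cell` threshold are all assumptions made for illustration, and the data is entirely synthetic. The real answer categories for Data-Field 20126 are listed in Showcase, and the thread does not state what minimum group size UKB applies.

```python
from collections import Counter


def suppressed_counts(records, min_cell):
    """Count Olink participants per category, hiding groups below min_cell.

    records:  iterable of dicts with hypothetical keys 'has_olink' (bool)
              and 'category' (a label, or None when the field is missing).
    min_cell: assumed disclosure threshold; smaller groups are reported
              as None instead of an exact count.
    """
    counts = Counter(
        r["category"]
        for r in records
        if r["has_olink"] and r["category"] is not None
    )
    return {cat: (n if n >= min_cell else None) for cat, n in counts.items()}


# Synthetic, made-up records (not UKB data).
records = (
    [{"eid": i, "has_olink": True, "category": "A"} for i in range(12)]
    + [{"eid": 100 + i, "has_olink": True, "category": "B"} for i in range(3)]
    + [{"eid": 200 + i, "has_olink": False, "category": "A"} for i in range(20)]
    + [{"eid": 300, "has_olink": True, "category": None}]
)

result = suppressed_counts(records, min_cell=5)
print(result)  # {'A': 12, 'B': None}
```

In this toy data, category "B" has only 3 Olink participants, so its count is withheld, while participants without Olink data or without a recorded category are excluded before counting. A real request would also need to say which fields and dates define the diagnosis, and whether only participants with valid results for all proteins should be included.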
