DNAnexus OR/AND operator

I’m trying to create multiple cohorts similar to Hou et al. (2023), including “pure depression”, “pure anxiety”, and “comorbid depression + anxiety”.

To get the “depressed individuals” cohort, I need individuals who are either unenthusiastic OR disinterested, combined with additional criteria (e.g., age ≥ 18). This means I need both OR and AND logic in the same cohort definition.

However, in the DNAnexus cohort browser, I’m unable to combine OR and AND operators within a single filter (i.e., I can’t group conditions like: (depressed mood OR unenthusiastic) AND other criteria).

As a workaround, I created two separate cohorts (“unenthusiastic” and “disinterested”) and merged them using UNION to approximate the OR condition. But I run into issues when extending this:

  • I need to do the same for anxiety
  • Then create a comorbid (depression + anxiety) group by combining the depression and anxiety cohorts
  • However, DNAnexus does not allow combining already merged cohorts, so I can’t create the comorbid depression + anxiety group

I’ve checked the documentation, but the approach it describes doesn’t seem to work for me.

Does anyone know:

  • if there is a way to apply grouped logic (AND/OR) within the cohort browser, or
  • a recommended workaround for creating these cohorts?

Thanks in advance!

Comments

  • Rachael W (UKB Community team, Data Analyst)

    Dear Sarah,

    The cohort browser is nice for preliminary analysis, but it has some limitations. At some point you will need to export some of the relevant data from the Parquet database and perform your main analysis using a different process, such as R or Python notebooks within a JupyterLab instance.

    Some researchers use the Table Exporter tool (see the Tools tab) to export the relevant data subsets to a CSV file and then use a JupyterLab instance for further processing. You could export the necessary fields for the participants in each of the cohorts that you have created so far. Other researchers prefer to use a Spark JupyterLab and Spark commands to interact with the Parquet database. There are some template Spark JupyterLab notebooks available at https://github.com/UK-Biobank/UKB-RAP-Notebooks-Access
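
    As a rough illustration of the export-then-process route, here is a minimal pandas sketch of the grouped logic, (A OR B) AND C, which the cohort browser does not allow in a single filter. Everything in it is an assumption made for illustration: the column names, the toy values, the anxiety items, and the definition of "pure" as meeting one set of criteria but not the other. A real Table Exporter file will have its own field names and codings, and the cohort definitions should follow Hou et al. (2023).

```python
import pandas as pd

# Toy stand-in for an exported CSV. Column names and values are hypothetical;
# with a real export you would start from pd.read_csv(<your exported file>).
df = pd.DataFrame({
    "eid":            [1, 2, 3, 4, 5, 6],
    "age":            [45, 17, 60, 52, 38, 70],
    "unenthusiastic": [True, True, False, False, True, False],
    "disinterested":  [False, False, True, False, False, False],
    "anxious":        [False, True, True, True, False, False],   # hypothetical anxiety item
    "worried":        [False, False, False, False, True, False], # hypothetical anxiety item
})

# Shared criterion applied to every cohort (the AND part).
adult = df["age"] >= 18

# Grouped logic: (item A OR item B) AND shared criteria.
depressed = (df["unenthusiastic"] | df["disinterested"]) & adult
anxious = (df["anxious"] | df["worried"]) & adult

# Assumed definitions: "pure" means one condition without the other,
# "comorbid" means both conditions together.
cohorts = {
    "pure_depression": set(df.loc[depressed & ~anxious, "eid"].tolist()),
    "pure_anxiety":    set(df.loc[anxious & ~depressed, "eid"].tolist()),
    "comorbid":        set(df.loc[depressed & anxious, "eid"].tolist()),
}

for name, eids in cohorts.items():
    print(name, sorted(eids))
```

    Because each cohort ends up as a plain set of participant IDs, further unions and intersections (for example, combining two groups that are themselves the result of a merge) are ordinary set operations, with no restriction on nesting.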

    I do not know whether the DNAnexus documentation about Join Filters would be applicable to the UKB-RAP. The UKB-RAP is only one of several DNAnexus platforms, and it differs from the others in a few specific ways. It is possible that one of these differences prevents the Join Filters from working.
