Query of the week #4: Working with JSON and jq to get more details about your cohort(s)
Hi folks!
It is a new Friday, so it is time for a new Query of the week. Sometimes I need to debug the cohorts I have created, and knowing a few tricks comes in handy, so I have decided to show you some of them.
We are going to work with a cohort I created in Cohort Browser. I am planning to use the cohort from Query of the week #2.
After we save our cohort to the project, we can click on the object and look at its details in the graphical interface. However, if you want much more information, a programmatic approach is what you are looking for.
Here, the trick is to make a so-called API call. With API calls, you can control almost every function available on the platform. The details and the API specification can be found at
https://documentation.dnanexus.com/developer/api
We will demonstrate the structure of an API call with the following example:
dx api record-ABCDEFGHI123456 describe '{"fields":{"details":true}}'
- dx api - the "dx api" command, which is part of the dx-toolkit
- record-ABCDEFGHI123456 - the object of interest, i.e. the cohort ID
- describe - the API method we want to apply to the cohort
- '{"fields":{"details":true}}' - the JSON payload, which specifies the fields we want in the output
JSON is a key/value format for describing the properties of an object [1].
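As a minimal sketch of the call structure described above, the payload can also be built with jq (introduced in example c) below) instead of being typed by hand, which helps avoid quoting mistakes. The variable names are my own choice for illustration, and the record ID is the same placeholder used throughout this post.

```shell
# Placeholder record ID from this post; replace it with your own cohort ID.
record_id="record-ABCDEFGHI123456"

# Build the JSON payload with jq:
#   -n  start without reading any input
#   -c  print compact, single-line JSON
payload=$(jq -n -c '{fields: {details: true}}')
echo "$payload"

# The call itself needs a logged-in dx-toolkit session, so it is only shown here:
#   dx api "$record_id" describe "$payload"
```

The payload is a nested key/value structure: the key "fields" maps to another object whose key "details" is set to true.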
Let's finish with the theory and look at some example commands/queries (try them yourself!). Below you will find some illustrative examples of the types of information that can be obtained about our cohort:
a) dx api record-ABCDEFGHI123456 describe
This returns a basic set of data in JSON format about our cohort.
b) dx api record-ABCDEFGHI123456 describe '{"fields":{"details":true}}'
Same as in a), but because we requested "details":true, the response includes the record's details, which hold much more information about the cohort.
c) dx api record-ABCDEFGHI123456 describe '{"fields":{"details":true}}' | jq .details.sql
Same as in b), but this time we parse the JSON result with jq. jq is like sed for JSON data - you can use it to slice, filter, map and transform structured data [2]. You will first need to install jq on your system.
jq supports filters, for example:
- .[0] - selects the first item in a JSON array
- | - passes the output of one filter into the next filter
- { } - constructs a JSON object
With the command in c), we extracted the SQL query that represents our cohort in Cohort Browser.
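The three jq filters listed above can be tried without touching the platform at all. The sketch below runs them on a small made-up JSON array; the names and numbers in it are hypothetical sample data, not real platform output.

```shell
# Hypothetical sample data, not real platform output.
sample='[{"name":"cohort_a","size":120},{"name":"cohort_b","size":45}]'

# .[0] selects the first item of the array.
first=$(echo "$sample" | jq -c '.[0]')
echo "$first"

# | passes the output of one filter into the next:
# take the first item, then read its "name" key (-r prints it without quotes).
name=$(echo "$sample" | jq -r '.[0] | .name')
echo "$name"

# { } constructs a new JSON object from selected values.
summary=$(echo "$sample" | jq -c '.[1] | {cohort: .name, participants: .size}')
echo "$summary"
```

The same filters can be chained after any "dx api ... describe" call, since its output is also JSON.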
d) dx api record-ABCDEFGHI123456 describe '{"fields":{"details":true}}' | jq .details.filters.pheno_filters
Similar to c), but this query shows the filters applied in Cohort Browser.
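When exploring a cohort, it can be convenient to save the response to a file once and then run several jq queries against it. The sketch below assumes a file name of my own choosing (cohort.json) and uses a heavily simplified, hypothetical stand-in for the response so that it runs without a platform login; a real response contains many more fields and real values.

```shell
# In real use, save the response once (requires a logged-in dx-toolkit session):
#   dx api record-ABCDEFGHI123456 describe '{"fields":{"details":true}}' > cohort.json
# Here we write a hypothetical, simplified stand-in instead.
cat > cohort.json <<'EOF'
{
  "id": "record-ABCDEFGHI123456",
  "details": {
    "sql": "SELECT placeholder_column FROM placeholder_table",
    "filters": {
      "pheno_filters": {"placeholder_key": "placeholder_value"}
    }
  }
}
EOF

# List the keys under "details" to see what is available (keys returns them sorted).
detail_keys=$(jq -c '.details | keys' cohort.json)
echo "$detail_keys"

# -r prints the SQL string without surrounding quotes, ready to copy elsewhere.
sql=$(jq -r '.details.sql' cohort.json)
echo "$sql"

# Same path as in d), printed compactly.
pheno=$(jq -c '.details.filters.pheno_filters' cohort.json)
echo "$pheno"
```

Starting with keys is a handy way to find your way around an unfamiliar response before drilling into a specific path.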
If you are interested in this topic and want a comprehensive tutorial on JSON and jq on DNAnexus, I definitely recommend reading Ted's chapter [3].
Hope you find it useful :).
References
[1] https://www.json.org/json-en.html
[2] https://stedolan.github.io/jq/
[3] https://laderast.github.io/bash_for_bioinformatics/06-JSON.html
Comments
It is also possible to call "dx describe" directly instead of "dx api ... describe". By default, "dx describe" does not return a JSON object. If you would like to get one, you will need to add --json:
dx describe record-ABCDEFGHI123456 --details --json
https://documentation.dnanexus.com/user/helpstrings-of-sdk-command-line-utilities
Which approach do you prefer and why?
Thanks so much Ondrej,
One of the things I'm interested in doing is re-running an analysis after adjusting the input, and this is a great trick that I can use for updating the JSON and re-submitting the job!
I'd also be curious to learn how other folks are using these JSON files.