Faster way to analyse DRAGEN BGEN data?
Dear community,
We ran a REGENIE (v4.1) association analysis on five binary phenotypes using the DRAGEN WGS BGEN data (ukb24309), which encodes variants in 16 bits. We provided a list of 27,000 variants and set --bsize to 3000, giving nine blocks. REGENIE took 163 to 171 seconds per block:
- block [1/9] : done (171849ms)
- block [2/9] : done (166482ms)
- block [3/9] : done (164349ms)
- block [4/9] : done (164871ms)
- block [5/9] : done (164758ms)
- block [6/9] : done (164454ms)
- block [7/9] : done (164123ms)
- block [8/9] : done (163913ms)
- block [9/9] : done (164799ms)
We then ran the same REGENIE v4.1 analysis on the array-imputed BGEN data (ukb22828), which uses 8-bit encoding for variants, keeping the same phenotype and covariate files and supplying a variant list of the same size. In this case, REGENIE took only 31 to 42 seconds per block:
- block [1/9] : done (42303ms)
- block [2/9] : done (36730ms)
- block [3/9] : done (35912ms)
- block [4/9] : done (39158ms)
- block [5/9] : done (39945ms)
- block [6/9] : done (35389ms)
- block [7/9] : done (33289ms)
- block [8/9] : done (31782ms)
- block [9/9] : done (33929ms)
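To put a number on the gap, the per-block timings from the two logs above can be summarised with a few lines of Python. This is only a rough sketch: the log lines are copied from this post, and the helper name `block_ms` is our own, not part of REGENIE.

```python
import re
from statistics import mean

# Per-block log lines copied from the two REGENIE runs in this post.
dragen_log = """
- block [1/9] : done (171849ms)
- block [2/9] : done (166482ms)
- block [3/9] : done (164349ms)
- block [4/9] : done (164871ms)
- block [5/9] : done (164758ms)
- block [6/9] : done (164454ms)
- block [7/9] : done (164123ms)
- block [8/9] : done (163913ms)
- block [9/9] : done (164799ms)
"""

imputed_log = """
- block [1/9] : done (42303ms)
- block [2/9] : done (36730ms)
- block [3/9] : done (35912ms)
- block [4/9] : done (39158ms)
- block [5/9] : done (39945ms)
- block [6/9] : done (35389ms)
- block [7/9] : done (33289ms)
- block [8/9] : done (31782ms)
- block [9/9] : done (33929ms)
"""


def block_ms(log: str) -> list[int]:
    """Extract the per-block wall time in milliseconds from the log text."""
    return [int(ms) for ms in re.findall(r"done \((\d+)ms\)", log)]


dragen_ms = block_ms(dragen_log)
imputed_ms = block_ms(imputed_log)

# Ratio of mean per-block times (same number of blocks in both runs).
ratio = mean(dragen_ms) / mean(imputed_ms)

print(f"DRAGEN  mean per block: {mean(dragen_ms) / 1000:.1f} s")
print(f"Imputed mean per block: {mean(imputed_ms) / 1000:.1f} s")
print(f"Slowdown factor:        {ratio:.2f}x")
```

On these numbers the DRAGEN run averages about 165.5 seconds per block against about 36.5 seconds for the imputed data, so it is roughly 4.5 times slower for the same number of variants.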
We attribute the slower performance of REGENIE on the DRAGEN BGEN data to its 16-bit encoding of variants.
Have you also experienced slow performance with REGENIE on DRAGEN BGEN data?
If you have found a faster way to process the DRAGEN data, we would greatly appreciate it if you could share it.
Thanks
Wei-Yu