GLnexus error message

Zeid Kuzbari

Hello,

I plan to joint-genotype our data (≥20k cases) with the UKB data (≥200k controls) for a WES case-control analysis. Since GLnexus does not scale well to large cohorts, I figured the most cost-efficient approach would be to take the UKB pVCFs, filter them down to the 200k samples I need, and then joint-genotype the result with my own pVCF file containing the 20k cases. I think this should be easier than using individual gVCF files, especially since the pVCF files are already segmented into ~1000 chunks.

To test one segment and extrapolate the cost of running all of them, I ran GLnexus on a UKB pVCF file with 200k UKB samples and another pVCF file with 20k non-overlapping UKB samples. I used my own BED file, which targets the gene-coding regions, left all options at their defaults, and picked a high-memory instance. After GLnexus had run for about 10 hours, I got the following error message:

[GLnexus] [error] Failed to discover alleles: IOError: exception deserializing BCF bucket (capnp/arena.c++:127: failed: Exceeded message traversal limit. See capnp::ReaderOptions.

stack: 55898b589b10 55898b0519f1 55898b060633 55898b060795 55898b01e83a 55898b0136eb 55898af8e5aa 7f4c3a33834e 55898b0120ce 55898b011ce2 55898af98f4e 55898b62595f 7f4c3a32fea6 7f4c3a24fa2e)

Failed to read from standard input: unknown file type

I haven't been able to debug it. Did anyone come across this issue? Any ideas how to solve it?
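
For context, the extrapolation I have in mind is just the arithmetic below. This is a minimal sketch: the ~1000 chunks and the 10 hours come from the run described above (which failed, so 10 hours is at best a lower bound per chunk), while the hourly rate is a made-up placeholder and not a real instance price.

```python
def extrapolate_cost(hours_per_chunk, n_chunks, hourly_rate):
    """Scale the runtime of one test chunk to the whole chunked dataset.

    Returns (total_hours, total_cost). Assumes every chunk takes about
    as long as the test chunk, which may not hold in practice.
    """
    total_hours = hours_per_chunk * n_chunks
    total_cost = total_hours * hourly_rate
    return total_hours, total_cost


# 10 hours and ~1000 chunks are taken from the test run above;
# hourly_rate=1.0 is a hypothetical placeholder price per instance-hour.
total_hours, total_cost = extrapolate_cost(
    hours_per_chunk=10, n_chunks=1000, hourly_rate=1.0
)
print(f"Estimated instance-hours: {total_hours}")
print(f"Estimated cost (placeholder rate): {total_cost}")
```

Even at a low hourly rate this adds up quickly, which is why I would like to get a single chunk working before committing to the full run.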

 

Many thanks and best regards,

Zeid