Question regarding batches

Skh

I've been looking at the pVCF files (e.g. the DRAGEN ones) and have some questions about batches and how the VCF files are split.
In category 187 I saw that there are fields such as sequencing provider and shipment batch number.
Does UK Biobank or the community recommend including any of these (or other QC fields) as covariates in analyses of the WGS data?

Also, in this context, does the 'b' numbering of the VCF files (e.g. ukb24310_c22_b288_v1.vcf.gz, ukb24310_c22_b289_v1.vcf.gz, ...) carry any meaning, or are the files simply consecutive chunks of variants split for size reasons?
 
