WGS quality control (read depth specifically)

Kaitlyn Mary Price

Hello, we have started performing additional WGS quality control on the UKBB data. We noticed that the pVCFs have no field for read depth, and QUAL is always ‘.’. I understand that anything with a read depth (DP) < 1 was already removed. As with the UKBB WES data, I was hoping to further remove low-quality variants using DP < 7 for SNPs and DP < 10 for indels. Is this common practice for WGS? Does anyone have any tips on how to do this? I was looking into re-annotating DP from the single-sample VCFs. Thanks so much for the help!

Comments

1 comment

  • Comment author
    Federico Murgia

    Hi there, thanks for reaching out!

    For UKBB WGS, it’s expected that the pVCFs lack per-site DP and QUAL fields. The joint-calling pipeline typically strips these because site-level metrics aren’t always meaningful after aggregation, and the genotype-level fields (e.g., GP/DOSAGE) are the primary intended outputs.

    Applying additional depth-based filtering using per-sample DP is definitely a common practice in WGS QC, but it typically requires going back to the single-sample gVCFs or CRAMs/BAMs, since the pVCFs don’t retain the DP field. If you want to use thresholds like DP < 7 for SNPs and DP < 10 for indels, re-annotating sites from the single-sample VCFs is the right approach. Tools like bcftools, GATK’s VariantAnnotator, or even Hail can help re-extract depth information and generate updated annotations.
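
    As a rough illustration of the thresholds mentioned above, here is a minimal Python sketch that masks low-depth genotypes once per-sample DP values have been recovered from the single-sample files. The function names and the list-based inputs are hypothetical, chosen only for this example; this is not the UKBB pipeline's method, and a real workflow would more likely use bcftools or Hail for the same logic.

```python
# Hypothetical helper names; thresholds are the ones quoted in the question.
SNP_MIN_DP = 7      # genotypes at SNP sites need DP >= 7
INDEL_MIN_DP = 10   # genotypes at indel sites need DP >= 10

BASES = {"A", "C", "G", "T"}


def is_snp(ref, alt):
    """True when both alleles are single nucleotides."""
    return ref.upper() in BASES and alt.upper() in BASES


def min_dp_for(ref, alts):
    """Pick the depth threshold for a site from its alleles.

    Assumption: a site with any non-SNP alternate allele is treated
    as an indel site and gets the stricter threshold.
    """
    if all(is_snp(ref, alt) for alt in alts):
        return SNP_MIN_DP
    return INDEL_MIN_DP


def mask_low_depth(ref, alts, genotypes, depths):
    """Set genotypes to missing ("./.") when DP is absent or too low.

    genotypes and depths are parallel per-sample lists; a depth of
    None means no DP could be recovered for that sample.
    """
    threshold = min_dp_for(ref, alts)
    return [
        gt if dp is not None and dp >= threshold else "./."
        for gt, dp in zip(genotypes, depths)
    ]
```

    For example, mask_low_depth("A", ["G"], ["0/1", "1/1"], [6, 7]) keeps only the second genotype, because the first falls below the SNP threshold of 7.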

    A couple of tips:
    • If you can access the single-sample gVCFs, you can pull out DP for each variant and merge it back into the pVCF.
    • Alternatively, if you only have CRAMs/BAMs, you can compute depth directly with tools like samtools depth or mosdepth.
    • It’s also useful to inspect depth distributions first: WGS depth tends to be much more uniform than WES, so thresholds chosen for WES may not transfer directly and sometimes differ across pipelines.
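
    To make the last tip concrete, here is a small Python sketch that summarises a depth distribution from the three-column, tab-separated text that samtools depth writes (chromosome, position, depth). The function name and the choice of summary statistics are assumptions made for this example. Note that samtools depth omits zero-coverage positions unless run with -a, which changes the fractions reported below.

```python
import statistics


def summarize_depth(lines, thresholds=(7, 10)):
    """Summarise per-position depth from `samtools depth`-style lines.

    Each line is expected to be "chrom<TAB>pos<TAB>depth" for a single
    input file. Returns the number of positions, mean and median depth,
    and the fraction of positions below each threshold.
    """
    depths = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        depths.append(int(fields[2]))

    if not depths:
        raise ValueError("no depth records found")

    summary = {
        "n": len(depths),
        "mean": statistics.mean(depths),
        "median": statistics.median(depths),
    }
    for t in thresholds:
        # Fraction of positions that a DP < t filter would flag.
        summary[f"frac_below_{t}"] = sum(d < t for d in depths) / len(depths)
    return summary
```

    It could be used on a saved depth file with something like summarize_depth(open("sample.depth.txt")), where the file name is only a placeholder. Checking how many positions fall below 7 and 10 before filtering gives a sense of how much data each threshold would remove.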

    Hope that helps,

    Best,

    Federico
