Is it expected that some WGS GraphTyper pVCF files have no variants?

Permanently deleted user
/Bulk/Whole genome sequences/Whole genome GraphTyper joint call pVCF/ukb23352_c21_b82_v1.vcf.gz, for example. Also, is there a file available that describes which region is in each block, like there is for the exome data (https://biobank.ndph.ox.ac.uk/showcase/refer.cgi?id=837)? Much appreciated!

Comments

6 comments

  • What would I use to analyze intron variants in the WGS GraphTyper pVCF files? Would Swiss Army Knife do it? I want to know which eids have 3 to 5 CATC repeats at rs56041637.

  • Comment author
    Chai Fungtammasan DNAnexus Team

    @Steven Lehrer, bcftools inside Swiss Army Knife would be a good start.

    If you have a question that differs from the main one, please create a new question thread in the future. That makes the question easier to find for community members who have the same question or who could help with yours.

  • Can you tell me how to make the Docker image and what the command line should be?

  • Comment author
    Ondrej Klempir DNAnexus Team

    Rob, there is one file for every 50 kb of the genome. Some regions contain no variants, so the files for those regions are empty.

    You can see which variants belong to which regions in the qc_metrics_graphtyper_v2.7.1_qc.tab.gz file that is dispensed in the QC subfolder of field #23352.
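
    To confirm that a given block is one of the empty ones, a minimal sketch along these lines counts the non-header lines in a pVCF. It assumes the file has already been downloaded locally and that "empty" means header lines only; the function name is just for illustration.

```python
import gzip


def count_variant_records(path):
    """Count data lines (lines not starting with '#') in a gzip/bgzip-compressed VCF."""
    n_records = 0
    # bgzip output is a series of gzip members, which gzip.open reads transparently.
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and '##' meta lines / the '#CHROM' column header.
            if line.strip() and not line.startswith("#"):
                n_records += 1
    return n_records


# Example call (assumes the file sits in the working directory):
# count_variant_records("ukb23352_c21_b82_v1.vcf.gz")
```

    A count of zero would mean the file holds only its header, which matches the explanation above. This reads the whole file, so for large blocks a bcftools-based check may be faster.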

  • I already know which variants belong to which regions from https://www.ncbi.nlm.nih.gov/snp/rs56041637#variant_details

    I need to locate the UKB23352_c19 pVCF file that contains positions chr19:17642033-17642056. There are 1000 files. How do I do it?
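
    One way to narrow the search, given the one-file-per-50-kb layout mentioned in this thread, is to compute a candidate block number from the position. This is only a sketch under unconfirmed assumptions: that blocks are contiguous, start at the beginning of the chromosome, are numbered from 0, and that block b covers positions b*50000 through (b+1)*50000 - 1. The function names are hypothetical, and the file name pattern is copied from the example in the original question.

```python
BLOCK_SIZE = 50_000  # from the thread: one file per 50 kb of the genome


def block_index(position, block_size=BLOCK_SIZE):
    """Return the candidate block number for a genomic position.

    ASSUMPTION: blocks are contiguous from the chromosome start and numbered
    from 0; the exact boundary convention is not confirmed by the thread.
    """
    if position < 0:
        raise ValueError("position must be non-negative")
    return position // block_size


def candidate_filename(chrom, position):
    """Build a candidate pVCF file name using the pattern seen in the question."""
    return f"ukb23352_c{chrom}_b{block_index(position)}_v1.vcf.gz"


# Both ends of the region should be checked, since a region can span two blocks.
start_block = block_index(17642033)
end_block = block_index(17642056)
```

    The result is only a candidate. It should be verified against the QC table or by checking the first and last POS values in the file itself.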

     

  • Comment author
    Chai Fungtammasan DNAnexus Team

    Answered in a separate thread.

