How can I find the common variants in my VCF files?

Permanently deleted user
I'd like to find the common variants in my VCF files using SAK on RAP. By "common" I mean a variant that appears in at least two of the VCF files; it does not have to appear in 100% of them. Is this possible with SAK? If so, can anyone suggest a command line? Thank you in advance for your help.

Comments

3 comments

  • Rachael W (UKB Community team, Data Analyst)

    If all else fails, it looks as if bcftools isec could be used, but not with a single command. See https://samtools.github.io/bcftools/bcftools.html#isec.

     

    I think you would need to compare two files to start with, save the records that are in both files (common) and the records that are in one file but not the other (not common), then compare each of those not-common outputs with the third file, and so on.

     

    I think you would also need bgzip and tabix, which are in SAK.

     

    I don't know how long this would take, or how expensive it would be. There is probably a better way.

    Can anybody else suggest one?

     

    For Genetics queries, you could try asking ukb-genetics@jiscmail.ac.uk
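
As a rough sketch of the preparation step mentioned in this comment: bcftools isec expects bgzip-compressed, tabix-indexed inputs, so plain-text VCFs would first need something like the following. The function name `prepare_vcf` and the file names passed on the command line are illustrative assumptions, not part of SAK itself.

```shell
#!/bin/sh
# Compress and index plain-text VCFs so that bcftools isec can read them.
# prepare_vcf is a hypothetical helper name; pass your own file names.
prepare_vcf() {
    vcf="$1"
    # bgzip -c writes block-compressed output to stdout and keeps the input file
    bgzip -c "$vcf" > "$vcf.gz" || return 1
    # tabix -p vcf builds the .tbi index; -f overwrites an existing index
    tabix -f -p vcf "$vcf.gz"
}

# Process every file given on the command line (does nothing with no arguments).
for f in "$@"; do
    prepare_vcf "$f" || echo "failed to prepare $f" >&2
done
```

With the inputs prepared, a single pairwise comparison such as `bcftools isec -p pair_1_2 1.vcf.gz 2.vcf.gz` should write four files into the `pair_1_2` directory (a directory name chosen here only for illustration): records private to the first file, records private to the second, and the shared records as they appear in each file. The README.txt that bcftools writes alongside them lists which output is which.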

  • Rachael W (UKB Community team, Data Analyst)

    Sorry, I think I gave the wrong contact above: rather than emailing ukb-genetics@jiscmail.ac.uk, please join the mailing list at https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=UKB-GENETICS

  • Permanently deleted user

    Hi Rachael W,

     

    Thank you so much for your answer. I tried bcftools with the following command in SAK on RAP. It ran, but when I try to view the common.vcf file, I get an "unknown file type" error.

     

    bcftools isec -n +2 -o common.vcf 1.vcf.gz 2.vcf.gz 3.vcf.gz 4.vcf.gz 5.vcf.gz
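
One possible explanation for the "unknown file type" error, offered as an assumption rather than a confirmed diagnosis: when bcftools isec is run with -n but without -p (and without -w selecting a single file), it appears to write a tab-separated list of sites rather than a VCF. Each row would hold CHROM, POS, REF, ALT and a string of 0/1 flags, one flag per input file. If that is the case here, common.vcf is a plain-text table despite its name, and it can be read with ordinary text tools. The sketch below summarises such a list; the function name `summarise_sites` and the two example rows are made up for illustration.

```shell
#!/bin/sh
# Summarise a site list assumed to have the columns
# CHROM, POS, REF, ALT, FLAGS (one 0/1 flag per input file).
# summarise_sites is a hypothetical helper; it reads a file or stdin.
summarise_sites() {
    awk -F '\t' '{
        flags = $5
        n = gsub(/1/, "1", flags)   # gsub returns how many 1s were matched
        print $1 ":" $2, $3 ">" $4, "in " n " file(s)", "flags=" $5
    }' "$@"
}

# Made-up rows standing in for real isec output from five input files.
printf 'chr1\t1000\tA\tG\t11000\nchr1\t2500\tC\tT\t10101\n' | summarise_sites
```

On a real run this would be called as `summarise_sites common.vcf`. If VCF records are needed rather than a site list, adding `-p` with an output directory to the same isec command should produce one VCF per input file, restricted to the sites that pass the `-n +2` filter.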
