How can I extract AD, AF, DP information from ##FORMAT lines of snpEff.vcf file?

Permanently deleted user
Using the code from the answer to https://community.dnanexus.com/s/question/0D582000001cFXNCA2/how-can-i-convert-vcf-file-to-tabular-data I exported a table file. However, the exported table has no columns for the FORMAT fields of the VCF file. When I preview the VCF file in its folder on UKB RAP, I can see the FORMAT fields; I have attached a screenshot of the preview. I tried to modify the code so that the exported table also contains the FORMAT fields, but I couldn't get it to work. Could you help me with this? [Image: Ekran Resmi 2023-08-22 10.44.16]

Comments

3 comments

  • Comment author
    Ondrej Klempir DNAnexus Team

    Hello @Burcu Çevik, if I understand correctly, this is more a question about how to format/rename columns in your dataframe/table for a better ("human-readable", if you will) representation than about working with the variants themselves. This sounds like a doable data operation. I would first export the rows that begin with the ##INFO string (you can do this in bash using grep). It would then take a bit of programming to map that metadata onto your existing pandas dataframe (based on the order of the columns or the column names, depending on how you store it) and rename the columns.

  • Comment author
    Permanently deleted user

    You can also use bcftools annotate to extract information from the INFO column of a VCF file.

  • Comment author
    Permanently deleted user

    Many thanks for your suggestion. Is there a command line that you would recommend I use with Swiss Army Knife?

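
For readers who land on this thread, here is a minimal Python sketch of one way to pull AD, AF and DP out of the per-sample FORMAT columns. It is not the code from the linked answer. The tiny VCF text, the sample name SAMPLE1, and the function names read_format_definitions and extract_format_fields are all made up for illustration; a real snpEff VCF will have many more header lines and possibly several samples. The sketch assumes an uncompressed, tab-separated VCF whose ninth column is FORMAT.

```python
import re

# Hypothetical miniature VCF, invented for this example only.
EXAMPLE_VCF = """\
##fileformat=VCFv4.2
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths">
##FORMAT=<ID=AF,Number=A,Type=Float,Description="Allele fraction">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read depth">
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1
chr1\t100\t.\tA\tG\t50\tPASS\tANN=G|missense_variant\tGT:AD:AF:DP\t0/1:12,8:0.4:20
chr1\t200\t.\tC\tT\t60\tPASS\tANN=T|synonymous_variant\tGT:DP\t1/1:15
"""

# Matches header lines such as ##FORMAT=<ID=AD,...,Description="...">
FORMAT_HEADER = re.compile(r'^##FORMAT=<ID=([^,]+),.*Description="([^"]*)"')


def read_format_definitions(lines):
    """Map each FORMAT ID (e.g. 'AD') to its Description from the ## header."""
    definitions = {}
    for line in lines:
        match = FORMAT_HEADER.match(line)
        if match:
            definitions[match.group(1)] = match.group(2)
    return definitions


def extract_format_fields(lines, wanted=("AD", "AF", "DP")):
    """Yield one dict per (variant, sample) holding the requested FORMAT values."""
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            samples = cols[9:]  # sample names start at the 10th column
            continue
        keys = cols[8].split(":")  # e.g. ['GT', 'AD', 'AF', 'DP']
        for name, value in zip(samples, cols[9:]):
            per_sample = dict(zip(keys, value.split(":")))
            row = {"CHROM": cols[0], "POS": int(cols[1]),
                   "REF": cols[3], "ALT": cols[4], "SAMPLE": name}
            for key in wanted:
                # '.' is the VCF convention for a missing value
                row[key] = per_sample.get(key, ".")
            yield row


lines = EXAMPLE_VCF.splitlines()
format_defs = read_format_definitions(lines)
rows = list(extract_format_fields(lines))
for row in rows:
    print(row)
```

The resulting list of dicts can be passed to pandas.DataFrame(rows) to get a table with one column per FORMAT field, and format_defs can be used to give those columns more readable names. AD is left as the raw comma-separated string (one depth per allele) rather than split, since how to split it depends on the analysis.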
