Is there any way (e.g., using samtools or htsget) to access a slice of a CRAM file (corresponding to a single genomic region) on RAP without downloading the whole CRAM file to a compute instance?

Permanently deleted user

Comments

5 comments

  • Hello,

    We have samtools available in our Swiss Army Knife app. Please refer to the link below for details on how to analyze files using this app.

    https://dnanexus.gitbook.io/uk-biobank-rap/working-on-the-research-analysis-platform/working-with-bulk-data-files#analyzing-files-with-swiss-army-knife

     

    To analyze files without first writing them to disk, follow the steps below:

    • To launch Swiss Army Knife, navigate to your project and click Start Analysis.
    • Select Swiss Army Knife and click Run Selected.
    • In the Command line textbox, enter the samtools command to run, and refer to each input file by its path under the /mnt/project prefix (example: samtools view /mnt/project/<path-to-input-file>)
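
    As a rough sketch of what could go in the Command line textbox, the snippet below only prints a samtools command for inspection rather than running it. The CRAM path and the helper name header_cmd are hypothetical placeholders, not names taken from RAP. The printed command uses -H, which asks samtools view for the header only, so it can serve as a quick check that the file is reachable under /mnt/project.

```shell
# Hypothetical placeholder; substitute the real path under /mnt/project.
CRAM="/mnt/project/path/to/sample.cram"

# Print a samtools command that reads only the header (-H) of the given CRAM.
# Nothing is executed here; the output is meant to be pasted into the textbox.
header_cmd() {
  printf 'samtools view -H "%s"\n' "$1"
}

header_cmd "$CRAM"
```

    Printing the command first makes it easier to check the quoting of long project paths before launching a job.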

     

    More information on Swiss Army Knife can be found here:

    https://ukbiobank.dnanexus.com/app/swiss-army-knife

     

    I hope that helps!

    Thank you.

  • Comment author
    Permanently deleted user

    I understand that samtools is available; however, my question is specifically whether samtools (or any other software) can be used to extract only a section of a UK Biobank CRAM file without downloading the entire CRAM file to a compute instance.

  • Thank you for your question. According to the following documentation, samtools view accepts a region of interest:

    http://www.htslib.org/doc/samtools-view.html

     

    Please enter the samtools view command in the Command line input of the Swiss Army Knife app, and refer to the input files by their path under the /mnt/project prefix so that they are processed without being downloaded.
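
    A minimal sketch of such a region query follows. The reference path, CRAM path, coordinates, output name, and the helper region_cmd are all hypothetical placeholders for illustration. The snippet only prints the command so it can be reviewed before use. It assumes the CRAM index (.crai) sits next to the CRAM, which samtools needs for region lookups, and that the reference FASTA matches the one the CRAM was encoded against.

```shell
# Hypothetical placeholders; replace with real paths in your project.
REF="/mnt/project/path/to/reference.fa"
CRAM="/mnt/project/path/to/sample.cram"
REGION="chr22:20000000-21000000"   # illustrative coordinates only

# Print a samtools view command for one region.
#   -b  write BAM output
#   -T  reference FASTA used to decode the CRAM
#   -o  output file
# Arguments: $1 = reference, $2 = CRAM, $3 = region, $4 = output BAM
region_cmd() {
  printf 'samtools view -b -T "%s" -o %s "%s" %s\n' "$1" "$4" "$2" "$3"
}

region_cmd "$REF" "$CRAM" "$REGION" slice.bam
```

    The region uses the chr:start-end form described on the samtools view page linked above; a bare contig name such as chr22 selects the whole chromosome.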

  • Comment author
    Ondrej Klempir DNAnexus Team

    Hi!

     

    A command in the following format worked for me. It extracted just one chromosome from the CRAM without downloading the entire CRAM or the reference genome:

     

    samtools view -b -T /mnt/project/ref_genome.fa "/mnt/project/Bulk/.../.../XYZ_0_0.cram" chr22 > out22.bam

     

     

  • Comment author
    Permanently deleted user

    Thanks; all set.
