Best alternative to lofreq in Swiss Army Knife?

Permanently deleted user
Hi everyone,

For our research project, we need to call low-frequency variants per individual (i.e., as one would for a non-diploid genome). We normally use lofreq for this, but it does not seem to be available through the Swiss Army Knife.

Is there any way to install lofreq itself? Otherwise, would any of the tools already available provide similar functionality?

Cheers, Fran

Comments

7 comments

  • Comment author
    Ondrej Klempir DNAnexus Team

    Some possible solutions:

    a) Run a web-based terminal, either ttyd (https://ukbiobank.dnanexus.com/app/ttyd) or the Cloud Workstation app (https://ukbiobank.dnanexus.com/app/cloud_workstation). You can install lofreq on the cloud worker, download your data, and process it in this interactive session.

    b) Build your own applet that wraps lofreq. Instructions for creating an applet are covered here: https://www.youtube.com/watch?v=A_iki_50Ig0

    c) If you have some working experience with Docker, I would save a lofreq Docker snapshot on the platform and use it as the Docker image for Swiss Army Knife (SAK) via the -iimage option. DNAnexus has a Docker webinar coming soon that will show an example of creating a Docker snapshot for a custom tool (including use with Swiss Army Knife).

    d) -iimage in SAK can also pull a publicly available Docker image. I found a Docker container for lofreq: https://quay.io/repository/biocontainers/lofreq


    In the SAK help (see the "Optional Docker image identifier" input below):

     

    $ dx run app-swiss-army-knife -h
    usage: dx run app-swiss-army-knife [-iINPUT_NAME=VALUE ...]

    App: Swiss Army Knife

    A multi-purpose tool for all your basic analysis needs

    See the app page for more information:
      https://platform.dnanexus.com/app/swiss-army-knife

    Inputs:
      Input files: [-iin=(file) [-iin=... [...]]]

      Command line: -icmd=(string)

      Optional Docker image identifier: [-iimage=(string)]
            Instead of using the default Ubuntu 14.04 environment, the input
            command will be run using the specified Docker image as it would be
            when running 'docker run image cmd'. Example images identifiers are
            'ubuntu:16.04', 'quay.io/ucsc_cgl/samtools'.

    Outputs:
      Output files: [out (array:file)]

    ------------------------------------

     

    So I would try specifying the image directly as "quay.io/biocontainers/lofreq" (the pull name for https://quay.io/repository/biocontainers/lofreq) in the SAK image input, and then just give the lofreq command you would like to run inside SAK.

  • Comment author
    Permanently deleted user

    Many thanks, @Ondrej Klempir. I will explore these options and see whether we can get any of them working as expected. I might come back here to ask for further clarification :)

  • Comment author
    Permanently deleted user

    Just a quick update. To get the publicly available Docker image to work, I needed to launch it as:

    dx run app-swiss-army-knife -iimage=quay.io/biocontainers/lofreq:2.1.5--py310h8360dc1_7

    This is because the repository has no version tagged "latest", which SAK appears to expect by default.
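
A fuller invocation along these lines might look like the sketch below. The image tag is the one quoted above; the file names (sample.bam, reference.fa, sample.lofreq.vcf) are hypothetical placeholders, and `lofreq call -f ... -o ...` is used as the variant-calling step on the assumption that the inputs are staged in the job's working directory. The `run` helper is also made up for illustration: it only prints the command unless DRY_RUN=0 is set, so nothing is submitted by accident.

```shell
#!/bin/sh
# Sketch: run lofreq in Swiss Army Knife from a pinned Docker tag.
# File names are hypothetical placeholders, not paths from this thread.
set -eu

image="quay.io/biocontainers/lofreq:2.1.5--py310h8360dc1_7"  # explicit tag, as above
bam="sample.bam"             # hypothetical alignment file in the project
ref="reference.fa"           # hypothetical reference FASTA
out_vcf="sample.lofreq.vcf"  # hypothetical output name

# The command string handed to -icmd; it refers to the inputs by file name,
# assuming they are staged in the job's working directory.
lofreq_cmd="lofreq call -f ${ref} -o ${out_vcf} ${bam}"

# Print the command when DRY_RUN=1 (the default here); run it otherwise.
# The printed form is for reading only: it does not preserve quoting.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run dx run app-swiss-army-knife \
  -iimage="${image}" \
  -iin="${bam}" -iin="${ref}" \
  -icmd="${lofreq_cmd}" \
  --brief --yes
```

Pinning the full tag also keeps reruns on the same lofreq build, which is useful when results need to be reproducible.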

  • Comment author
    Former User of DNAx Community_28


     

    If you are running swiss-army-knife via a dx run shell script, as in the DNAnexus GWAS GitHub repositories, you can simply add the install lines to your execution script (see the scripts here: https://github.com/dnanexus/UKB_RAP/tree/main/GWAS/regenie_workflow).

     

    To install lofreq or really any other precompiled binary software package, use these lines:

      run_lofi_cmd="wget https://github.com/CSB5/lofreq/raw/master/dist/lofreq_star-2.1.5_linux-x86-64.tgz; \
        tar zxvf lofreq_star-2.1.5_linux-x86-64.tgz; \
        ./lofreq_star-2.1.5_linux-x86-64/bin/lofreq"

      dx run swiss-army-knife -iin="${data_file_dir}/WES_cX_qc_pass.vcf.gz" \
        -icmd="${run_lofi_cmd}" --tag="LOFREQ" --instance-type "mem1_ssd1_v2_x16" \
        --destination="${project}:/data/wes_lofreq/" --brief --yes

     

    Mind you, you cannot just cut and paste all of this into a bash script and execute it: the -iin file and the destination directory for the dx command need to exist. ${data_file_dir} is a directory where I store some processed VCF files. If you fix the input and output paths and then run this as is, the job will fail because the command does not actually create any output files. If you look at the log file, you will see that it has downloaded the software, uncompressed it, and run the executable, which prints its help screen to standard output.
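
To have the job leave an output file behind, the command string could end in a real lofreq call instead of the bare executable. The sketch below only builds and prints such a string. The tarball URL and binary path are the ones from the lines above; the input and output names (sample.bam, reference.fa, sample.lofreq.vcf) are hypothetical, and the final clean-up step rests on the assumption that Swiss Army Knife uploads whatever new files remain in the working directory.

```shell
#!/bin/sh
# Sketch: build an -icmd string that installs lofreq and writes a VCF.
# Input and output names are hypothetical placeholders.
set -eu

tarball="lofreq_star-2.1.5_linux-x86-64.tgz"
url="https://github.com/CSB5/lofreq/raw/master/dist/${tarball}"
lofreq_bin="./lofreq_star-2.1.5_linux-x86-64/bin/lofreq"
bam="sample.bam"
ref="reference.fa"
out_vcf="sample.lofreq.vcf"

# Steps are joined with && so a failed download or unpack stops the job
# before lofreq is called. The last step removes the installer files so
# that (by assumption) only the VCF is left to be uploaded.
run_lofreq_cmd="wget -q ${url} && \
tar zxf ${tarball} && \
${lofreq_bin} call -f ${ref} -o ${out_vcf} ${bam} && \
rm -rf ${tarball} lofreq_star-2.1.5_linux-x86-64"

# Show the single-line command that would be passed as -icmd.
echo "${run_lofreq_cmd}"
```

The resulting string would be passed as -icmd="${run_lofreq_cmd}" in a dx run call like the one above, with the BAM and reference supplied through -iin.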

     

    Best of luck

     

    -Phil

     

  • Comment author
    Permanently deleted user

    Thanks for your suggestion, @Phil Greer.

     

    Would the installation need to be repeated on every single instance in a batch computation? If so, is there any particular benefit compared with using a Docker image as suggested above?

     

    Cheers,

    Fran

  • Comment author
    Ondrej Klempir DNAnexus Team

    Awesome!

     

    For any batch processing, please bear in mind that Docker Hub and other registries have a pull limit of 200 pulls/user/day. This is why I would recommend saving the Docker image to a file in your DNAnexus project.
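
One way this could look is sketched below, under several assumptions: Docker and the dx toolkit are available on the machine running the script, the archive name and the /docker_images/ folder are hypothetical (the folder must already exist in the project), and the Swiss Army Knife version in use accepts a saved image through an `image_file` input. That input does not appear in the help text quoted earlier in this thread, so check `dx run app-swiss-army-knife -h` first. The `run` helper is made up for illustration and only prints each step unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: save the lofreq image once, upload it, and reuse it in jobs.
# Archive name and project folder are hypothetical placeholders.
set -eu

image="quay.io/biocontainers/lofreq:2.1.5--py310h8360dc1_7"
archive="lofreq_2.1.5.tar"   # hypothetical local file name
dest="/docker_images/"       # hypothetical project folder (must exist)

# Print the command when DRY_RUN=1 (the default here); run it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run docker pull "${image}"                   # one pull from the registry
run docker save -o "${archive}" "${image}"   # write the image to a tar file
run gzip "${archive}"                        # produces ${archive}.gz
run dx upload "${archive}.gz" --path "${dest}" --brief

# Later jobs load the saved file instead of pulling from the registry.
# -iimage_file is an assumption about the installed app version.
run dx run app-swiss-army-knife \
  -iimage_file="${dest}${archive}.gz" \
  -icmd="lofreq version" \
  --brief --yes
```

With the image stored in the project, a large batch touches the public registry only once, during the initial pull.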

     

  • Comment author
    Former User of DNAx Community_28

    The dx submission scripts can be looped over the files you want to process (i.e., each chromosome file, or each individual).

     

    If you look at the swiss-army-knife log file, you will see that each software package is downloaded and installed every time you start Swiss Army Knife. The good thing about this is that if the software gets updated, you just have to point the script to the new tarball.

     

    It just depends on how much up-front work you want to do; it won't make much difference to the overall runtime.
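
A loop of that kind might be sketched as follows, submitting one job per chromosome. The per-chromosome file names (chr1.bam to chr22.bam), the /data/bam folder, and the reference name are hypothetical; the image tag and the /data/wes_lofreq/ destination are taken from earlier in this thread. The `run` helper is made up for illustration and prints each command instead of submitting it unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: submit one Swiss Army Knife job per chromosome file.
# Folder and file names are hypothetical placeholders.
set -eu

image="quay.io/biocontainers/lofreq:2.1.5--py310h8360dc1_7"
data_dir="/data/bam"        # hypothetical folder holding chrN.bam files
ref="reference.fa"          # hypothetical reference FASTA in the same folder
dest="/data/wes_lofreq/"    # output folder in the project

# Print the command when DRY_RUN=1 (the default here); run it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

n_jobs=0
chr=1
while [ "${chr}" -le 22 ]; do
  bam="chr${chr}.bam"
  run dx run app-swiss-army-knife \
    -iimage="${image}" \
    -iin="${data_dir}/${bam}" -iin="${data_dir}/${ref}" \
    -icmd="lofreq call -f ${ref} -o chr${chr}.lofreq.vcf ${bam}" \
    --tag "LOFREQ_chr${chr}" \
    --destination "${dest}" \
    --brief --yes
  n_jobs=$((n_jobs + 1))
  chr=$((chr + 1))
done

echo "prepared ${n_jobs} jobs"
```

Each iteration is an independent job, so the same pattern applies when looping over individuals instead of chromosomes.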

     
