VEP HAIL Failing Every Time Now

I have a VEP/HAIL notebook for some pLOF analysis. It worked well enough previously but now I get:

Hail version: 0.2.132-678e1f52b999
Error summary: HailException: VEP command 'docker run -i -v /cluster/vep:/root/.vep dnanexus/dxjupyterlab-vep ./vep --format vcf --json --everything --allele_number --no_stats --cache --offline --minimal --assembly GRCh38 -o STDOUT --check_existing --dir_cache /root/.vep/ --fasta /root/.vep/homo_sapiens/109_GRCh38/Homo_sapiens.GRCh38.dna.toplevel.fa.gz --plugin LoF,loftee_path:/root/.vep/Plugins/loftee,human_ancestor_fa:/root/.vep/human_ancestor.fa,conservation_file:/root/.vep/loftee.sql,gerp_bigwig:/root/.vep/gerp_conservation_scores.homo_sapiens.GRCh38.bw' failed with non-zero exit status 125
VEP Error output:
Unable to find image 'dnanexus/dxjupyterlab-vep:latest' locally
docker: Error response from daemon: pull access denied for dnanexus/dxjupyterlab-vep, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.

 

This happens every single time I start a compute instance now and it's driving me insane.

Initializing HAIL with:

import hail as hl
from pyspark.sql import SparkSession

hl.init()  # Hail configures Spark for you

spark = SparkSession.getActiveSession()
if spark is None:
    raise RuntimeError("No active SparkSession. Make sure hl.init() ran successfully.")
 

 

Comments

2 comments

  • Comment author
    Dr. Mc. Ninja

    My assistant came up with the below. Sorry, the formatting is a bit off, and I haven't tested it.
     

    What’s actually going wrong?

    This bit tells the whole story:

    Unable to find image dnanexus/dxjupyterlab-vep:latest locally
    docker: Error response from daemon: pull access denied for dnanexus/dxjupyterlab-vep

    On UKB RAP, you are not supposed to pull dnanexus/dxjupyterlab-vep from Docker Hub. Instead:

    • The DXJupyterLab Spark Cluster app downloads a private VEP image tarball from a DNAnexus project,
    • Loads it with docker load,
    • And then tags it as dnanexus/dxjupyterlab-vep:latest. [community.ukbiobank.ac.uk]

    Hail’s VEP config (vep-GRCh38.json) then calls:

     

    "command": [  "docker", "run", "-i",  "-v", "/cluster/vep:/root/.vep",  "dnanexus/dxjupyterlab-vep",  "./vep", ... ] 

    which assumes that image has already been loaded locally by the cluster bootstrap. [documentation.dnanexus.com]

    Your error means: the cluster never installed VEP, so docker run tries to pull from a remote registry and gets denied.
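
One way to recognise this failure mode programmatically is to look for the Docker daemon message inside the Hail error text. The following is only a sketch: the helper name `missing_image` and the regular expression are illustrative assumptions based on the message quoted in the question, not part of Hail or Docker.

```python
import re

# Pattern based on the Docker daemon message quoted in the error output.
PULL_DENIED = re.compile(r"pull access denied for ([^\s,]+)")


def missing_image(error_text):
    """Return the image Docker failed to pull, or None if no such message is present."""
    match = PULL_DENIED.search(error_text)
    return match.group(1) if match else None


# Excerpt of the error text from the original post.
error_text = (
    "Unable to find image 'dnanexus/dxjupyterlab-vep:latest' locally "
    "docker: Error response from daemon: pull access denied for "
    "dnanexus/dxjupyterlab-vep, repository does not exist or may require "
    "'docker login'"
)
print(missing_image(error_text))
```

If the function returns an image name, the problem is a missing local image rather than anything in the VEP arguments or the input data.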

    Why would that suddenly start happening?

    From the current Spark cluster bootstrap script (the thing that configures the cluster), VEP is only installed when the app is launched with:

    "feature": "HAIL-VEP" 

    not just "HAIL". The logic is literally:

    elif [ "${feature}" == 'HAIL' ]; then
        /cluster-adv/third-party/hail/install.sh 0.2.132
    elif [ "${feature}" == 'HAIL-VEP' ]; then
        /cluster-adv/third-party/hail/install.sh 0.2.132
        install_vep
    fi

    So:

    • feature = HAIL → Hail only, no VEP image / cache installed
    • feature = HAIL-VEP → Hail plus VEP image + cache under /cluster/vep and the dnanexus/dxjupyterlab-vep:latest tag is created

    If your new compute instances are:

    • Plain JupyterLab (no Spark Cluster), or
    • Spark Cluster with feature=HAIL (default in some flows),

    then VEP is never installed, and you get exactly the error you pasted.
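
The same decision can be written as a small lookup, which may be handy as a guard at the top of a notebook. This is a sketch that mirrors the two branches of the bootstrap excerpt quoted above; the component labels and the function name `vep_available` are made up for illustration, and `None` stands in for a plain JupyterLab session with no feature set.

```python
# Mirrors the two branches of the bootstrap excerpt quoted above.
# The component labels are descriptive names chosen for this sketch.
INSTALLED_BY_FEATURE = {
    "HAIL": {"hail"},
    "HAIL-VEP": {"hail", "vep_image", "vep_cache"},
}


def vep_available(feature):
    """True if a cluster launched with this feature should have VEP installed."""
    return "vep_image" in INSTALLED_BY_FEATURE.get(feature, set())


for feature in ("HAIL", "HAIL-VEP", None):
    print(feature, vep_available(feature))
```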

    Docs still show the vep-GRCh38.json example with dnanexus/dxjupyterlab-vep, but they assume you launched the Spark cluster with the HAIL-VEP feature so the image already exists. [documentation.dnanexus.com]

    How to fix it (step-by-step)

    1. Make sure you’re using the Spark Cluster app with HAIL-VEP

    From the UKB RAP UI:

    1. Go to Apps / Tools and start “DXJupyterLab Spark Cluster” (not the simple “DXJupyterLab” app).
    2. In the app inputs:
      • Set Feature to HAIL-VEP (this is the critical bit).
      • Choose your instance & cluster size as usual.
    3. Launch a fresh job (don’t resume an old snapshot; those can be version-mismatched).

    If you’re doing this via CLI, it should look roughly like:

     

    dx run dxjupyterlab_spark_cluster \
      -i feature=HAIL-VEP \
      -i cluster_size=1 \
      -i duration=240 \
      --instance-type mem3_ssd1_v2_x16 \
      --name hail_vep_session \
      --priority high \
      -y

    2. Initialise Spark + Hail the “Spark cluster” way

    In the notebook on that Spark cluster:

     

    import pyspark
    import hail as hl

    sc = pyspark.SparkContext.getOrCreate()
    spark = pyspark.sql.SparkSession(sc)
    hl.init(sc=sc)  # Let Hail bind to the existing Spark cluster

    The docs explicitly recommend this pattern on DXJupyterLab Spark Cluster. [documentation.dnanexus.com]

    Your existing hl.init() without sc will often work, but using the provided Spark cluster context is safer and closer to what DNAnexus expects.

    3. Sanity-check that VEP is actually installed

    In a new notebook cell (same JupyterLab session), run:

     

    !ls /cluster/vep | head
    !docker images | grep dxjupyterlab-vep || echo "No VEP image found"

    On a correct HAIL-VEP cluster, you should see:

    • Files in /cluster/vep (cache, plugins, fasta, etc.), and
    • A line with dnanexus/dxjupyterlab-vep in the docker images output.

    If you instead see “No VEP image found” or /cluster/vep is missing/empty, then the bootstrap install_vep step didn’t run → most likely the feature wasn’t set to HAIL-VEP or there was a platform-side bootstrap failure.
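
The same two checks can be done from Python so that a notebook stops with a readable message before `hl.vep` is ever called. This is an untested-on-RAP sketch: the function names are hypothetical, the default path and image name are taken from the error message in the question, and it relies on `docker image inspect` returning a non-zero status when the image is not present locally.

```python
import subprocess
from pathlib import Path


def vep_cache_present(cache_dir="/cluster/vep"):
    """True if the cache directory exists and contains at least one entry."""
    path = Path(cache_dir)
    return path.is_dir() and any(path.iterdir())


def vep_image_present(image="dnanexus/dxjupyterlab-vep", run=subprocess.run):
    """True if `docker image inspect` finds the image locally.

    `run` is injectable so the check can be exercised without Docker.
    """
    try:
        result = run(
            ["docker", "image", "inspect", image],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        return False  # the docker CLI itself is not installed
    return result.returncode == 0


def preflight(cache_dir="/cluster/vep", run=subprocess.run):
    """Raise a readable error if either the cache or the image is missing."""
    problems = []
    if not vep_cache_present(cache_dir):
        problems.append(f"{cache_dir} is missing or empty")
    if not vep_image_present(run=run):
        problems.append("Docker image dnanexus/dxjupyterlab-vep is not loaded")
    if problems:
        raise RuntimeError("VEP is not installed: " + "; ".join(problems))
```

Calling `preflight()` in the first cell of the notebook would surface a wrongly provisioned cluster immediately instead of partway through an annotation run.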

    4. Then run your usual Hail+VEP code

    Assuming you have the standard config in your project:

     

    mt = ...  # your MatrixTable
    mt_vep = hl.vep(mt, "file:///mnt/project/vep-GRCh38.json")

    and that vep-GRCh38.json looks like the one from the docs (with dnanexus/dxjupyterlab-vep and /cluster/vep paths). [documentation.dnanexus.com]
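
To confirm which image a given config will ask Docker for, the `command` list can be inspected directly. The sketch below assumes the config is a JSON object with a `command` array shaped like the excerpt from the docs; `docker_image_from_command` and the set of value-taking flags are illustrative choices, not part of Hail or DNAnexus tooling.

```python
import json

# Docker flags that consume the following token as their value.
# Assumption: only flags likely to appear in this kind of config are listed.
FLAGS_WITH_VALUE = {"-v", "--volume", "-e", "--env", "--name", "-w"}


def docker_image_from_command(command):
    """Return the image name from a `docker run ...` argument list, or None."""
    if len(command) < 2 or command[0] != "docker" or command[1] != "run":
        return None
    i = 2
    while i < len(command):
        token = command[i]
        if token in FLAGS_WITH_VALUE:
            i += 2  # skip the flag and its value
        elif token.startswith("-"):
            i += 1  # boolean flag such as -i
        else:
            return token  # first positional argument is the image
    return None


# Trimmed copy of the command from the config excerpt (VEP arguments omitted).
config_text = json.dumps({
    "command": ["docker", "run", "-i", "-v", "/cluster/vep:/root/.vep",
                "dnanexus/dxjupyterlab-vep", "./vep"]
})
image = docker_image_from_command(json.loads(config_text)["command"])
print(image)
```

If the printed image differs from the one shown by `docker images`, the config and the cluster are out of step and the same pull error is to be expected.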

    If it still fails

    If you’ve:

    • Launched DXJupyterLab Spark Cluster with feature=HAIL-VEP,
    • Verified that /cluster/vep exists and docker images | grep dxjupyterlab-vep shows an image,
    • But Hail still throws the same error,

    then that’s likely a platform regression rather than anything in your notebook.

    In that case I’d:

    1. Grab the Spark cluster job ID (the app job, not the Hail job).
    2. Open a support ticket via UKB RAP with:
      • The Hail error,
      • The job ID,
      • The output of docker images and ls /cluster/vep.
    3. Optionally link them to your community post so they can tie it together.


     

  • Comment author
    Dr. Mc. Ninja

    Short version (please let me know if it works):


     

    Short Answer / Fix

    This error happens because the compute environment you're using does not have the VEP Docker image installed, so Hail tries to run:

    docker run dnanexus/dxjupyterlab-vep
    

    and Docker then tries to pull it from Docker Hub, where it is not publicly available. Docker fails before any container starts, hence exit status 125.
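
For reference, `docker run` reserves a few exit statuses for its own failures, which is why 125 points at Docker rather than at VEP. A small sketch of that convention follows; the function name and wording of the messages are my own.

```python
def explain_docker_run_status(status):
    """Map a `docker run` exit status to who produced it."""
    if status == 125:
        return "docker itself failed (for example, the image could not be pulled)"
    if status == 126:
        return "the command inside the container could not be invoked"
    if status == 127:
        return "the command inside the container was not found"
    return "exit status of the command that ran inside the container"


print(explain_docker_run_status(125))
```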

    On DNAnexus/UKB RAP, the VEP image is not pulled from the internet — it is pre-loaded by the DXJupyterLab Spark Cluster app only when launched with the feature HAIL-VEP.

    If you start a normal JupyterLab session or a Spark cluster with feature=HAIL, you get Hail but no VEP.

    Fix

    Launch your notebook on:

    Apps → DXJupyterLab Spark Cluster

    and set:

    feature = HAIL-VEP
    

    This ensures the cluster bootstrap runs install_vep, loads the private VEP Docker image, and populates /cluster/vep. Then Hail’s VEP wrapper can find:

    • the image: dnanexus/dxjupyterlab-vep:latest, and
    • the cache: /cluster/vep

    After launching the correct cluster, this should show the image is present:

    docker images | grep dxjupyterlab-vep
    

    Why it suddenly broke

    If your environment changed (e.g., new compute instance, default “HAIL” mode, or plain JupyterLab), VEP simply isn’t installed anymore — and Hail falls back to trying to pull a non-public Docker image.

    TL;DR

    Your code is fine — the environment wasn’t provisioning VEP.
    Launch DXJupyterLab Spark Cluster with HAIL-VEP, and VEP will work again.

