Thanks to the DNAnexus team for introducing this forum to us. We would like to share the pipeline we have created to run gene-based tests on the UK Biobank WES data using SAIGE-GENE+ on DNAnexus.
https://saigegit.github.io/SAIGE-doc/docs/UK_Biobank_WES_analysis.html
A benchmark of the job costs is also included at the end of the pipeline documentation.
We would appreciate any feedback.
Comments
16 comments
Do I use the ukb22418 files to create the pruned plink file? If not, which do I use?
I uploaded
saige_1.0.9.tar.gz
dxCompiler-2.10.1.jar
I created folder workflow
I ran your command in Swiss Army Knife
java -jar dxCompiler-2.10.5.jar compile saige_null_sGRM_vr_withinfo.wdl -project $57245 -folder /workflows/ -f
I got this error
I modified command line
java -jar dxCompiler-2.10.5.jar compile saige_null_sGRM_vr_withinfo.wdl -project $57245 -folder /workflows/ -f
and got this
PS D:\> java -jar dxCompiler-2.10.5.jar compile saige_null_sGRM_vr_withinfo.wdl -project $57245 -folder /workflows/ -f
[error] Error parsing command line options
dx.core.CliUtils$OptionParseException: Expected option project to have 1 value, found Vector()
at dx.core.CliUtils$SingleValueOptionSpec.parseValues(CliUtils.scala:128)
at dx.core.CliUtils$OptionSpec.parse(CliUtils.scala:96)
at dx.core.CliUtils$OptionSpec.parse$(CliUtils.scala:92)
at dx.core.CliUtils$SingleValueOptionSpec.parse(CliUtils.scala:121)
at dx.core.CliUtils$.createOpt$1(CliUtils.scala:259)
at dx.core.CliUtils$.$anonfun$parseCommandLine$3(CliUtils.scala:265)
at dx.core.CliUtils$$$Lambda$13.0000000011F76650.apply(Unknown Source)
at scala.collection.IterableOnceOps.foldLeft(IterableOnce.scala:646)
at scala.collection.IterableOnceOps.foldLeft$(IterableOnce.scala:642)
at scala.collection.AbstractIterable.foldLeft(Iterable.scala:926)
at dx.core.CliUtils$.parseCommandLine(CliUtils.scala:263)
at dxCompiler.Main$.compile(Main.scala:339)
at dxCompiler.Main$.dispatchCommand(Main.scala:791)
at dxCompiler.Main$.main(Main.scala:921)
at dxCompiler.MainApp$.delayedEndpoint$dxCompiler$MainApp$1(Main.scala:926)
at dxCompiler.MainApp$delayedInit$body.apply(Main.scala:925)
at scala.Function0.apply$mcV$sp(Function0.scala:39)
at scala.Function0.apply$mcV$sp$(Function0.scala:39)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
at scala.App.$anonfun$main$1(App.scala:76)
at scala.App.$anonfun$main$1$adapted(App.scala:76)
at scala.App$$Lambda$1.0000000011D6A960.apply(Unknown Source)
at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
at scala.App.main(App.scala:76)
at scala.App.main$(App.scala:74)
at dxCompiler.MainApp$.main(Main.scala:925)
at dxCompiler.MainApp.main(Main.scala)
I ran the command line with the DNAnexus CLI on a mainframe running Linux and it worked fine! It just will not work with the DNAnexus CLI on a Windows 10 PC.
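A possible explanation, offered as a guess rather than a confirmed diagnosis: the message "Expected option project to have 1 value, found Vector()" says that dxCompiler received no value after -project. An unquoted argument that starts with a dollar sign is not passed through literally by most shells. The bash sketch below shows what happens to $57245 there; PowerShell is assumed to do something comparable by treating it as an empty variable, which would match the error. The value 57245 is simply copied from the command in the post.

```shell
#!/usr/bin/env bash
# Demonstrates how bash expands an argument that starts with "$".

set --   # clear positional parameters so that $5 is empty in this demo

unquoted="$57245"   # bash reads this as ${5} followed by the text 7245
quoted='$57245'     # single quotes keep the dollar sign literal

echo "bare or double-quoted: [${unquoted}]"
echo "single-quoted:         [${quoted}]"
```

If the leading dollar sign is not part of the real project name, passing the plain project name or the project-xxxx ID after -project should avoid the problem in either shell.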
I am getting this error. What do I do?
72 threads are set to be used
sparse GRM will be created
Markers in the Plink file with MAF < 0.01 will be removed before constructing GRM
Markers in the Plink file with missing rate > 0.15 will be removed before constructing GRM
write sample IDs for the sparse GRM to sparsegrm_relatednessCutoff_0.05_5000_randomMarkersUsed.sparseGRM.mtx.sampleIDs.txt
isDiagofKinSetAsOne FALSE
nbyte: 122095
nbyte: 122095
reserve: 26979287040
M: 220966, N: 488377
setgeno mark1
setgeno mark2
116440 markers with MAF >= 0.01 and missing rate <= 0.15
time: 860248
116440 markers have MAF >= 0.01
5000 genetic markers are randomly selected to decide which samples are related
Start detecting related samples for the sparse GRM
tb-ta
user system elapsed
0 0 0
Start creating sparse GRM
freq: 0.042546 invStd: 3.50346 SNPIdx: 24387
stdGenoMultiMarkersMat.n_rows: 5000
stdGenoMultiMarkersMat.n_cols: 488377
i,j 0,0
i,j 0,1
Ntotal: 488377
2147483647
9223372036854775807
9223372036854775807
totalCombination: 119255802875
a 23247
b 2
Error in findIndiceRelatedSample() : std::bad_alloc
Calls: createSparseGRM -> createSparseKinParallel -> findIndiceRelatedSample
Execution halted
Hi Steven,
The error may be caused by running out of memory. Are you analyzing all UKBB samples in this step? You may want to run the pipeline by ancestry. When the sparse GRM is estimated using all samples, samples from the same ancestry group are treated as related. For example, all AFR samples will be "related" and have non-zero relatedness coefficients in the matrix, and storing those entries takes a very large amount of memory.
Thanks,
Wei
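To give a sense of scale for the memory issue, here is a rough sketch of the arithmetic. It assumes the relatedness check looks at every pair of samples, which gives N(N-1)/2 pairs. That is a reading of the totalCombination line in the log above, not something confirmed from the SAIGE source; the logged value 119255802875 is one less than the number computed here.

```shell
#!/usr/bin/env bash
# Number of distinct sample pairs for a cohort of n samples: n * (n - 1) / 2.
# bash arithmetic is 64-bit, so this does not overflow for UK Biobank sizes.
pair_count() {
  local n=$1
  echo $(( n * (n - 1) / 2 ))
}

# N is taken from the "Ntotal: 488377" line of the failing run.
pair_count 488377   # prints 119255802876
```

With more than 10^11 candidate pairs, even a small fraction of them passing the relatedness cutoff adds up to a long list of index pairs to keep in memory.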
Dear Wei,
I used this code on the ukb 22418 hard coded files
plink --bfile ukb_cal_chr1_22_v2_merged --indep-pairwise 50 5 0.05 --out tmp1
plink --bfile ukb_cal_chr1_22_v2_merged --extract tmp1.prune.in --make-bed --out pruned
and got 220,966 markers, quite similar to your 241,660. The pruned bed file is 25.13 GiB.
Dear Wei,
I pruned the files further so that there are no AFR samples, only white British, but still not enough memory. I have to use singularity on an academic mainframe rather than docker, but I don't know whether that is the problem.
Dear Wei,
Here is the output before the job failed for lack of memory. Can you see anything wrong?
[lehres01@li03c03 UKBGWAS]$
[lehres01@li03c03 UKBGWAS]$ #!/bin/bash
[lehres01@li03c03 UKBGWAS]$ #BSUB -n 1
[lehres01@li03c03 UKBGWAS]$ #BSUB -R affinity[core(32)]
[lehres01@li03c03 UKBGWAS]$ #BSUB -P acc_UKBGWAS
[lehres01@li03c03 UKBGWAS]$ #BSUB -R rusage[mem=500GB]
[lehres01@li03c03 UKBGWAS]$ #BSUB -W 6:00
[lehres01@li03c03 UKBGWAS]$ #BSUB -J saige
[lehres01@li03c03 UKBGWAS]$ #BSUB -q express
[lehres01@li03c03 UKBGWAS]$ #BSUB -oo saige.out
[lehres01@li03c03 UKBGWAS]$ singularity run saige_1.1.6.2.sif createSparseGRM.R --plinkFile=prunedfn --outputPrefix=sparsegrm --numRandomMarkerforSparseKin=5000 --relatednessCutoff=0.05
Loading required package: optparse
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
[4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
[7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] optparse_1.7.3 SAIGE_1.1.6.2
loaded via a namespace (and not attached):
[1] compiler_3.6.3 Matrix_1.5-1 Rcpp_1.0.7 getopt_1.20.3
[5] grid_3.6.3 data.table_1.12.8 RcppParallel_5.1.5 lattice_0.20-40
$plinkFile
[1] "prunedfn"
$bedFile
[1] ""
$bimFile
[1] ""
$famFile
[1] ""
$nThreads
[1] 16
$memoryChunk
[1] 2
$outputPrefix
[1] "sparsegrm"
$numRandomMarkerforSparseKin
[1] 5000
$relatednessCutoff
[1] 0.05
$isDiagofKinSetAsOne
[1] FALSE
$minMAFforGRM
[1] 0.01
$maxMissingRateforGRM
[1] 0.15
$help
[1] FALSE
16 threads are set to be used
sparse GRM will be created
Markers in the Plink file with MAF < 0.01 will be removed before constructing GRM
Markers in the Plink file with missing rate > 0.15 will be removed before constructing GRM
write sample IDs for the sparse GRM to sparsegrm_relatednessCutoff_0.05_5000_randomMarkersUsed.sparseGRM.mtx.sampleIDs.txt
isDiagofKinSetAsOne FALSE
nbyte: 115038
nbyte: 115038
reserve: 25419929600
M: 220966, N: 460149
setgeno mark1
setgeno mark2
115697 markers with MAF >= 0.01 and missing rate <= 0.15
time: 798968
115697 markers have MAF >= 0.01
5000 genetic markers are randomly selected to decide which samples are related
Start detecting related samples for the sparse GRM
tb-ta
user system elapsed
0.001 0.000 0.000
Start creating sparse GRM
freq: 0.46852 invStd: 1.41702 SNPIdx: 24387
stdGenoMultiMarkersMat.n_rows: 5000
stdGenoMultiMarkersMat.n_cols: 460149
i,j 0,0
i,j 0,1
Ntotal: 460149
2147483647
9223372036854775807
9223372036854775807
totalCombination: 105868321025
a 79703
b 2
Killed
[lehres01@li03c03 UKBGWAS]$
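One thing that may be worth checking, judging only from the prompt shown in front of each line: the #BSUB lines look as if they were typed into an interactive shell. There they are plain comments, so the 500 GB memory request and the other options would never reach LSF, and the singularity command would run directly on the node you were logged in to. The final "Killed" printed by the shell would be consistent with that. A minimal sketch of the usual alternative saves the lines to a file and submits the file with bsub. The file name saige_sparse_grm.lsf is a made-up example; the directives and the command are copied from the post.

```shell
#!/usr/bin/env bash
# Write the job script to a file (hypothetical name) instead of typing it at the prompt.
job_script=saige_sparse_grm.lsf

cat > "${job_script}" <<'EOF'
#!/bin/bash
#BSUB -n 1
#BSUB -R affinity[core(32)]
#BSUB -P acc_UKBGWAS
#BSUB -R rusage[mem=500GB]
#BSUB -W 6:00
#BSUB -J saige
#BSUB -q express
#BSUB -oo saige.out
singularity run saige_1.1.6.2.sif createSparseGRM.R --plinkFile=prunedfn --outputPrefix=sparsegrm --numRandomMarkerforSparseKin=5000 --relatednessCutoff=0.05
EOF

# Feeding the script on standard input lets bsub pick up the #BSUB directives.
if command -v bsub >/dev/null 2>&1; then
  bsub < "${job_script}"
else
  echo "bsub not found; would run: bsub < ${job_script}"
fi
```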
Dear Steven,
Thank you for sharing the output information! It really helps. The output looks correct to me. I'm not very familiar with the BSUB commands. Does -W indicate the running time limit here? https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=o-w-1
If so, I guess the job was killed because it ran longer than 6 hours.
Here are the computation costs from my jobs for UKBB EUR samples.
The plink file has 459797 individuals with 241660 markers.
When the sample relatedness coefficient cutoff is 0.05 (--relatednessCutoff=0.05)
the job command is
Rscript createSparseGRM.R --plinkFile=./ukb.EUR.for_grm.pruned.plink --nThreads=72 --outputPrefix=./ukb.EUR --numRandomMarkerforSparseKin=5000 --relatednessCutoff=0.05 --memoryChunk=1
Time cost: 72 CPUs, 7 hours 40 mins. Memory cost: 34.6 GB
When the sample relatedness coefficient cutoff is 0.125 (--relatednessCutoff=0.125)
the job command is
Rscript createSparseGRM.R --plinkFile=./ukb.EUR.for_grm.pruned.plink --nThreads=72 --outputPrefix=./ukb.EUR --numRandomMarkerforSparseKin=2000 --relatednessCutoff=0.125 --memoryChunk=1
Time cost: 72 CPUs, 2 hours 35 mins. Memory cost: 27.8 GB
Below are the first several lines of the log.
Loading required package: optparse
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] optparse_1.6.6 SAIGE_1.0.7
loaded via a namespace (and not attached):
[1] compiler_4.2.0 Matrix_1.3-2 Rcpp_1.0.7 getopt_1.20.3
[5] grid_4.2.0 data.table_1.14.0 RcppParallel_5.0.2 lattice_0.20-45
$plinkFile
[1] "./ukb.EUR.for_grm.pruned.plink"
$bedFile
[1] ""
$bimFile
[1] ""
$famFile
[1] ""
$nThreads
[1] 72
$memoryChunk
[1] 1
$outputPrefix
[1] "./ukb.EUR"
$numRandomMarkerforSparseKin
[1] 5000
$relatednessCutoff
[1] 0.05
$isDiagofKinSetAsOne
[1] FALSE
$minMAFforGRM
[1] 0.01
$maxMissingRateforGRM
[1] 0.15
$help
[1] FALSE
72 threads are set to be used
sparse GRM will be created
Markers in the Plink file with MAF < 0.01 will be removed before constructing GRM
Markers in the Plink file with missing rate > 0.15 will be removed before constructing GRM
write sample IDs for the sparse GRM to ./ukb.EUR_relatednessCutoff_0.05_5000_randomMarkersUsed.sparseGRM.mtx.sampleIDs.txt
isDiagofKinSetAsOne FALSE
nbyte: 114950
nbyte: 114950
reserve: 27779299328
M: 241660, N: 459797
setgeno mark1
setgeno mark2
212235 markers with MAF >= 0.01 and missing rate <= 0.15 are used for GRM.
time: 1.18461e+06
212235 markers have MAF >= 0.01
5000 genetic markers are randomly selected to decide which samples are related
Start detecting related samples for the sparse GRM
tb-ta
user system elapsed
0.001 0.000 0.001
Start creating sparse GRM
freq: 0.0705398 invStd: 2.76155 SNPIdx: 24387
stdGenoMultiMarkersMat.n_rows: 5000
stdGenoMultiMarkersMat.n_cols: 459797
i,j 0,0
i,j 0,1
Ntotal: 459797
2147483647
9223372036854775807
9223372036854775807
totalCombination: 105706410705
a 80407
b 2
tp1 - tp0: 1024601 6106.11 14326.74 0 0
ni: 72679856
Thanks,
Wei
I want to run without covariates and used this code
instance_type="mem1_ssd1_v2_x4"
traitType=binary
invNormalize=FALSE
phenoCol=value
sampleIDCol=userId
pheno_file=phecode-250.2-both_sexes_wowithdrawl_with450kWES.tsv
PLINK_for_vr=ukb.EUR.for_grm.pruned.plink.forvr
sparseGRMfile=ukb.EUR_relatednessCutoff_0.05_5000_randomMarkersUsed.sparseGRM.mtx
sparseGRM_sample_file=ukb.EUR_relatednessCutoff_0.05_5000_randomMarkersUsed.sparseGRM.mtx.sampleIDs.txt
jobname=phecode-250.2_step1
workflow_id=workflow-GK3jBq8JBJkjj9676vZF2qkJ
dx run ${workflow_id} \
-istage-common.phenofile=phenonycho.pheno \
-istage-common.bedfile=PLINKforvr.bed \
-istage-common.bimfile=PLINKforvr.bim \
-istage-common.famfile=PLINKforvr.fam \
-istage-common.spGRMfile=sparseGRM.mtx \
-istage-common.spGRMSamplefile=sparseGRMsample.txt \
-istage-common.output_prefix=WEIOUT \
-istage-common.phenoCol=phenoCol \
-istage-common.traitType=traitType \
-istage-common.sampleIDCol=sampleIDCol \
-istage-common.invNormalize=invNormalize \
--folder WES:/SAIGE_GENE/step1_output/ \
--yes \
--name=WEIZHOU \
--instance-type=instance_type
I get this output
[lehres01@lc02a29 UKBGWAS]$ dx run ${workflow_id} \
> -istage-common.phenofile=phenonycho.pheno \
> -istage-common.bedfile=PLINKforvr.bed \
> -istage-common.bimfile=PLINKforvr.bim \
> -istage-common.famfile=PLINKforvr.fam \
> -istage-common.spGRMfile=sparseGRM.mtx \
> -istage-common.spGRMSamplefile=sparseGRMsample.txt \
> -istage-common.output_prefix=STEVEOUT \
> -istage-common.phenoCol=phenoCol \
> -istage-common.traitType=traitType \
> -istage-common.sampleIDCol=sampleIDCol \
> -istage-common.invNormalize=invNormalize \
> --folder WES:/SAIGE_GENE/step1_output/ \
> --yes \
> --name=STEVE \
> --instance-type=instance_type
Input: stage-common.covariatesList (stage-common.covariatesList)
Class: string
Enter string value ('?' for more options)
stage-common.covariatesList:
What do I do?
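A side note on the command itself, separate from the covariatesList prompt: an argument such as -istage-common.phenoCol=phenoCol passes the literal word phenoCol, not the value assigned to that variable a few lines earlier. In bash a variable is only substituted when it is written as $phenoCol or ${phenoCol}. A small sketch of the difference, reusing the variable names and values from the post:

```shell
#!/usr/bin/env bash
# Variable values copied from the post.
phenoCol=value
instance_type="mem1_ssd1_v2_x4"

# Without "$" the text is passed through unchanged.
literal_arg="-istage-common.phenoCol=phenoCol"

# With "${...}" bash substitutes the variable's value.
expanded_arg="-istage-common.phenoCol=${phenoCol}"
expanded_instance="--instance-type=${instance_type}"

echo "${literal_arg}"
echo "${expanded_arg}"
echo "${expanded_instance}"
```

The same applies to traitType, sampleIDCol and invNormalize. As for the prompt, it suggests that the compiled workflow treats covariatesList as a required input, so dx run asks for it when it is not supplied.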
When I started you were at SAIGE_1.0.9.tar.gz. You're now at saige:1.1.6.3. Do I need to create new workflow files? Any other new files?
I get this error
A job in the job tree named "null" running function "main" of "null" failed because of AppInternalError: Could not find dx://WES:/docker_images/saige_1.0.9.tar.gz in any of Vector(DxProject(project-GJB3GpQJBJkYK0j74jp4vJZ9))
even though I have the folder and file
WES:/docker_images/saige_1.0.9.tar.gz
[lehres01@lc02a28 UKBGWAS]$ dx run ${workflow_id} \
> -istage-common.phenofile=phenonycho.pheno \
> -istage-common.bedfile=PLINKforvr.bed \
> -istage-common.bimfile=PLINKforvr.bim \
> -istage-common.famfile=PLINKforvr.fam \
> -istage-common.spGRMfile=sparseGRM.mtx \
> -istage-common.spGRMSamplefile=sparseGRMsample.txt \
> -istage-common.output_prefix=WEIOUT \
> -istage-common.phenoCol=phenoCol \
> -istage-common.traitType=traitType \
> -istage-common.covariatesList=covariatesList \
>
Input: stage-common.invNormalize (stage-common.invNormalize)
Class: string
Enter string value ('?' for more options)
stage-common.invNormalize: --folder WES:/SAIGE_GENE/step1_output/ \
Input: stage-common.qCovarColList (stage-common.qCovarColList)
Class: string
Enter string value ('?' for more options)
stage-common.qCovarColList: --yes \
Input: stage-common.sampleIDCol (stage-common.sampleIDCol)
Class: string
Enter string value ('?' for more options)
stage-common.sampleIDCol: --name=WEIZHOU \
Using input JSON:
{
"stage-common.output_prefix": "WEIOUT",
"stage-common.phenoCol": "phenoCol",
"stage-common.traitType": "traitType",
"stage-common.covariatesList": "covariatesList",
"stage-common.phenofile": {
"$dnanexus_link": {
"project": "project-GJB3GpQJBJkYK0j74jp4vJZ9",
"id": "file-GK3jQG0JBJkZFqq56Yv0Xyz7"
}
},
"stage-common.bedfile": {
"$dnanexus_link": {
"project": "project-GJB3GpQJBJkYK0j74jp4vJZ9",
"id": "file-GK3jZb0JBJkX74yp6XBqXxk3"
}
},
"stage-common.bimfile": {
"$dnanexus_link": {
"project": "project-GJB3GpQJBJkYK0j74jp4vJZ9",
"id": "file-GK3jbKjJBJkQvV5G6xBQP0V5"
}
},
"stage-common.famfile": {
"$dnanexus_link": {
"project": "project-GJB3GpQJBJkYK0j74jp4vJZ9",
"id": "file-GK3jbb0JBJkk3YGQ78010Z88"
}
},
"stage-common.spGRMfile": {
"$dnanexus_link": {
"project": "project-GJB3GpQJBJkYK0j74jp4vJZ9",
"id": "file-GK3jk40JBJkQ7J026Xbj1X53"
}
},
"stage-common.spGRMSamplefile": {
"$dnanexus_link": {
"project": "project-GJB3GpQJBJkYK0j74jp4vJZ9",
"id": "file-GK3jkB0JBJkxp2yK6g73Q1pb"
}
},
"stage-common.invNormalize": " --folder WES:/SAIGE_GENE/step1_output/ \\",
"stage-common.qCovarColList": " --yes \\",
"stage-common.sampleIDCol": " --name=WEIZHOU \\"
}
Confirm running the executable with this input [Y/n]: --instance-type=instance_typeY
Error: unrecognized response
Confirm running the executable with this input [Y/n]: Y
Calling workflow-GK40PF0JBJkxF1kQ6Vz124z4 with output destination
project-GJB3GpQJBJkYK0j74jp4vJZ9:/
Analysis ID: analysis-GK40X38JBJkyxfJZ770z3VYg
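Looking at the transcript, the empty ">" line after covariatesList seems to have ended the command early: a backslash continues a command onto the next line, and a blank line then finishes it. dx run started with only the inputs given so far, and the remaining pasted lines were consumed as answers to its prompts, which is how "--folder WES:/SAIGE_GENE/step1_output/ \" ended up as the value of invNormalize. One way to make pasting less fragile, sketched here, is to collect the arguments in a bash array so that no line continuations are needed. The input values are taken from the posts above, only a few of the inputs are shown, and the final call is printed rather than executed.

```shell
#!/usr/bin/env bash
# Collect dx run arguments in an array; blank lines inside ( ) are harmless.
workflow_id=workflow-GK40PF0JBJkxF1kQ6Vz124z4   # ID shown in the transcript
invNormalize=FALSE
sampleIDCol=userId

dx_args=(
  -istage-common.output_prefix=WEIOUT

  -istage-common.invNormalize="${invNormalize}"
  -istage-common.sampleIDCol="${sampleIDCol}"
  --folder "WES:/SAIGE_GENE/step1_output/"
  --yes
)

# Print the command for inspection; remove "echo" to actually submit it.
echo dx run "${workflow_id}" "${dx_args[@]}"
```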
The saige file I was using was SAIGE_1.0.9.tar.gz. I renamed it to lower case, saige_1.0.9.tar.gz, and now get this error
A job in the job tree named "null" running function "main" of "null" failed because of AppInternalError: Error running command Command: docker load --input /tmp/docker-tarballs958114435221618809/saige_1.0.9.tar.gz Return Code: 1 STDOUT: Some() STDERR: Some(open /var/lib/docker/tmp/docker-import-159484564/SAIGE/json: no such file or directory )
I get this error
A job in the job tree named "null" running function "main" of "null" failed because of AppInternalError: job script function run_command exited with permanent fail code 1 Loading required package: optparse Warning message: In getopt(spec = spec, opt = args) : long flag invNormalize given a bad argument Error in fitNULLGLMM(plinkFile = opt$plinkFile, bedFile = opt$bedFile, : ERROR! column for does not exist in the phenoFile In addition: Warning message: In data.table:::fread(phenoFile, header = T, stringsAsFactors = FALSE, : Column name (colClasses[[1]][1]) not found Execution halted
with this code
instance_type="mem1_ssd1_v2_x4"
traitType=binary
invNormalize=FALSE
phenoCol=value
covariatesList=sex,age
sampleIDCol=userId
pheno_file=phecode-250.2-both_sexes_wowithdrawl_with450kWES.tsv
PLINK_for_vr=ukb.EUR.for_grm.pruned.plink.forvr
sparseGRMfile=ukb.EUR_relatednessCutoff_0.05_5000_randomMarkersUsed.sparseGRM.mtx
sparseGRM_sample_file=ukb.EUR_relatednessCutoff_0.05_5000_randomMarkersUsed.sparseGRM.mtx.sampleIDs.txt
jobname=phecode-250.2_step1
workflow_id=workflow-GK40kb8JBJkjggjy701P8ZxF
dx run ${workflow_id} \
-istage-common.phenofile=phenonycol.pheno \
-istage-common.bedfile=PLINKforvr.bed \
-istage-common.bimfile=PLINKforvr.bim \
-istage-common.famfile=PLINKforvr.fam \
-istage-common.spGRMfile=sparseGRM.mtx \
-istage-common.spGRMSamplefile=sparseGRMsample.txt \
-istage-common.output_prefix=WEIOUT \
-istage-common.phenoCol=phenoCol \
-istage-common.traitType=traitType \
-istage-common.covariatesList=covariatesList \
Can anybody help me?
Now I get this error
A job in the job tree named "null" running function "main" of "null" failed because of AppInternalError: job script function run_command exited with permanent fail code 1 Loading required package: optparse Error in seq.default(1, nrow(mmat_nomissing), by = 1) : wrong sign in by argument Calls: fitNULLGLMM -> seq -> seq.default Execution halted
for this code
instance_type="mem1_ssd1_v2_x4"
traitType=binary
invNormalize=FALSE
phenoCol=phenoCol
sampleIDCol=IID
pheno_file=phenonycol.pheno
PLINK_for_vr=PLINKforvr
sparseGRMfile=sparseGRM.mtx
sparseGRM_sample_file=sparseGRMsample.txt
jobname=phecode-250.2_step1
workflow_id=workflow-GK40kb8JBJkjggjy701P8ZxF
dx run ${workflow_id} \
-istage-common.phenofile=${pheno_file} \
-istage-common.bedfile=${PLINK_for_vr}.bed \
-istage-common.bimfile=${PLINK_for_vr}.bim \
-istage-common.famfile=${PLINK_for_vr}.fam \
-istage-common.spGRMfile=${sparseGRMfile} \
-istage-common.spGRMSamplefile=${sparseGRM_sample_file} \
-istage-common.output_prefix=${outputPrefix} \
-istage-common.phenoCol=${phenoCol} \
-istage-common.traitType=${traitType} \
-istage-common.covariatesList=${covariatesList} \
-istage-common.qCovarColList=${qCovarColList} \
-istage-common.sampleIDCol=${sampleIDCol} \
-istage-common.invNormalize=${invNormalize} \
--folder WES:\
--yes \
--name=${jobname} \
--instance-type=${instance_type}
Can anybody help me?
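Two observations that might help, neither confirmed against the SAIGE source. First, in R seq(1, n, by = 1) raises "wrong sign in 'by' argument" when n is 0, so mmat_nomissing may have ended up with no rows, for example if no sample IDs matched or every sample had a missing value. Second, the script above uses ${outputPrefix}, ${covariatesList} and ${qCovarColList} without assigning them, so bash passes empty strings for those inputs. A minimal sketch of a guard that stops before dx run when a variable is unset or empty; the helper name require_vars is made up for this example.

```shell
#!/usr/bin/env bash
# Return non-zero and name the first variable that is unset or empty.
require_vars() {
  local name
  for name in "$@"; do
    if [ -z "${!name:-}" ]; then
      echo "missing value for: ${name}" >&2
      return 1
    fi
  done
}

# Values copied from the post; outputPrefix is deliberately left unset here.
phenoCol=phenoCol
sampleIDCol=IID
unset outputPrefix

if require_vars phenoCol sampleIDCol outputPrefix; then
  echo "all variables set; safe to call dx run"
else
  echo "fix the variables above before calling dx run"
fi
```

It may also be worth confirming that the column named by sampleIDCol exists in the phenotype file and that its IDs overlap with those in the sparse GRM sample file.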