What would be the cost of running a multilevel linear regression model with a categorical exposure, a continuous outcome, and 15 covariates on the 500,000-participant dataset, including analytical weights and with missing values imputed by chained equations?
Comments
Hello,
Can you please provide more details about your analysis? What data are you planning to use? In what programming language is your code written?
Hi, I would like to do the analysis in Stata, so the code will mostly be Stata. I plan to use individual information (age, sex, BMI), comorbidity data, and some exposure data derived from the genetic datasets (generating categorical, ordinal, and continuous exposures and covariates).
An example might be a categorical exposure derived from functional classification of SNPs, with about 10-15 ordinal, categorical, and continuous covariates, on a continuous outcome.
This is not a straightforward question to answer, and the answer will depend on the context (see the list of parameters to consider). Final running costs can only be estimated with confidence through a test run. I therefore suggest that you run your analysis on 1000 participants and use the result to estimate the cost for the full set of 500000.
You will need to take several parameters into consideration, for example:
Please see a few links to documentation pages:
But should these parameters not depend on the information I gave you (software tool etc., indicated above)? I'm not a bioinformatician, so I do not know how this translates into these parameters. How can I do a test run before I have access to the platform? For access I need to write a grant proposal to get the money. For the proposal I must be able to estimate the costs in advance.
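The extrapolation suggested in the reply above (time a pilot on 1000 participants, then scale to 500000) can be sketched roughly as follows. This is only an illustration in Python, not the platform's pricing method: the function name `estimate_cost`, the hourly rate, the scaling exponent, and the safety factor are all hypothetical placeholders. Replace them with the real instance price and with whatever scaling a second, larger pilot suggests. Runtime of multiple imputation plus a mixed model may not grow exactly linearly with sample size, which is why the exponent is left adjustable.

```python
def estimate_cost(pilot_seconds, pilot_n, full_n, hourly_rate,
                  exponent=1.0, safety_factor=1.5):
    """Extrapolate a pilot run's wall-clock time to a full-data cost.

    pilot_seconds : measured runtime of the whole pipeline (imputation
                    plus model fit) on the pilot subset
    pilot_n       : number of participants in the pilot (e.g. 1000)
    full_n        : number of participants in the full run (e.g. 500000)
    hourly_rate   : price per instance-hour (hypothetical; take the real
                    figure from the platform's rate card)
    exponent      : assumed scaling of runtime with sample size
                    (1.0 = linear; an assumption, not a measured value)
    safety_factor : margin for reruns and failed jobs (assumption)
    """
    if pilot_n <= 0 or full_n <= 0:
        raise ValueError("participant counts must be positive")
    scale = (full_n / pilot_n) ** exponent
    full_hours = pilot_seconds * scale / 3600.0
    return full_hours * hourly_rate * safety_factor


if __name__ == "__main__":
    # Made-up pilot timing and rate, for illustration only.
    cost = estimate_cost(pilot_seconds=72, pilot_n=1000, full_n=500000,
                         hourly_rate=1.0)
    print(f"Estimated cost in rate-card currency units: {cost:.2f}")
```

For a grant proposal, it may be worth reporting the estimate under two or three exponents (for example 1.0 and 1.3) to give a range rather than a single figure.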