How can I efficiently analyze the very large raw accelerometry data on RAP?

27 October 2023 00:00
3 comments

Dear professors, I am trying to use the accelerometer tool (https://github.com/OxWearables/biobankAccelerometerAnalysis) to batch process all the cwa files of the accelerometry data in RAP, for example every cwa file in the Bulk/Activity/Raw/10/ directory, so as to obtain the output csv files of time-series data at 30-second intervals. However, I am unsure how to apply the accelerometer tool to the very large set of cwa files inside the Bulk/Activity/Raw folder. Because the data volume is so large, I would prefer to run the analysis directly in JupyterLab or R on the RAP. Looking forward to your help, thank you so much!

Comments

Rachael W UKB Community team Data Analyst
- 28 October 2023 22:38
The tool can be used within a single-node JupyterLab. For more information on JupyterLab, see https://dnanexus.gitbook.io/uk-biobank-rap/working-on-the-research-analysis-platform/using-jupyterlab-on-the-research-analysis-platform

I think it is necessary to install Java in the JupyterLab environment first, and then to install the accelerometer tool.
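Before processing anything, it may help to confirm that the required command-line tools are visible from the notebook. A minimal sketch, assuming the tools are called `java`, `accProcess` and `dx` on the PATH (the helper name `missing_tools` is hypothetical):

```python
import shutil


def missing_tools(tools=("java", "accProcess", "dx")):
    """Return the names of the tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Prints an empty list when every tool was found.
print(missing_tools())
```

An empty list means all three were found; a name in the list suggests that the install step or the PATH setting needs another look.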

It is necessary to copy the cwa files from your main project storage into the JupyterLab storage. After processing, it is necessary to copy the results from the JupyterLab storage into your main project storage.
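The round trip described here (project storage to instance storage, process, results back to project storage) could be sketched in Python as follows. This is only a sketch under assumptions: the helper names `round_trip_commands` and `run_round_trip` are hypothetical, the local results folder is assumed to be called `accOut`, and `dx download` is assumed to save the file under its own name in the current working directory.

```python
import subprocess
from pathlib import PurePosixPath


def round_trip_commands(remote_cwa, out_dir="accOut"):
    """Build the three commands for one cwa file as argument lists."""
    # dx download keeps the object's name, so the local file is the basename.
    local_cwa = PurePosixPath(remote_cwa).name
    return [
        ["dx", "download", remote_cwa],                        # copy in
        ["accProcess", local_cwa, "--outputFolder", out_dir],  # process
        ["dx", "upload", "-r", out_dir],                       # copy results out
    ]


def run_round_trip(remote_cwa, out_dir="accOut"):
    """Run the three commands in order, stopping at the first failure."""
    for command in round_trip_commands(remote_cwa, out_dir):
        subprocess.run(command, check=True)


# Hypothetical usage; replace NNNNNNN with the ID of a file that is present:
# run_round_trip("/Bulk/Activity/Raw/10/NNNNNNN_90001_4_0.cwa")
```

Building the commands separately from running them makes it easy to print and check them before anything is downloaded.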

One way to do this is to start a JupyterLab instance, create a Python notebook, and use bash commands by prefixing with "!". See the example code and output in the images below.

It might be better to open a Terminal instead of a Python notebook.
It might be better to use a docker image.
It might be possible to control your workflow using the CLI instead of a notebook.

If anyone knows a better way, please post another answer.

Rachael W UKB Community team Data Analyst
- 28 October 2023 23:20
The images are a bit hard to read. Here is a copy of the commands:

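# Create a working folder and move into it.
# Each "!" command runs in its own subshell, so "!cd" would not persist; %cd does.
!mkdir -p test_baa3
%cd test_baa3
# Create a virtual environment. Note that "!source baa3/bin/activate" only affects its own
# subshell, so the pip installs below still go into the notebook's default environment.
!python -m venv baa3
!source baa3/bin/activate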

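# Install the install-jdk package, which is used below to install Java
!pip install install-jdk

# Import the installer
import jdk

# Import the enums used to select the target platform
from jdk.enums import OperatingSystem, Architecture

# Install a Java 11 runtime (JRE) for Linux
jdk.install('11', jre=True, operating_system=OperatingSystem.LINUX)

# Point JAVA_HOME and PATH at the installed JRE.
# The folder name depends on the release that was installed, so check it under /home/dnanexus/.jre/
import os
jdk_version = 'jdk-11.0.21+9-jre'
os.environ['JAVA_HOME'] = f'/home/dnanexus/.jre/{jdk_version}'
os.environ['PATH'] = f"{os.environ.get('PATH')}:{os.environ.get('JAVA_HOME')}/bin"

# Show the operating system detected by install-jdk
print(jdk.OS)

# Show the CPU architecture detected by install-jdk
print(jdk.ARCH)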

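# Confirm JAVA_HOME. The os.environ assignment is what later "!" commands inherit;
# "!export JAVA_HOME=..." would only affect its own subshell, so it is not needed.
print(os.environ['JAVA_HOME'])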

# Obtain the platform-dependent JRE download URL
download_url = jdk.get_download_url('11', jre=True)
print(download_url)

# Download the JRE archive from that URL
download_file = jdk.download('https://api.adoptium.net/v3/binary/latest/11/ga/linux/x64/jre/hotspot/normal/eclipse', version=jdk_version)
print(download_file)

# Check that Java is now available
!java --version

# Install the accelerometer tool
!pip install accelerometer

# Test the installation on a sample file
!wget -P data/ http://gas.ndph.ox.ac.uk/aidend/accModels/sample.cwa.gz # download a sample file
!accProcess data/sample.cwa.gz
!accPlot data/sample-timeSeries.csv.gz

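# Copy one of the real UKB cwa files from project storage to instance storage
!dx download /Bulk/Activity/Raw/10/NNNNNNN_90001_4_0.cwa  # replace NNNNNNN with the ID of a file that is present

# Process the real cwa file and put the result files in a new folder, accOut.
# dx download saves the file in the current working directory, so a relative path is used here.
!accProcess NNNNNNN_90001_4_0.cwa --outputFolder accOut

# Copy the results from instance storage back to project storage:
# the sample outputs in data/ and the outputs for the real file in accOut/
!dx upload data -r
!dx upload accOut -r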
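
For the original question (every cwa file in a folder such as Bulk/Activity/Raw/10/), the single-file commands could be wrapped in a loop. The following is a rough sketch rather than a tested recipe. It assumes that `dx ls` prints one name per line, that instance storage cannot hold all the raw files at once (so each cwa file is deleted after processing), and that all outputs go to a local folder named `accOut`. The helper names `cwa_names` and `process_folder` are hypothetical.

```python
import os
import subprocess


def cwa_names(ls_output):
    """Pick the .cwa file names out of the text printed by `dx ls`."""
    names = [line.strip() for line in ls_output.splitlines()]
    return sorted(name for name in names if name.endswith(".cwa"))


def process_folder(remote_folder, out_dir="accOut", limit=None):
    """Download, process and delete each cwa file in turn, then upload the results."""
    listing = subprocess.run(
        ["dx", "ls", remote_folder], check=True, capture_output=True, text=True
    ).stdout
    # limit=None processes every file; a small limit is useful for a trial run.
    for name in cwa_names(listing)[:limit]:
        remote_path = remote_folder.rstrip("/") + "/" + name
        subprocess.run(["dx", "download", remote_path], check=True)
        subprocess.run(["accProcess", name, "--outputFolder", out_dir], check=True)
        os.remove(name)  # free instance storage before the next file
    subprocess.run(["dx", "upload", "-r", out_dir], check=True)


# Hypothetical usage: try two files first, then remove the limit.
# process_folder("/Bulk/Activity/Raw/10/", limit=2)
```

Running a couple of files first gives an idea of the time and disk space needed per file before committing to a whole folder.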

Former User of DNAx Community_75
- 29 October 2023 10:39
Dear Prof. Rachael W, thank you so much for sharing the above. Wishing you all the best!
