Proteomics data values from dx extract (NPX value clarification for differential analysis)
Hello,
Apologies, I'm fairly new to the UK Biobank platform and analysis.
- I've extracted 2924 proteins measured for 53013 participants with available proteomics data in UKBB, using the script provided here: https://github.com/dnanexus/UKB_RAP/tree/main/proteomics (0_extract_phenotype_protein_data.ipynb)
- Instance 0: Initial assessment visit (2006-2010), at which participants were recruited and consent given; 53,014 participants, 53,014 items (https://biobank.ndph.ox.ac.uk/showcase/field.cgi?id=30900)
- There are 2923 proteins for 53013 participants, with values ranging from -9.66 to 13.44 NPX and centred around 0.
It really looks like the raw NPX data have been transformed to a z-scale. As far as I understand, differential expression analysis with limma should be done on raw (untransformed) NPX values, as shown in UKB_RAP/proteomics/protein_DE_analysis/2_differential_expression_analysis.ipynb on the main branch of dnanexus/UKB_RAP, but the values provided in the 53013 x 2923 protein matrix look like z-scaled NPX values.
- Could you please confirm whether these values are normalized raw NPX data or z-scaled? If they are z-scaled, is it possible to access the untransformed data?
- Is differential expression analysis (e.g. using limma) in the UKB_RAP repo based on raw NPX values, or are z-scaled values used?
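For anyone wanting to check this on their own extract, here is a minimal sketch of one possible test. If the values had been z-scaled per protein, each protein column should have a mean close to 0 and a standard deviation close to 1. Values that are only centred would show means near 0 but standard deviations that differ between proteins. The function names, the tolerance, and the toy data below are illustrative assumptions and are not part of the UKB_RAP notebooks.

```python
import math
import statistics


def column_summary(values):
    """Return (mean, sample SD) for one protein, skipping missing entries."""
    observed = [v for v in values if v is not None and not math.isnan(v)]
    return statistics.fmean(observed), statistics.stdev(observed)


def fraction_z_scaled(data, tol=0.05):
    """Fraction of proteins whose mean is ~0 AND whose SD is ~1 (within tol)."""
    hits = 0
    for values in data.values():
        mean, sd = column_summary(values)
        if abs(mean) <= tol and abs(sd - 1.0) <= tol:
            hits += 1
    return hits / len(data)


# Toy data invented for illustration only (not UKB values).
toy = {
    "protein_a": [-1.0, 0.0, 1.0],        # mean 0, SD 1: consistent with z-scaling
    "protein_b": [-3.0, 0.0, 3.0, None],  # mean 0, SD 3: centred, not unit variance
}

print(fraction_z_scaled(toy))
```

A fraction close to 1 across all proteins would be consistent with per-protein z-scaling, while a low fraction would suggest the values are centred but not scaled to unit variance. This is only a heuristic and does not replace confirmation from the data documentation.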
Any clarification or advice would be greatly appreciated.
Kind regards,
John
Comments
I have the same question. There are no negative values in the provided examples. I am unsure whether the resource https://biobank.ndph.ox.ac.uk/showcase/refer.cgi?id=4656 can help me understand the normalization. I also don't know how to handle the limit of detection.