6  Multivariate analyses

First, load the required packages and data. Note that the univariate analyses updated the analysis data set for the lab parameters to include transformed variables. We therefore load the second iteration of the data, data/ADLB_02.rds.

6.1 V1: Association with structural variables

Attach the required structural variables to the lab data.

Scatterplots of each predictor against age have been constructed, with separate panels for males and females. The associated Spearman correlation coefficients have been computed.

The remaining predictors are reported in appendix Section F.1.

6.1.1 Key predictors

6.1.2 Predictors of medium importance

6.2 V2: Correlation coefficients between all predictors

Calculate correlation matrix using Spearman correlation coefficient.
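As an illustration of this step, the following is a minimal standard-library Python sketch of a Spearman correlation matrix, computed as the Pearson correlation of average ranks. It is not the code used for this report. The helper names (`average_ranks`, `pearson`, `spearman`, `correlation_matrix`) and the toy data are assumptions made for the example, and the sketch assumes complete data without missing values.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def correlation_matrix(data, corr):
    """Symmetric matrix (dict of dicts) of pairwise correlations."""
    names = list(data)
    matrix = {a: {b: 1.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        matrix[a][b] = matrix[b][a] = corr(data[a], data[b])
    return matrix


# Toy data for three hypothetical variables (not taken from the lab data).
toy = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [1.0, 4.0, 9.0, 16.0, 25.0],  # monotone in A
    "C": [5.0, 3.0, 4.0, 1.0, 2.0],
}
spearman_matrix = correlation_matrix(toy, spearman)
```

Passing `pearson` instead of `spearman` to `correlation_matrix` gives the parametric counterpart on the same data.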

The Spearman correlation coefficients are depicted in a square heat map:

6.2.1 VE1: Comparing nonparametric and parametric predictor correlation

Plot the matrix of differences between Spearman and Pearson pairwise correlation coefficients.

Plot the matrix of differences between Spearman and Pearson pairwise correlation coefficients, but suppress differences of less than 0.1 in absolute value.

Predictor pairs for which Spearman and Pearson correlation coefficients differ by more than 0.1 correlation units in absolute value will be depicted in scatterplots. First report the table of predictor pairs where the difference (column r) is greater than 0.1.

x       y       r
PLT     LYMR    0.12
MONO    MONOR   0.18
BASO    EOS_T   0.12
BUN     AGE     0.13
EOSR    LYMR    0.18
EOSR    EOS_T   0.21
EOSR    LYM_T   0.14
LYMR    MONOR   0.11
LYMR    EOS_T   0.17
LYMR    LYM_T   0.22
MONOR   LYM_T   0.13
NEU     NEUR    0.11
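The selection of such pairs can be sketched as follows. This is a minimal standard-library Python illustration, not the code used for this report. It assumes the two correlation matrices are already available as nested dictionaries. The function name `flag_discrepant_pairs`, the predictor names P1 to P3 and all coefficients are made up for the example.

```python
from itertools import combinations


def flag_discrepant_pairs(spearman, pearson, threshold=0.1):
    """Return (x, y, |difference|) for pairs whose coefficients differ by more than threshold."""
    flagged = []
    for x, y in combinations(list(spearman), 2):
        diff = abs(spearman[x][y] - pearson[x][y])
        if diff > threshold:
            flagged.append((x, y, round(diff, 2)))
    return flagged


# Made-up coefficients for three hypothetical predictors (illustration only).
spearman_toy = {
    "P1": {"P1": 1.0, "P2": 0.62, "P3": -0.10},
    "P2": {"P1": 0.62, "P2": 1.0, "P3": 0.35},
    "P3": {"P1": -0.10, "P2": 0.35, "P3": 1.0},
}
pearson_toy = {
    "P1": {"P1": 1.0, "P2": 0.45, "P3": -0.08},
    "P2": {"P1": 0.45, "P2": 1.0, "P3": 0.30},
    "P3": {"P1": -0.08, "P2": 0.30, "P3": 1.0},
}

flagged = flag_discrepant_pairs(spearman_toy, pearson_toy)
```

Each unordered pair is visited once, so a flagged pair appears in a single row, as in the table above.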

Then draw the scatterplots.

6.2.2 VE2: Variable clustering

A variable clustering analysis has been performed to evaluate which predictors are closely associated. The dendrogram groups predictors by their correlation.
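To illustrate how a dendrogram of this kind can be built, the sketch below runs a small agglomerative clustering in standard-library Python. It is not the code used for this report. The dissimilarity (one minus the squared Spearman correlation) and the complete-linkage rule are assumptions chosen for the example, as the report does not state which were used. The function name `complete_linkage_merges`, the predictor names P1 to P3 and the coefficients are hypothetical.

```python
def complete_linkage_merges(names, distance):
    """Agglomerative clustering; returns the merge history as (cluster_a, cluster_b, height)."""
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: the distance between two clusters is the
                # largest distance between any pair of their members.
                height = max(distance[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or height < best[0]:
                    best = (height, i, j)
        height, i, j = best
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Made-up Spearman coefficients for three hypothetical predictors.
rho = {
    "P1": {"P1": 1.0, "P2": 0.9, "P3": 0.2},
    "P2": {"P1": 0.9, "P2": 1.0, "P3": 0.3},
    "P3": {"P1": 0.2, "P2": 0.3, "P3": 1.0},
}

# Assumed dissimilarity: 1 - rho^2, so strongly correlated predictors are close.
distance = {a: {b: 1.0 - rho[a][b] ** 2 for b in rho} for a in rho}

merges = complete_linkage_merges(list(rho), distance)
```

Each entry of `merges` corresponds to one join in the dendrogram, with the height at which the two clusters are merged. In the toy data, P1 and P2 join first, at a height of about 0.19.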

The clustering can also be displayed as a network plot:

In the following scatterplots we show predictor pairs with Spearman correlation coefficients greater than 0.8:

6.2.3 VE3: Redundancy

Variance inflation factors (VIFs) will be computed for the candidate predictors. This will be done for the three possible candidate models, using all complete cases in the respective candidate predictor sets. Since \(VIF = (1-R^2)^{-1}\), where \(R^2\) is the multiple R-squared from regressing a predictor on the remaining predictors, we also report the multiple R-squared values.
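The relation between the two reported quantities can be checked with a few lines of Python. This is a minimal sketch of the formula only. The function names `vif_from_r2` and `r2_from_vif` are introduced here for illustration and are not part of the analysis code.

```python
def vif_from_r2(r_squared):
    """Variance inflation factor implied by a multiple R-squared, VIF = 1 / (1 - R^2)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R-squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def r2_from_vif(vif):
    """Multiple R-squared implied by a variance inflation factor, R^2 = 1 - 1 / VIF."""
    if vif < 1.0:
        raise ValueError("VIF must be at least 1")
    return 1.0 - 1.0 / vif


# An R-squared of 0.5 doubles the variance; an R-squared of 0.9 inflates it tenfold.
vif_half = vif_from_r2(0.5)
vif_high = vif_from_r2(0.9)
```

Because the tables report both quantities rounded, recomputing one from the other reproduces the tabulated value only approximately.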

Redundancy was further explored by computing parametric additive models for each predictor in the key predictor model and the extended predictor model. VIFs and multiple \(R^2\) are reported from those models, again for the three predictor sets.

Note that the results for the all-predictor model are reported in appendix Section F.2.

6.2.3.1 VIF for key predictor model

The available sample size is 13793 (1.44 %).

Parameter code   Variance inflation factor   Multiple R-squared
SEX              1.1                         0.05
PLT              1.2                         0.15
BUN              2.4                         0.58
NEU              5.4                         0.82
AGE              1.1                         0.09
CREA_T           2.3                         0.56
WBC_T            5.6                         0.82

6.2.3.2 VIF for model with key predictors and predictors of medium importance

The available sample size is 9389 (0.98 %).

Parameter code   Variance inflation factor   Multiple R-squared
SEX              1.1                         0.09
PLT              1.4                         0.30
FIB              2.3                         0.56
POTASS           1.1                         0.10
BUN              2.6                         0.61
CRP              2.2                         0.54
NEU              5.9                         0.83
AGE              1.2                         0.15
ALAT_T           3.3                         0.69
ASAT_T           3.2                         0.69
CREA_T           2.4                         0.58
GGT_T            1.4                         0.30
WBC_T            6.0                         0.83

6.2.3.3 VIF for all predictor model

See appendix Section F.2.

6.2.3.4 Redundancy by parametric additive model: key predictor model

The available sample size is 13793 (1.44 %).

Parameter code   Variance inflation factor   Multiple R-squared
SEX              1.1                         0.11
PLT              1.2                         0.16
BUN              2.6                         0.61
NEU              11.5                        0.91
AGE              1.3                         0.21
CREA_T           2.4                         0.58
WBC_T            13.7                        0.93

6.2.3.5 Redundancy by parametric additive model: key predictors and predictors of medium importance

The available sample size is 9389 (0.98 %).

Parameter code   Variance inflation factor   Multiple R-squared
SEX              1.2                         0.15
PLT              1.5                         0.32
FIB              2.4                         0.58
POTASS           1.1                         0.11
BUN              2.8                         0.65
CRP              2.3                         0.56
NEU              14.1                        0.93
AGE              1.4                         0.29
ALAT_T           3.4                         0.70
ASAT_T           3.5                         0.71
CREA_T           2.5                         0.60
GGT_T            1.5                         0.35
WBC_T            16.4                        0.94