Chapter 7 Univariate distribution checks
This section reports a series of univariate summary checks of the CRASH-2 dataset.
7.1 Data set overview
Using the Hmisc describe function, we provide an overview of the data set. The descriptive report also provides histograms of continuous variables. For ease of scanning the information, we group the report by measurement type.
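A grouped report of this kind could be produced along the following lines. This is a minimal sketch, not the code behind the tables in this chapter: the data frame `toy` is simulated for illustration, and only the variable names (`age`, `sex`, `sbp`, `hr`) and their labels are borrowed from the report.

```r
library(Hmisc)

# Simulated stand-in for the trial data (hypothetical values, not CRASH-2).
set.seed(1)
toy <- data.frame(
  age = sample(16:90, 200, replace = TRUE),
  sex = factor(sample(c("male", "female"), 200, replace = TRUE)),
  sbp = round(rnorm(200, mean = 100, sd = 25)),
  hr  = round(rnorm(200, mean = 105, sd = 20))
)
toy$age[c(3, 50)] <- NA  # a few missing values, so the "missing" column is non-zero

# Variable labels are shown next to the variable names in the report.
label(toy$age) <- "Age"
label(toy$sex) <- "Sex"
label(toy$sbp) <- "Systolic Blood Pressure"
label(toy$hr)  <- "Heart Rate"

# One describe() call per measurement type; 'descript' sets the report title.
demog_report  <- describe(toy[, c("age", "sex")], descript = "Demographic variables")
physio_report <- describe(toy[, c("sbp", "hr")],  descript = "Physiological measurements")

print(demog_report)
print(physio_report)
```

Splitting the data frame by measurement type before calling `describe()` is one simple way to obtain a separate report per group.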
7.1.1 Demographic variables
Demographic variables
2 Variables 20207 Observations
age: Age years
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
20203 | 4 | 84 | 0.999 | 34.56 | 15.55 | 18 | 19 | 24 | 30 | 43 | 55 | 64 |
sex: Sex
n | missing | distinct |
---|---|---|
20206 | 1 | 2 |
Value | male | female |
---|---|---|
Frequency | 16935 | 3271 |
Proportion | 0.838 | 0.162 |
7.1.2 Physiological measurements
Physiological measurements
5 Variables 20207 Observations
sbp: Systolic Blood Pressure mmHg
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
19887 | 320 | 173 | 0.989 | 98.45 | 27.86 | 60 | 70 | 80 | 95 | 110 | 130 | 143 |
hr: Heart Rate /min
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
20070 | 137 | 173 | 0.996 | 104.5 | 23.38 | 70 | 80 | 90 | 105 | 120 | 130 | 140 |
rr: Respiratory Rate /min
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
20016 | 191 | 68 | 0.99 | 23.06 | 7.052 | 14 | 16 | 20 | 22 | 26 | 30 | 35 |
gcs: Glasgow Coma Score Total points
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
20184 | 23 | 13 | 0.863 | 12.47 | 3.594 | 4 | 6 | 11 | 15 | 15 | 15 | 15 |
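Value | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Frequency | 784 | 520 | 441 | 584 | 733 | 576 | 504 | 663 | 586 | 951 | 1356 | 2140 | 10346 |
Proportion | 0.039 | 0.026 | 0.022 | 0.029 | 0.036 | 0.029 | 0.025 | 0.033 | 0.029 | 0.047 | 0.067 | 0.106 | 0.513 |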
cc: Central Capillary Refill Time s
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
19596 | 611 | 20 | 0.945 | 3.267 | 1.67 | 1 | 2 | 2 | 3 | 4 | 5 | 6 |
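Value | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Frequency | 1510 | 5328 | 6020 | 3367 | 1805 | 802 | 268 | 271 | 45 | 139 | 3 | 7 |
Proportion | 0.077 | 0.272 | 0.307 | 0.172 | 0.092 | 0.041 | 0.014 | 0.014 | 0.002 | 0.007 | 0.000 | 0.000 |

Value | 13 | 15 | 16 | 17 | 18 | 20 | 30 | 60 |
---|---|---|---|---|---|---|---|---|
Frequency | 3 | 19 | 3 | 1 | 1 | 2 | 1 | 1 |
Proportion | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |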
7.1.3 Characteristics of injury
Characteristics of injury
2 Variables 20207 Observations
injurytime: Hours Since Injury hours
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
20196 | 11 | 93 | 0.972 | 2.844 | 2.35 | 0.5 | 1.0 | 1.0 | 2.0 | 4.0 | 6.0 | 7.0 |
injurytype: Injury type
n | missing | distinct |
---|---|---|
20207 | 0 | 3 |
Value | blunt | penetrating | blunt and penetrating |
---|---|---|---|
Frequency | 11189 | 6552 | 2466 |
Proportion | 0.554 | 0.324 | 0.122 |
7.3 Continuous variables
This section takes a closer visual look at the continuous predictors.
There is evidence of digit preference, which should be explored further with targeted summaries. More detailed univariate summaries for the variables of interest are provided below.
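One possible targeted summary for digit preference is the distribution of terminal digits. The sketch below uses base R only; the helper `terminal_digit_share()` and the vector `hr_toy` are hypothetical and introduced here for illustration, not taken from the analysis code.

```r
# Share of observations ending in each terminal digit (0-9).
# Without digit preference, each digit would be expected in roughly 10% of values.
terminal_digit_share <- function(x) {
  x <- x[!is.na(x)]                                    # ignore missing readings
  digits <- factor(abs(round(x)) %% 10, levels = 0:9)  # last digit of each value
  prop.table(table(digits))
}

# Hypothetical heart-rate readings, many rounded to the nearest 5 or 10.
hr_toy <- c(70, 80, 80, 90, 95, 100, 100, 104, 105, 110, 110, 112, 120, 120, 130, NA)

shares <- terminal_digit_share(hr_toy)
print(round(shares, 2))

# Proportion of readings ending in 0 or 5; values far above 0.2 suggest rounding.
zero_or_five <- sum(shares[c("0", "5")])
print(zero_or_five)
```

Applied to a measured variable such as heart rate or systolic blood pressure, a large excess of values ending in 0 or 5 would support the visual impression of digit preference.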
7.3.1 Age
## Warning: Removed 2 rows containing missing values (geom_point).
Five patients are under the age of 17, the age threshold in the study's inclusion criteria, including one patient aged 1.
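A range check of this kind can be sketched in base R as follows. The data frame `toy` holds hypothetical ages rather than the trial data, and the threshold `min_age` is set to the value quoted above as an assumption.

```r
# Hypothetical ages; the real data are not reproduced here.
toy <- data.frame(id = 1:8, age = c(1, 14, 16, 17, 25, 43, NA, 64))

min_age <- 17  # threshold quoted in the text; treated here as an assumption

# which() drops NA comparisons, so patients with missing age are not flagged.
below_threshold <- toy[which(toy$age < min_age), ]

print(nrow(below_threshold))  # number of patients below the threshold
print(below_threshold)        # the flagged records, for manual review
```

Listing the flagged records, rather than only counting them, makes it easier to judge whether a value such as an age of 1 is a plausible entry or a likely data error.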
7.4 Section session info
## R version 4.1.3 (2022-03-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17763)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_Austria.1252 LC_CTYPE=English_Austria.1252
## [3] LC_MONETARY=English_Austria.1252 LC_NUMERIC=C
## [5] LC_TIME=English_Austria.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] Hmisc_4.6-0 Formula_1.2-4 survival_3.2-13 lattice_0.20-45
## [5] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.8 purrr_0.3.4
## [9] readr_2.1.2 tidyr_1.2.0 tibble_3.1.6 ggplot2_3.3.5
## [13] tidyverse_1.3.1 here_1.0.1
##
## loaded via a namespace (and not attached):
## [1] fs_1.5.2 lubridate_1.8.0 RColorBrewer_1.1-2
## [4] httr_1.4.2 rprojroot_2.0.2 tools_4.1.3
## [7] backports_1.4.1 bslib_0.3.1 utf8_1.2.2
## [10] R6_2.5.1 rpart_4.1.16 DBI_1.1.2
## [13] colorspace_2.0-3 nnet_7.3-17 withr_2.5.0
## [16] tidyselect_1.1.2 gridExtra_2.3 compiler_4.1.3
## [19] cli_3.2.0 rvest_1.0.2 htmlTable_2.4.0
## [22] xml2_1.3.3 labeling_0.4.2 bookdown_0.25
## [25] sass_0.4.1 scales_1.1.1 checkmate_2.0.0
## [28] digest_0.6.29 foreign_0.8-82 rmarkdown_2.13
## [31] base64enc_0.1-3 jpeg_0.1-9 pkgconfig_2.0.3
## [34] htmltools_0.5.2 highr_0.9 dbplyr_2.1.1
## [37] fastmap_1.1.0 htmlwidgets_1.5.4 rlang_1.0.2
## [40] readxl_1.3.1 rstudioapi_0.13 jquerylib_0.1.4
## [43] generics_0.1.2 farver_2.1.0 jsonlite_1.8.0
## [46] magrittr_2.0.2 patchwork_1.1.1 Matrix_1.4-0
## [49] Rcpp_1.0.8.3 munsell_0.5.0 fansi_1.0.3
## [52] lifecycle_1.0.1 stringi_1.7.6 yaml_2.3.5
## [55] grid_4.1.3 crayon_1.5.1 haven_2.4.3
## [58] splines_4.1.3 hms_1.1.1 knitr_1.38
## [61] pillar_1.7.0 reprex_2.0.1 glue_1.6.2
## [64] evaluate_0.15 latticeExtra_0.6-29 data.table_1.14.2
## [67] modelr_0.1.8 png_0.1-7 vctrs_0.3.8
## [70] tzdb_0.2.0 cellranger_1.1.0 gtable_0.3.0
## [73] assertthat_0.2.1 xfun_0.30 broom_0.7.12
## [76] cluster_2.1.2 ellipsis_0.3.2
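A listing in this format is what R's `sessionInfo()` prints. A minimal sketch of the call follows; its output will differ by machine and by the packages attached at the time.

```r
# Record the R version, platform, locale and package versions used for a report.
si <- sessionInfo()
print(si)
```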