Chapter 7 Univariate distribution checks

This section reports a series of univariate summary checks of the CRASH-2 dataset.

7.1 Data set overview

Using the Hmisc describe function, we provide an overview of the data set. The descriptive report also provides histograms of continuous variables. For ease of scanning the information, we group the report by measurement type.
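A grouped report of this kind could be produced along the following lines. This is a minimal sketch, not the code behind the report in this section: the data frame `toy` and all of its values are invented for illustration, and only the variable names (`age`, `sex`, `sbp`, `hr`) and titles follow the output shown here.

```r
library(Hmisc)

# Hypothetical toy data; variable names mirror the report, values are invented.
toy <- data.frame(
  age = c(34, 51, 27, NA, 62, 45),
  sex = factor(c("male", "female", "male", "male", "female", "male")),
  sbp = c(110, 95, 80, 130, NA, 100),
  hr  = c(90, 120, 105, 80, 110, 95)
)

# Attach labels and units so that describe() prints them next to the names.
label(toy$age) <- "Age"
units(toy$age) <- "years"
label(toy$sbp) <- "Systolic Blood Pressure"
units(toy$sbp) <- "mmHg"

# One describe() call per measurement type keeps the report easy to scan.
demog   <- describe(toy[, c("age", "sex")], descript = "Demographic variables")
physiol <- describe(toy[, c("sbp", "hr")],  descript = "Physiological measurements")

print(demog)
print(physiol)
```

Splitting the call by variable group, rather than describing the whole data frame at once, is what gives each block its own title and variable count.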

7.1.1 Demographic variables

2 Variables   20207 Observations

age: Age years
        n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75      .90      .95
    20203        4       84    0.999    34.56    15.55       18       19       24       30       43       55       64
lowest : 1 14 15 16 17 , highest: 92 94 95 96 99
sex: Sex
        n  missing distinct
    20206        1        2
 Value        male female
 Frequency   16935   3271
 Proportion  0.838  0.162
 

7.1.2 Physiological measurements

5 Variables   20207 Observations

sbp: Systolic Blood Pressure mmHg
        n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75      .90      .95
    19887      320      173    0.989    98.45    27.86       60       70       80       95      110      130      143
lowest : 4 10 12 20 25 , highest: 225 230 234 240 250
hr: Heart Rate /min
        n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75      .90      .95
    20070      137      173    0.996    104.5    23.38       70       80       90      105      120      130      140
lowest : 3 4 5 6 10 , highest: 190 192 198 200 220
rr: Respiratory Rate /min
        n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75      .90      .95
    20016      191       68     0.99    23.06    7.052       14       16       20       22       26       30       35
lowest : 1 2 3 4 5 , highest: 90 91 94 95 96
gcs: Glasgow Coma Score Total points
        n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75      .90      .95
    20184       23       13    0.863    12.47    3.594        4        6       11       15       15       15       15
lowest : 3 4 5 6 7 , highest: 11 12 13 14 15
 Value          3     4     5     6     7     8     9    10    11    12    13    14
 Frequency    784   520   441   584   733   576   504   663   586   951  1356  2140
 Proportion 0.039 0.026 0.022 0.029 0.036 0.029 0.025 0.033 0.029 0.047 0.067 0.106
                 
 Value         15
 Frequency  10346
 Proportion 0.513
 

cc: Central Capillary Refille Time s
        n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75      .90      .95
    19596      611       20    0.945    3.267    1.671        1        2        2        3        4        5        6
lowest : 1 2 3 4 5 , highest: 17 18 20 30 60
 Value          1     2     3     4     5     6     7     8     9    10    11    12
 Frequency   1510  5328  6020  3367  1805   802   268   271    45   139     3     7
 Proportion 0.077 0.272 0.307 0.172 0.092 0.041 0.014 0.014 0.002 0.007 0.000 0.000
                                                           
 Value         13    15    16    17    18    20    30    60
 Frequency      3    19     3     1     1     2     1     1
 Proportion 0.000 0.001 0.000 0.000 0.000 0.000 0.000 0.000
 

7.1.3 Characteristics of injury

2 Variables   20207 Observations

injurytime: Hours Since Injury hours
        n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75      .90      .95
    20196       11       93    0.972    2.844     2.35      0.5      1.0      1.0      2.0      4.0      6.0      7.0
lowest : 0.10 0.15 0.20 0.25 0.30 , highest: 22.00 45.00 48.00 72.00 96.00
injurytype: Injury type
        n  missing distinct
    20207        0        3
 Value                      blunt           penetrating blunt and penetrating
 Frequency                  11189                  6552                  2466
 Proportion                 0.554                 0.324                 0.122
 

7.2 Categorical variables

We now provide a closer visual examination of the categorical predictors.

7.2.1 Categorical ordinal plots

The Glasgow coma score, an ordinal categorical variable, is also displayed separately.
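One way to draw such a display is a bar chart with one bar per score. The sketch below is an assumption about how the plot could be built, not the code used for this report; it takes the GCS frequencies directly from the describe output in Section 7.1.2, and the object names `gcs_counts` and `p` are illustrative.

```r
library(ggplot2)

# GCS frequencies copied from the describe output in Section 7.1.2.
gcs_counts <- data.frame(
  gcs = 3:15,
  n   = c(784, 520, 441, 584, 733, 576, 504, 663, 586, 951, 1356, 2140, 10346)
)

# Treat the score as an ordered factor so that every level gets its own bar.
gcs_counts$gcs  <- factor(gcs_counts$gcs, levels = 3:15, ordered = TRUE)
gcs_counts$prop <- gcs_counts$n / sum(gcs_counts$n)

# Build the bar chart; call print(p) to draw it.
p <- ggplot(gcs_counts, aes(x = gcs, y = n)) +
  geom_col() +
  labs(x = "Glasgow Coma Score [points]", y = "Number of patients")
```

Keeping the score as an ordered factor, rather than a number, avoids binning and makes the concentration at the maximum score of 15 directly visible.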

7.3 Continuous variables

We now take a closer visual look at the continuous predictors.

There is evidence of digit preference, which should be explored further with targeted summaries. More detailed univariate summaries for the variables of interest are provided below.
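A targeted summary for digit preference can be as simple as tabulating the terminal digit of each recorded value. The following is a minimal sketch in base R under stated assumptions: the vector `sbp` holds invented readings, not CRASH-2 data, and `terminal_digit` is a hypothetical helper defined here for illustration.

```r
# Hypothetical readings (mmHg), invented for illustration only.
sbp <- c(120, 110, 95, 100, 143, 90, 80, 130, 118, 100, NA, 70)

# Terminal (last) digit of each integer-valued reading; missing values are dropped.
terminal_digit <- function(x) {
  x <- x[!is.na(x)]
  as.integer(round(x)) %% 10L
}

# Tabulate all ten digits, including those that never occur.
digit_table <- table(factor(terminal_digit(sbp), levels = 0:9))
digit_prop  <- prop.table(digit_table)

print(digit_table)
print(round(digit_prop, 2))
```

Without digit preference, each terminal digit would be expected in roughly 10% of the readings, so a marked excess of values ending in 0 or 5 points to rounding at the time of recording.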

7.3.1 Age

## Warning: Removed 2 rows containing missing values (geom_point).
Figure 7.1: Distribution of subject age [years]

Five patients are under the age of 17, the inclusion criterion for the study, with one patient aged 1.
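A check of this kind can be sketched as follows. The vector `age` contains invented values, not CRASH-2 data, and `min_age` is an illustrative name for the age threshold stated above.

```r
# Hypothetical ages in years, invented for illustration; NA marks a missing value.
age <- c(1, 14, 16, 17, 25, 43, NA, 64)

# Age threshold as stated in the text above.
min_age <- 17

# Flag non-missing ages below the threshold, then count and list them.
below <- !is.na(age) & age < min_age
n_below <- sum(below)
ages_below <- sort(age[below])

print(n_below)
print(ages_below)
```

Handling missing values explicitly keeps them from being counted as violations or silently propagating `NA` into the count.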

7.3.2 Blood pressure

## Warning: Removed 1 rows containing missing values (geom_point).
Figure 7.2: Distribution of SBP

7.3.3 Respiratory rate

Figure 7.3: Distribution of respiratory rate

7.3.4 Heart rate

Figure 7.4: Distribution of heart rate

7.3.5 Central capillary refill time

## Warning: Removed 728 rows containing missing values (geom_point).
Figure 7.5: Distribution of Central capillary refill time

7.3.6 Hours since injury

## Warning: Removed 24 rows containing missing values (geom_point).
Figure 7.6: Distribution of hours since injury

7.4 Section session info

## R version 4.1.3 (2022-03-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17763)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_Austria.1252  LC_CTYPE=English_Austria.1252   
## [3] LC_MONETARY=English_Austria.1252 LC_NUMERIC=C                    
## [5] LC_TIME=English_Austria.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] Hmisc_4.6-0     Formula_1.2-4   survival_3.2-13 lattice_0.20-45
##  [5] forcats_0.5.1   stringr_1.4.0   dplyr_1.0.8     purrr_0.3.4    
##  [9] readr_2.1.2     tidyr_1.2.0     tibble_3.1.6    ggplot2_3.3.5  
## [13] tidyverse_1.3.1 here_1.0.1     
## 
## loaded via a namespace (and not attached):
##  [1] fs_1.5.2            lubridate_1.8.0     RColorBrewer_1.1-2 
##  [4] httr_1.4.2          rprojroot_2.0.2     tools_4.1.3        
##  [7] backports_1.4.1     bslib_0.3.1         utf8_1.2.2         
## [10] R6_2.5.1            rpart_4.1.16        DBI_1.1.2          
## [13] colorspace_2.0-3    nnet_7.3-17         withr_2.5.0        
## [16] tidyselect_1.1.2    gridExtra_2.3       compiler_4.1.3     
## [19] cli_3.2.0           rvest_1.0.2         htmlTable_2.4.0    
## [22] xml2_1.3.3          labeling_0.4.2      bookdown_0.25      
## [25] sass_0.4.1          scales_1.1.1        checkmate_2.0.0    
## [28] digest_0.6.29       foreign_0.8-82      rmarkdown_2.13     
## [31] base64enc_0.1-3     jpeg_0.1-9          pkgconfig_2.0.3    
## [34] htmltools_0.5.2     highr_0.9           dbplyr_2.1.1       
## [37] fastmap_1.1.0       htmlwidgets_1.5.4   rlang_1.0.2        
## [40] readxl_1.3.1        rstudioapi_0.13     jquerylib_0.1.4    
## [43] generics_0.1.2      farver_2.1.0        jsonlite_1.8.0     
## [46] magrittr_2.0.2      patchwork_1.1.1     Matrix_1.4-0       
## [49] Rcpp_1.0.8.3        munsell_0.5.0       fansi_1.0.3        
## [52] lifecycle_1.0.1     stringi_1.7.6       yaml_2.3.5         
## [55] grid_4.1.3          crayon_1.5.1        haven_2.4.3        
## [58] splines_4.1.3       hms_1.1.1           knitr_1.38         
## [61] pillar_1.7.0        reprex_2.0.1        glue_1.6.2         
## [64] evaluate_0.15       latticeExtra_0.6-29 data.table_1.14.2  
## [67] modelr_0.1.8        png_0.1-7           vctrs_0.3.8        
## [70] tzdb_0.2.0          cellranger_1.1.0    gtable_0.3.0       
## [73] assertthat_0.2.1    xfun_0.30           broom_0.7.12       
## [76] cluster_2.1.2       ellipsis_0.3.2