Chapter 4 Introduction to CRASH-2

A key principle of IDA is not to touch the research questions, so the research aim and statistical analysis plan need to be in place before IDA commences. IDA may lead to an update or refinement of the analysis plan. To demonstrate the workflow and content of IDA, we created a hypothetical research aim and a corresponding statistical analysis plan, which is described in more detail in the section Crash2_SAP.Rmd.

The hypothetical research aim for IDA is to develop a multivariable model for early death (death within 28 days from injury) using nine independent variables of mixed type (continuous, categorical, semicontinuous). The primary aim is prediction; the secondary aim is to describe the association of each variable with the outcome.

A prediction model was developed and validated on this data set in “Predicting early death in patients with traumatic bleeding”, Perel et al., BMJ 2012, [supplement available at]. The assumed research aim is in line with that prediction model.

4.1 CRASH-2 Description

Clinical Randomisation of an Antifibrinolytic in Significant Haemorrhage (CRASH-2) was a large randomised placebo-controlled trial of the effects of antifibrinolytic treatment on death and transfusion requirement among trauma patients with, or at risk of, significant haemorrhage. The study is described at the original trial website. A public version of the data set is available from a repository of public data sets hosted by Vanderbilt University’s Department of Biostatistics (Prof. Frank Harrell Jr.).

The data set includes 20,207 patients and 44 variables.

Note: In contrast to the analysis described in Perel et al., the public version of the data set does not contain the variables describing the economic region and the treatment allocation. In addition, the data set contains 20,207 patients, whereas the research paper reports 20,127 patients as included in the study.

4.2 Crash2 dataset contents

4.2.1 Source dataset

We refer to the data set available online here as the source data set.

We display the contents of the source data set, which is stored in the data-raw folder of the project directory.
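The overview that follows has the shape of the output of `contents()` from the Hmisc package, which is listed in the session info. As a minimal sketch of how such an overview can be built, the example below uses only base R and a small made-up data frame in place of the source data. The helper `describe_columns()`, the toy values, and the temporary file are assumptions made for illustration and are not part of the project code.

```r
# Toy stand-in for the source data set (values are made up).
toy <- data.frame(
  sbp = c(120L, NA, 95L),
  sex = factor(c("male", "female", "male"))
)
attr(toy$sbp, "label") <- "Systolic Blood Pressure"
attr(toy$sbp, "units") <- "mmHg"

# Store and re-read the data, mimicking a file kept in a raw-data folder.
# A temporary file is used here instead of the real project path.
path <- tempfile(fileext = ".rds")
saveRDS(toy, path)
crash2_toy <- readRDS(path)

# Hypothetical helper: one row per variable with label, units,
# number of factor levels, storage mode and number of missing values.
describe_columns <- function(df) {
  get_attr <- function(x, which) {
    value <- attr(x, which, exact = TRUE)
    if (is.null(value)) "" else as.character(value)[1]
  }
  data.frame(
    Name    = names(df),
    Labels  = vapply(df, get_attr, character(1), which = "label"),
    Units   = vapply(df, get_attr, character(1), which = "units"),
    Levels  = vapply(df, nlevels, integer(1)),
    Storage = vapply(df, storage.mode, character(1)),
    NAs     = vapply(df, function(x) sum(is.na(x)), integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

overview <- describe_columns(crash2_toy)
print(overview)
```

With Hmisc attached, calling `contents()` on the data frame gives a comparable overview without a custom helper.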


Data frame: crash2

20207 observations and 44 variables, maximum # NAs: 17121

| Name | Labels | Units | Levels | Class | Storage | NAs |
|---|---|---|---|---|---|---|
| entryid | Unique Numbers for Entry Forms | | | integer | integer | 0 |
| source | Method of Transmission of Entry Form to CC | | 5 | | integer | 0 |
| trandomised | Date of Randomization | | | Date | double | 0 |
| outcomeid | Unique Number From Outcome Database | | | integer | integer | 80 |
| sex | | | 2 | | integer | 1 |
| age | | | | | integer | 4 |
| injurytime | Hours Since Injury | | | numeric | double | 11 |
| injurytype | | | 3 | | integer | 0 |
| sbp | Systolic Blood Pressure | mmHg | | integer | integer | 320 |
| rr | Respiratory Rate | /min | | integer | integer | 191 |
| cc | Central Capillary Refille Time | s | | integer | integer | 611 |
| hr | Heart Rate | /min | | integer | integer | 137 |
| gcseye | Glasgow Coma Score Eye Opening | | | integer | integer | 732 |
| gcsmotor | Glasgow Coma Score Motor Response | | | integer | integer | 732 |
| gcsverbal | Glasgow Coma Score Verbal Response | | | integer | integer | 735 |
| gcs | Glasgow Coma Score Total | | | integer | integer | 23 |
| ddeath | Date of Death | | | Date | double | 17121 |
| cause | Main Cause of Death | | 7 | | integer | 17118 |
| scauseother | Description of Other Cause of Death | | 227 | | integer | 0 |
| status | Status of Patient at Outcome if Alive | | 3 | | integer | 3169 |
| ddischarge | Date of discharge, transfer to other hospital or day 28 from randomization | | | Date | double | 3185 |
| condition | Condition of Patient at Outcome if Alive | | 5 | | integer | 3251 |
| ndaysicu | Number of Days Spent in ICU | | | numeric | double | 182 |
| bheadinj | Significant Head Injury | | | integer | integer | 80 |
| bneuro | Neurosurgery Done | | | integer | integer | 80 |
| bchest | Chest Surgery Done | | | integer | integer | 80 |
| babdomen | Abdominal Surgery Done | | | integer | integer | 80 |
| bpelvis | Pelvis Surgery Done | | | integer | integer | 80 |
| bpe | Pulmonary Embolism | | | integer | integer | 80 |
| bdvt | Deep Vein Thrombosis | | | integer | integer | 80 |
| bstroke | Stroke | | | integer | integer | 80 |
| bbleed | Surgery for Bleeding | | | integer | integer | 80 |
| bmi | Myocardial Infarction | | | integer | integer | 80 |
| bgi | Gastrointestinal Bleeding | | | integer | integer | 80 |
| bloading | Complete Loading Dose of Trial Drug Given | | | integer | integer | 80 |
| bmaint | Complete Maintenance Dose of Trial Drug Given | | | integer | integer | 80 |
| btransf | Blood Products Transfusion | | | integer | integer | 80 |
| ncell | Number of Units of Red Call Products Transfused | | | numeric | double | 9963 |
| nplasma | Number of Units of Fresh Frozen Plasma Transfused | | | integer | integer | 9964 |
| nplatelets | Number of Units of Platelets Transfused | | | integer | integer | 9964 |
| ncryo | Number of Units of Cryoprecipitate Transfused | | | integer | integer | 9964 |
| bvii | Recombinant Factor VIIa Given | | | integer | integer | 374 |
| boxid | Treatment Box Number | | | integer | integer | 0 |
| packnum | Treatment Pack Number | | | integer | integer | 0 |

| Variable | Levels |
|---|---|
| source | telephone |
| | telephone entered manually |
| | electronic CRF by email |
| | paper CRF enteredd in electronic CRF |
| | electronic CRF |
| sex | male |
| | female |
| injurytype | blunt |
| | penetrating |
| | blunt and penetrating |
| cause | bleeding |
| | head injury |
| | myocardial infarction |
| | stroke |
| | pulmonary embolism |
| | multi organ failure |
| | other |
| scauseother | |
| | Acute Hypoxia |
| | ACUTE LUNG INJURY |
| | Acute Pulmonary Oedema |
| | Acute Renal Failure |
| | ACUTE RESPIRATORY DISTRESS SYNDROME (ARDS) |
| | acute respiratory failure |
| | acute respiratory failure+sepsis |
| | air amboli (embolism) |
| | Air embolism caused by penetrating lung trauma |
| | ... |
| status | discharged |
| | still in hospital |
| | transferred to other hospital |
| condition | no symptoms |
| | minor symptoms |
| | some restriction in lifestyle but independent |
| | dependent, but not requiring constant attention |
| | fully dependent, requiring attention day and night |

4.2.2 Updated analysis dataset

We add meta-data to the original source data set and write the modified data set back to the data folder. Meta-data is added for the following variables:

  • age - add label “Age” and unit “years”.
  • injury time - add unit “hours”.
  • total Glasgow coma score - add unit “points”.
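The three additions listed above can be sketched as follows. This is a minimal base R illustration on a made-up data frame; the helper `set_meta()` is hypothetical and simply stores the label and unit as column attributes named `label` and `units`. The "Input object size / New object size" messages printed later in this section have the form of Hmisc's `upData()` output, so the actual project code may well use that function instead.

```r
# Toy stand-in for three columns of the source data (values are made up).
toy <- data.frame(
  age        = c(34L, 51L, NA),
  injurytime = c(1.0, 2.5, 0.5),
  gcs        = c(15L, 9L, 3L)
)

# Hypothetical helper: attach a label and/or a unit to one column.
set_meta <- function(df, var, label = NULL, units = NULL) {
  stopifnot(var %in% names(df))
  if (!is.null(label)) attr(df[[var]], "label") <- label
  if (!is.null(units)) attr(df[[var]], "units") <- units
  df
}

toy <- set_meta(toy, "age", label = "Age", units = "years")
toy <- set_meta(toy, "injurytime", units = "hours")
toy <- set_meta(toy, "gcs", units = "points")

# Inspect the attributes attached to each column.
str(lapply(toy, attributes))
```

Storing the meta-data as attributes keeps the values themselves unchanged, so the step can be checked by comparing the data before and after.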

At this stage we select the variables of interest to take into the IDA phase by dropping the variables we do not check in IDA.

As a cross-check, we display the contents again to confirm that the additional meta-data has been added, and then write the changes back to the data folder in the file “data/a_crash2.rds”.
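The select, cross-check and write-back steps can be sketched as below. This is an illustration under stated assumptions: the data frame is made up, the `keep` vector is a hypothetical selection, and the file is written to a temporary directory rather than to the project's data folder.

```r
# Toy stand-in for the data set after meta-data has been added.
toy <- data.frame(
  entryid = 1:3,
  age     = c(34L, 51L, NA),
  sbp     = c(120L, 90L, 110L),
  boxid   = c(2001L, 2002L, 2003L),
  packnum = c(11L, 12L, 13L)
)
attr(toy$age, "units") <- "years"

# Hypothetical selection: keep only the variables taken into IDA.
keep <- c("entryid", "age", "sbp")
stopifnot(all(keep %in% names(toy)))
a_toy <- toy[keep]

# Cross-check: the meta-data must still be attached after selection.
stopifnot(identical(attr(a_toy$age, "units"), "years"))

# Write the analysis data set and read it back to confirm the round trip.
out_file <- file.path(tempdir(), "a_crash2_toy.rds")
saveRDS(a_toy, out_file)
reloaded <- readRDS(out_file)
stopifnot(isTRUE(all.equal(reloaded, a_toy)))
```

Saving in RDS format keeps column attributes such as labels and units, which a plain CSV export would lose.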

Input object size: 1221480 bytes; 12 variables 20207 observations
New object size: 1223272 bytes; 12 variables 20207 observations
Input object size: 1546808 bytes; 14 variables 20207 observations
New object size: 1385720 bytes; 14 variables 20207 observations


Data frame: a_crash2

20207 observations and 14 variables, maximum # NAs: 17121

| Name | Labels | Units | Levels | Class | Storage | NAs |
|---|---|---|---|---|---|---|
| entryid | Unique Numbers for Entry Forms | | | integer | integer | 0 |
| trandomised | Date of Randomization | | | Date | double | 0 |
| ddeath | Date of Death | | | Date | double | 17121 |
| age | Age | years | | integer | integer | 4 |
| sex | Sex | | 2 | | integer | 1 |
| sbp | Systolic Blood Pressure | mmHg | | integer | integer | 320 |
| hr | Heart Rate | /min | | integer | integer | 137 |
| rr | Respiratory Rate | /min | | integer | integer | 191 |
| gcs | Glasgow Coma Score Total | points | | integer | integer | 23 |
| cc | Central Capillary Refille Time | s | | integer | integer | 611 |
| injurytime | Hours Since Injury | hours | | numeric | double | 11 |
| injurytype | Injury type | | 3 | | integer | 0 |
| time2death | | | | | integer | 17121 |
| earlydeath | Death within 28 days from injury | | | integer | integer | 0 |

| Variable | Levels |
|---|---|
| sex | male |
| | female |
| injurytype | blunt |
| | penetrating |
| | blunt and penetrating |

4.3 Section session info

## R version 4.1.3 (2022-03-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17763)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_Austria.1252  LC_CTYPE=English_Austria.1252   
## [3] LC_MONETARY=English_Austria.1252 LC_NUMERIC=C                    
## [5] LC_TIME=English_Austria.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] Hmisc_4.6-0     Formula_1.2-4   survival_3.2-13 lattice_0.20-45
##  [5] forcats_0.5.1   stringr_1.4.0   dplyr_1.0.8     purrr_0.3.4    
##  [9] readr_2.1.2     tidyr_1.2.0     tibble_3.1.6    ggplot2_3.3.5  
## [13] tidyverse_1.3.1 here_1.0.1     
## 
## loaded via a namespace (and not attached):
##  [1] httr_1.4.2          sass_0.4.1          jsonlite_1.8.0     
##  [4] splines_4.1.3       modelr_0.1.8        bslib_0.3.1        
##  [7] assertthat_0.2.1    latticeExtra_0.6-29 cellranger_1.1.0   
## [10] yaml_2.3.5          pillar_1.7.0        backports_1.4.1    
## [13] glue_1.6.2          digest_0.6.29       checkmate_2.0.0    
## [16] RColorBrewer_1.1-2  rvest_1.0.2         colorspace_2.0-3   
## [19] htmltools_0.5.2     Matrix_1.4-0        pkgconfig_2.0.3    
## [22] broom_0.7.12        haven_2.4.3         bookdown_0.25      
## [25] scales_1.1.1        jpeg_0.1-9          tzdb_0.2.0         
## [28] htmlTable_2.4.0     generics_0.1.2      ellipsis_0.3.2     
## [31] withr_2.5.0         nnet_7.3-17         cli_3.2.0          
## [34] magrittr_2.0.2      crayon_1.5.1        readxl_1.3.1       
## [37] evaluate_0.15       fs_1.5.2            fansi_1.0.3        
## [40] xml2_1.3.3          foreign_0.8-82      data.table_1.14.2  
## [43] tools_4.1.3         hms_1.1.1           lifecycle_1.0.1    
## [46] munsell_0.5.0       reprex_2.0.1        cluster_2.1.2      
## [49] compiler_4.1.3      jquerylib_0.1.4     rlang_1.0.2        
## [52] grid_4.1.3          rstudioapi_0.13     htmlwidgets_1.5.4  
## [55] base64enc_0.1-3     rmarkdown_2.13      gtable_0.3.0       
## [58] DBI_1.1.2           R6_2.5.1            gridExtra_2.3      
## [61] lubridate_1.8.0     knitr_1.38          fastmap_1.1.0      
## [64] utf8_1.2.2          rprojroot_2.0.2     stringi_1.7.6      
## [67] Rcpp_1.0.8.3        vctrs_0.3.8         rpart_4.1.16       
## [70] png_0.1-7           dbplyr_2.1.1        tidyselect_1.1.2   
## [73] xfun_0.30