1  Bacteremia study

1.1 Bacteremia study overview

We illustrate our proposed systematic approach to data screening with a diagnostic study whose primary aim is to use age, sex and 49 laboratory variables to fit a diagnostic prediction model for the bacteremia status (= presence of bacteria in the bloodstream) of a blood sample. A secondary aim of the study is to describe the functional form of each predictor in the model. Between January 2006 and December 2010, patients with a clinical suspicion of bacteremia were included if blood culture analysis was requested by the responsible physician and blood was sampled for assessment of hematology and biochemistry. An analysis of this study can be found in Ratzinger et al. (2014).

The data consist of 14,691 observations from different patients and 51 potential predictors. To protect data privacy, our version of the data was slightly modified compared to the original, and this modified version was cleared by the Medical University of Vienna for public use (DC 2019-0054). Our results may therefore differ to a negligible degree from the official results given in Ratzinger et al. (2014).
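As a first sanity check, the dimensions stated above can be compared against whatever file has been loaded. The following is a minimal sketch in Python using only the standard library. The function names and the column names `id` and `outcome` in the usage comment are hypothetical placeholders, not names taken from the data dictionary; only the two expected counts come from the text.

```python
import csv
import io

# Counts stated in the text: 14,691 observations and
# 51 potential predictors (age, sex and 49 laboratory variables).
EXPECTED_OBSERVATIONS = 14_691
EXPECTED_PREDICTORS = 51


def count_dimensions(csv_text, non_predictor_columns):
    """Return (n_rows, n_predictors) for CSV content given as a string.

    `non_predictor_columns` lists columns that are not predictors,
    e.g. an identifier or the outcome (names are assumptions here).
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Count non-empty data rows after the header.
    n_rows = sum(1 for row in reader if row)
    n_predictors = sum(1 for col in header if col not in non_predictor_columns)
    return n_rows, n_predictors


def matches_description(n_rows, n_predictors):
    """True if the counts agree with the dimensions stated in the text."""
    return (n_rows == EXPECTED_OBSERVATIONS
            and n_predictors == EXPECTED_PREDICTORS)


# Hypothetical usage on the full file content:
# n_rows, n_pred = count_dimensions(text, {"id", "outcome"})
# matches_description(n_rows, n_pred)
```

A mismatch here does not necessarily indicate an error in the data; it may simply mean that the set of non-predictor columns was specified differently from what the data dictionary defines.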

1.2 Where to access the data?

By source data we mean the raw data set available in this repository (DC 2019-0054). The data set is published on Zenodo with the following DOI: https://doi.org/10.5281/zenodo.7554815.

For simplicity, we have also stored the source data and accompanying materials, such as the data dictionary, in the data-raw folder. The data dictionary provides an overview of the collected source data; see Appendix C.1 for further details. In the appendix, we also display a short snapshot of the source data set from the data-raw folder of the project directory. The snapshot provides a glimpse of the data and the data dictionary for more context. For an interactive overview of the source data, however, we refer to the Zenodo page.
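To make the folder layout concrete, the sketch below reads a CSV file from the data-raw folder of a project directory. It is an illustration only: the function name `load_source_data` is hypothetical, the file name must be supplied by the reader because it is not given in this section, and we assume the file is a comma-separated text file with a header row.

```python
import csv
from pathlib import Path


def load_source_data(project_dir, file_name):
    """Read <project_dir>/data-raw/<file_name> into a list of dicts.

    Assumes a comma-separated file with a header row; every value is
    returned as a string, so numeric columns still need conversion.
    """
    path = Path(project_dir) / "data-raw" / file_name
    if not path.is_file():
        raise FileNotFoundError(f"Expected source file at {path}")
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Hypothetical usage; replace the placeholder with the actual file name:
# rows = load_source_data(".", "<source-file>.csv")
# print(len(rows), "observations;", len(rows[0]), "columns")
```

Reading everything as text first is a deliberate choice for a screening workflow: it keeps the raw values untouched so that coding errors or unexpected entries remain visible before any type conversion.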