Introduction

Therapeutic intervention scoring systems have become increasingly important as tools for estimating workload in the ICU [1] and for comparing ICU performance in quality assurance projects [2]. Because the 28-item Therapeutic Intervention Scoring System (TISS-28) [3] was considered too long and time consuming, the Nine Equivalents of Nursing Manpower use Score (NEMS) was derived from it [4]. Besides measuring nursing workload in the ICU comparably to the TISS-28 in large patient samples, the NEMS has been shown to correlate with severity of illness as measured by the SAPS II score. The NEMS is currently used in multi-center ICU studies, for management purposes, and for comparing workload at the ICU level.

The objective of this prospective study was to establish a fully automatic calculation of the NEMS from the database of a patient data management system (PDMS) and to compare the results with those of the conventional manual method.

Material and methods

This prospective study was performed in a 14-bed surgical ICU of a university hospital. Patients under the age of 16 years, patients undergoing cardiovascular surgery, and patients with burn injuries were excluded from evaluation. The NEMS [4] (Table 1) of every study patient was calculated manually by a physician (an experienced consultant in anesthesia and intensive care) on all working days at 8 a.m., using data from the preceding 24 h. Beforehand, the physician had completed a standardized training program in NEMS scoring. The NEMS was calculated on a paper chart containing the NEMS instructions. For conventional scoring, all information available on the ICU, including the digital patient chart of the PDMS, was accessible to the data collector. The physician had also been trained in advance to locate the data required for NEMS calculation in the electronic patient chart.

Table 1 Items of the Nine Equivalents of Nursing Manpower use Score (NEMS) [4]

The time needed for NEMS scoring was assessed with a digital timer, started when the investigator entered the patient's room in the ICU and stopped when he left it. The recorded time therefore covered the complete scoring process, including review of the digital patient chart and communication with nurses, physicians, and the patient.

The design of the database (Oracle 7, Oracle) of the PDMS ICUData (IMESO, Hüttenberg, Germany) ranges from a strictly relational design in the administrative sections to an entity-attribute-value (EAV) model for storing medical data items. First, a list with more detailed and exact definitions of every item was generated, thereby expanding the original NEMS definitions; this list was also part of the NEMS instructions for manual scoring. The scripts for automatic extraction of the relevant data from the database and for computerized score calculation were written in Structured Query Language (SQL). The computerized NEMS scores were obtained by mapping the original nine items to the corresponding charted items in the database according to the previously generated list of detailed definitions. In this way all nine items were detected in the database according to the definitions of the original work: basic monitoring (online monitoring via RS232 interface), intravenous medication, mechanical ventilatory support (online monitoring via RS232 interface), supplementary ventilatory care, vasoactive medication, dialysis techniques, and routine and specific interventions in and outside the ICU (surgical procedures were documented with an anesthesia information management system and subsequently imported into the PDMS database).
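As a minimal sketch of this mapping, the following query illustrates the principle; the table and column names (nems_item_map, eav_observations) and bind variables are hypothetical placeholders and do not reflect the actual ICUData schema. A mapping table links each of the nine NEMS items to the database attributes fixed for it in the detailed definition list, and one query per patient and 24-h scoring window determines which items are awarded.

```sql
-- Illustrative sketch only: all table and column names are hypothetical,
-- not the actual ICUData/Oracle schema.
-- nems_item_map(nems_item, db_attribute, points): one row per charted
--   attribute that counts towards a NEMS item.
-- eav_observations(patient_id, obs_time, attribute, value): EAV store of
--   the charted medical data items.
SELECT m.nems_item,
       MAX(m.points) AS awarded_points     -- item points if any matching observation exists
FROM   nems_item_map   m,
       eav_observations o
WHERE  o.attribute  = m.db_attribute
AND    o.patient_id = :patient_id
AND    o.obs_time  >= :score_time - 1      -- preceding 24 h (Oracle date arithmetic: 1 = one day)
AND    o.obs_time  <  :score_time
GROUP  BY m.nems_item;
```

Summing awarded_points over the returned items then yields the automatic NEMS of that patient for the given day.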

The results of the two methods (Consultant and PDMS) were compared using the Bland-Altman method [5] and the intraclass correlation coefficient (ICC) [6]. Furthermore, inter-rater variability for the nine categorical variables of the NEMS was assessed by computing the percentage of agreement (and disagreement) and the kappa index of concordance, κ. Statistical analysis was performed with the SPSS program (SPSS Software, Munich, Germany).
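For clarity, the statistical quantities are briefly recalled here; these are the standard definitions from the cited methods [5, 6], not formulas specific to this study. For each case i, the difference between the two methods, the bias, and the limits of agreement are

```latex
d_i = \mathrm{NEMS}^{\mathrm{PDMS}}_i - \mathrm{NEMS}^{\mathrm{Consultant}}_i,\qquad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i \ (\text{bias}),\qquad
\bar{d} \pm 2\,s_d \ (\text{limits of agreement}),
```

and, for each of the nine categorical items, the kappa index of concordance is

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
```

where p_o denotes the observed and p_e the chance-expected proportion of agreement.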

Results

On 20 consecutive working days between 24 July 2002 and 22 August 2002 the NEMS was calculated independently by the two methods in 204 cases. No missing data were found for the human rater. The Bland-Altman analysis did not show significant differences in NEMS scoring between the two methods. The bias between the two scoring methods as determined by the Bland-Altman approach is given in Fig. 1. The mean difference was −0.03 (SD 3.8; 95% limits of agreement [−7.6; 7.6]; range of differences [−14; 12]), with P=0.90, indicating that the mean difference did not differ significantly from zero. Regression analysis of the differences between the scoring methods yielded r=−0.21 (r²=0.046, P=0.002), a weak but statistically significant linear relationship.
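The reported interval follows from the standard Bland-Altman limits (bias ± 2 SD) applied to the stated values:

```latex
\bar{d} \pm 2\,s_d = -0.03 \pm 2 \times 3.8 = -0.03 \pm 7.6 \approx [-7.6;\ 7.6].
```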

Fig. 1

Bland-Altman plot of the difference between the NEMS results obtained by computerized scoring (PDMS) and those obtained by conventional manual scoring performed by a consultant. The mean difference is represented by the solid line; the 95% confidence limits (CL; mean ± 2 SD) are represented by the dashed lines. A P-value of 0.90 indicates that the mean difference does not differ significantly from zero. Each pair of NEMS results is represented by a dot; the size of a dot reflects the number of identical pairs (same PDMS score and same manual score), so not all 204 cases are individually visible in the figure

Overall inter-rater agreement in NEMS scores was high, with an ICC (95% confidence interval) of 0.87 (0.84–0.90). The absolute numbers of combinations, inter-rater agreement rates, and results of the κ-statistic for all NEMS items as evaluated by the two scoring methods are presented in Table 2. Inter-rater agreement was good (κ>0.55) for all items apart from item 4 (“supplementary ventilatory care”, Table 2).

Table 2 Absolute numbers of combinations, inter-rater agreement rates, and reliability coefficients for the NEMS (Nine Equivalents of Nursing Manpower use Score) according to the two scoring methods. In total, 204 NEMS scores were calculated, each both manually by a consultant and automatically with the patient data management system (PDMS)

The median [range] time needed for manual scoring was 56 [47; 70] s per patient. Over the course of the study, the average scoring time per patient decreased from 85 [64; 117] s to 51 [44; 59] s.

Discussion

Measuring health-care costs or resource use is a complicated task; both resource use and clinical outcome are difficult and expensive to quantify. This requires measurement tools that are practical, uniform, reproducible, and sufficiently detailed to allow comparison among institutions, among selected groups of patients, and among individual patients [7]. The use of a PDMS can support and ease these tasks. The study by Clermont et al. [7] represents the first attempt to generate computerized TISS scores from existing hospital billing data sets. More recent PDMSs offer various kinds of “automatic score calculation” [8]. In all of these systems, however, part of the data, or even all values, must still be entered manually through a dedicated interface in addition to the routine documentation. The aim of our study was to generate a completely automatic NEMS calculation from the PDMS database containing routine data, without the need to enter any additional values. The study builds on previously published experience with computerized scoring [9].

The availability of an accurate algorithm that generates NEMS scores for all ICU patients would provide a useful tool for assessing the quality and cost-effectiveness of care. This study demonstrates the successful development of such a tool. Although manual scoring required an acceptable time per patient and day (approximately 1 min), it constitutes an additional workload that may easily add up to 30 min per day.

In our study the detection of the nine items was based on exactly defined algorithms and, owing to its technical nature, was not subject to inter-observer or intra-observer variability [10, 11]. Existing scoring systems require well-trained personnel who are familiar with the patient’s chart when scores are calculated manually. An increased rate of mistakes in the manual recording of physiological values was demonstrated by Hammond et al. [12], who found that almost 20% of entries were erroneous.

Some study limitations have to be addressed. One problem may be data quality: the extent to which our database contains inaccuracies due to corrupted or incomplete information is unknown. Possible problems with our system concern the patient ventilation data (items “mechanical ventilatory support” and “supplementary ventilatory care”). For example, a patient may be recorded as being ventilated when in fact he or she is not, if data are recorded during the bedside ventilator test performed by the nurses. In our study, however, simple SQL scripts rejecting ventilatory data without concomitant hemodynamic data helped to avoid contamination of the scoring (see the sketch below). Another possible problem is missing data caused by non-documentation by the medical staff or by technical failures of the online connection. This accounts for the differences between the PDMS and the physician’s rating with regard to supplementary ventilatory care. Oxygen via face mask or nasal probe is a basic measure in critical care that is applied to the majority of patients. As it is not captured by the PDMS automatically and has to be entered manually, a lack of documentation may easily occur. However, as “supplementary ventilatory care” is scored with only 3 points in the NEMS, an inaccurate value of this item does not lead to substantial errors in the overall NEMS rating.
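A minimal sketch of such a plausibility filter is given below; as before, table and column names are hypothetical and merely illustrate the approach of accepting ventilator records only when concomitant hemodynamic data exist for the same patient.

```sql
-- Hypothetical sketch of the plausibility filter described above
-- (table and column names are illustrative, not the actual ICUData schema):
-- a ventilator record is kept only if a hemodynamic observation (here a
-- heart rate) was charted for the same patient within +/- 5 minutes, so
-- data recorded during a bedside ventilator test are ignored.
SELECT v.patient_id,
       v.obs_time
FROM   eav_observations v
WHERE  v.attribute   = 'VENT_MODE'
AND    v.patient_id  = :patient_id
AND    v.obs_time   >= :score_time - 1     -- preceding 24 h
AND    v.obs_time   <  :score_time
AND    EXISTS (SELECT 1
               FROM   eav_observations h
               WHERE  h.patient_id = v.patient_id
               AND    h.attribute  = 'HEART_RATE'
               AND    h.obs_time BETWEEN v.obs_time - 5/1440    -- 5 min, in days
                                     AND v.obs_time + 5/1440);
```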

Conclusion

In summary, this study demonstrates that a fully automatic NEMS calculation can be constructed exclusively from data collected with a PDMS, yielding results comparable to those of the conventional manual approach. The advantage of the described method is that no additional manual data recording is required for score calculation, and it can be performed without additional manpower or time resources.