
Table 4 Model results for each factor while varying all other model characteristics

From: Aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews

 

| Factor | Sensitivity, mean (SD) | Specificity, mean (SD) | Precision, mean (SD) | Accuracy, mean (SD) | Correct reason for exclusionᵃ, mean (SD) |
|---|---|---|---|---|---|
| Citation decisions | | | | | |
| Abstract decisions | 89.76% (10.74%) | 70.50% (15.48%) | 11.20% (4.53%) | 71.43% (14.65%) | 84.77% (6.12%) |
| Full-text decisions | 76.49% (15.74%) | 83.07% (13.02%) | 20.11% (9.49%) | 83.04% (12.27%) | 85.63% (8.97%) |
| Modified full-text | 81.37% (13.18%) | 84.86% (12.07%) | 24.23% (11.05%) | 84.90% (11.42%) | 87.52% (9.45%) |
| Classification algorithms | | | | | |
| CART | 83.63% (16.56%) | 70.35% (15.20%) | 14.92% (7.91%) | 71.00% (14.04%) | 78.64% (7.72%) |
| NB | 79.17% (11.41%) | 88.45% (6.85%) | 22.59% (10.14%) | 88.30% (6.42%) | 91.38% (4.41%) |
| SVM | 87.20% (15.26%) | 74.05% (16.39%) | 15.43% (10.62%) | 74.78% (15.67%) | 84.76% (7.54%) |
| Feature generation | | | | | |
| Frequency = 5 | 85.67% (15.31%) | 81.94% (13.09%) | 20.07% (11.47%) | 82.21% (12.23%) | 87.40% (7.25%) |
| Frequency = 10 | 85.54% (14.29%) | 82.13% (13.27%) | 20.70% (12.26%) | 82.39% (12.46%) | 87.23% (7.29%) |
| Frequency = 100 | 83.52% (11.73%) | 80.49% (13.38%) | 18.19% (10.08%) | 80.83% (12.59%) | 86.82% (8.25%) |
| Frequency = 500 | 78.81% (19.22%) | 67.38% (17.50%) | 10.93% (6.62%) | 68.20% (16.68%) | 80.31% (6.91%) |
| Importance = 50 | 82.19% (9.54%) | 85.44% (8.91%) | 19.08% (7.17%) | 85.51% (8.29%) | 89.49% (6.29%) |
| Importance = 100 | 83.59% (9.19%) | 85.10% (10.75%) | 20.49% (8.27%) | 85.22% (10.06%) | 89.46% (7.47%) |
| Importance = 500 | 82.60% (10.75%) | 84.73% (13.78%) | 23.82% (11.18%) | 84.84% (12.94%) | 89.95% (7.45%) |
| Performance metrics | | | | | |
| ROC | 83.90% (13.67%) | 77.09% (15.32%) | 17.09% (8.81%) | 77.52% (14.44%) | 84.02% (8.73%) |
| Sensitivity | 80.07% (15.43%) | 83.81% (13.34%) | 21.10% (12.27%) | 83.90% (12.59%) | 89.53% (6.23%) |
| Downsampling | | | | | |
| Without downsampling | 75.46% (15.87%) | 85.55% (11.24%) | 23.27% (10.93%) | 85.45% (10.43%) | 88.93% (5.90%) |
| With downsampling | 89.62% (7.97%) | 73.40% (15.79%) | 13.76% (7.00%) | 74.12% (15.06%) | 83.01% (9.35%) |

  1. Sensitivity = TP/(TP + FN), specificity = TN/(TN + FP), precision = TP/(TP + FP), and accuracy = (TP + TN)/(TP + FP + FN + TN); where TP (true positive) is a true included citation identified as an include or no decision, FN (false negative) is a true included citation identified as an exclude with a reason for exclusion, TN (true negative) is a true excluded citation identified as an exclude with a reason for exclusion, and FP (false positive) is a true excluded citation identified as an include or identified as having no decision
  2. ᵃ Correct reason for exclusion was defined as the number of citations whose true reason for exclusion fell above the 90% threshold over the total number of citations with any reason for exclusion. Sensitivity, specificity, precision, and accuracy were calculated by holding each factor constant while averaging over all other model characteristics (e.g., downsampling and performance metric)
  3. Abbreviations: SD, standard deviation
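
The metric definitions in the footnotes, and the idea of holding one factor constant while averaging over everything else, can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names `screening_metrics` and `summarize_by_level`, the dictionary layout of a model run, and all counts and values are hypothetical. The use of the sample standard deviation is also an assumption, since the source does not say which SD formula was used.

```python
from statistics import mean, stdev


def screening_metrics(tp, fp, fn, tn):
    """Compute the four metrics exactly as defined in the table footnote.

    Assumes every denominator is non-zero (at least one true include,
    one true exclude, and one predicted include).
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


def summarize_by_level(runs, factor, metric):
    """Mean and SD of `metric` for each level of `factor`.

    All other model characteristics are pooled, i.e. averaged over.
    Uses the sample SD (an assumption); each level needs >= 2 runs.
    """
    grouped = {}
    for run in runs:
        grouped.setdefault(run[factor], []).append(run[metric])
    return {level: (mean(vals), stdev(vals)) for level, vals in grouped.items()}


# Hypothetical confusion-matrix counts for a single model run.
metrics = screening_metrics(tp=9, fp=30, fn=1, tn=60)

# Hypothetical runs that differ in algorithm and downsampling.
runs = [
    {"algorithm": "NB", "downsampling": True, "sensitivity": 0.80},
    {"algorithm": "NB", "downsampling": False, "sensitivity": 0.70},
    {"algorithm": "SVM", "downsampling": True, "sensitivity": 0.90},
    {"algorithm": "SVM", "downsampling": False, "sensitivity": 0.80},
]
by_algorithm = summarize_by_level(runs, "algorithm", "sensitivity")
```

With these made-up counts, sensitivity is high (9 of 10 true includes kept) while precision is low (9 of 39 predicted includes are true includes), the same pattern the table shows when includes are rare.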