Aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews

Popoff, E.; Besada, M.; Jansen, J. P.; Cope, S.; Kanters, S.

doi:10.1186/s13643-020-01520-5

Systematic Reviews

Table 4 Model results for each factor while varying all other model characteristics

	Sensitivity mean (SD)	Specificity mean (SD)	Precision mean (SD)	Accuracy mean (SD)	Correct reason for exclusion^a mean (SD)
Citation decisions
Abstract decisions	89.76% (10.74%)	70.50% (15.48%)	11.20% (4.53%)	71.43% (14.65%)	84.77% (6.12%)
Full-text decisions	76.49% (15.74%)	83.07% (13.02%)	20.11% (9.49%)	83.04% (12.27%)	85.63% (8.97%)
Modified full-text	81.37% (13.18%)	84.86% (12.07%)	24.23% (11.05%)	84.90% (11.42%)	87.52% (9.45%)
Classification algorithms
CART	83.63% (16.56%)	70.35% (15.20%)	14.92% (7.91%)	71.00% (14.04%)	78.64% (7.72%)
NB	79.17% (11.41%)	88.45% (6.85%)	22.59% (10.14%)	88.30% (6.42%)	91.38% (4.41%)
SVM	87.20% (15.26%)	74.05% (16.39%)	15.43% (10.62%)	74.78% (15.67%)	84.76% (7.54%)
Feature generation
Frequency = 5	85.67% (15.31%)	81.94% (13.09%)	20.07% (11.47%)	82.21% (12.23%)	87.40% (7.25%)
Frequency = 10	85.54% (14.29%)	82.13% (13.27%)	20.70% (12.26%)	82.39% (12.46%)	87.23% (7.29%)
Frequency = 100	83.52% (11.73%)	80.49% (13.38%)	18.19% (10.08%)	80.83% (12.59%)	86.82% (8.25%)
Frequency = 500	78.81% (19.22%)	67.38% (17.50%)	10.93% (6.62%)	68.20% (16.68%)	80.31% (6.91%)
Importance = 50	82.19% (9.54%)	85.44% (8.91%)	19.08% (7.17%)	85.51% (8.29%)	89.49% (6.29%)
Importance = 100	83.59% (9.19%)	85.10% (10.75%)	20.49% (8.27%)	85.22% (10.06%)	89.46% (7.47%)
Importance = 500	82.60% (10.75%)	84.73% (13.78%)	23.82% (11.18%)	84.84% (12.94%)	89.95% (7.45%)
Performance metrics
ROC	83.90% (13.67%)	77.09% (15.32%)	17.09% (8.81%)	77.52% (14.44%)	84.02% (8.73%)
Sensitivity	80.07% (15.43%)	83.81% (13.34%)	21.10% (12.27%)	83.90% (12.59%)	89.53% (6.23%)
Downsampling
Without downsampling	75.46% (15.87%)	85.55% (11.24%)	23.27% (10.93%)	85.45% (10.43%)	88.93% (5.90%)
With downsampling	89.62% (7.97%)	73.40% (15.79%)	13.76% (7.00%)	74.12% (15.06%)	83.01% (9.35%)

Sensitivity = TP/(TP + FN), specificity = TN/(TN + FP), precision = TP/(TP + FP), and accuracy = (TP + TN)/(TP + FP + FN + TN), where TP (true positive) is a true included citation identified as an include or as having no decision; FN (false negative) is a true included citation identified as an exclude with a reason for exclusion; TN (true negative) is a true excluded citation identified as an exclude with a reason for exclusion; and FP (false positive) is a true excluded citation identified as an include or as having no decision.
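The definitions in this note can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the names `Counts`, `classify`, `ratio`, and `metrics`, and the decision labels "include", "no decision", and "exclude", are assumptions introduced here. The counts in the usage lines are made up and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion-matrix counts (hypothetical container, not from the paper)."""
    tp: int = 0
    fn: int = 0
    tn: int = 0
    fp: int = 0


def classify(truly_included: bool, predicted: str) -> str:
    """Map one citation to 'tp', 'fn', 'tn', or 'fp' following the table note.

    `predicted` is assumed to be 'include', 'no decision', or 'exclude';
    only 'exclude' (with a reason for exclusion) counts as an exclusion.
    """
    excluded = predicted == "exclude"
    if truly_included:
        return "fn" if excluded else "tp"
    return "tn" if excluded else "fp"


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def metrics(c: Counts) -> dict:
    """Compute the four metrics exactly as defined in the table note."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "precision": ratio(c.tp, c.tp + c.fp),
        "accuracy": ratio(c.tp + c.tn, total),
    }


# Made-up counts for illustration only.
example = metrics(Counts(tp=8, fn=2, tn=70, fp=20))
```

Because "no decision" is counted with the includes, an undecided citation can never become a false negative. It can only add to TP or FP, which favours sensitivity over precision. When true includes are rare, even a modest false-positive rate can outnumber the true positives, which is in line with the low precision values reported in the table.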
^aCorrect reason for exclusion was defined as the number of citations whose true reason for exclusion fell above the 90% threshold over the total number of citations with any reason for exclusion. Sensitivity, specificity, precision, and accuracy were calculated by holding each factor constant while averaging over all other model characteristics (e.g., downsampling and performance metric)
Abbreviations: SD, standard deviation
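The note on holding each factor constant while averaging over all other model characteristics describes a marginal summary: runs are grouped by the level of one factor, and the mean and SD of a metric are taken within each group. The sketch below shows one way this could be computed. The function name `summarize_by_factor`, the record layout, and the numbers are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not say which SD estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_factor(runs, factor, metric):
    """Group model runs by one factor and return {level: (mean, SD)}.

    All other model characteristics are averaged over implicitly, because
    every run sharing the same level of `factor` lands in the same group.
    """
    groups = defaultdict(list)
    for run in runs:
        groups[run[factor]].append(run[metric])
    return {
        level: (
            mean(values),
            # Sample SD needs at least two values; report NaN otherwise.
            stdev(values) if len(values) > 1 else float("nan"),
        )
        for level, values in groups.items()
    }


# Made-up runs for illustration only; these are not results from the study.
runs = [
    {"algorithm": "NB", "downsampling": True, "sensitivity": 0.90},
    {"algorithm": "NB", "downsampling": False, "sensitivity": 0.80},
    {"algorithm": "SVM", "downsampling": True, "sensitivity": 0.95},
    {"algorithm": "SVM", "downsampling": False, "sensitivity": 0.75},
]

by_algorithm = summarize_by_factor(runs, "algorithm", "sensitivity")
by_downsampling = summarize_by_factor(runs, "downsampling", "sensitivity")
```

Each run contributes to one row per factor, so the rows of the table are not independent of one another: the same set of models is summarized once per factor.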

ISSN: 2046-4053
