Evaluation of a prototype machine learning tool to semi-automate data extraction for systematic literature reviews

Table 4 Performance of different methods using BERT for relation extraction in the SLR 2 dataset

Model	Three-sentence context window			Five-sentence context window		
	Precision, %	Recall, %	F₁ score, %	Precision, %	Recall, %	F₁ score, %
Role labelling	61	54	57	60	52	56
Relation classification	99	89	93	99	87	92
Pretrained role labelling	64	59	62	62	55	59
Pretrained relation classification	98	91	95	98	90	94

In the original table, the best F₁ score (95%, pretrained relation classification with a three-sentence context window) is shown in bold. The 95% confidence intervals for the F₁ scores lie within ± 0.5 percentage points of the estimates given.
BERT Bidirectional encoder representations from transformers, SLR Systematic literature review
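
As a reading aid, the relationship between the three reported metrics can be sketched as follows. The sketch assumes the F₁ score is the standard harmonic mean of precision and recall; the table does not state how the scores were computed or averaged. The function name `f1_score` and the `rows` dictionary are illustrative and are not taken from the paper. Because the table reports rounded percentages, values recomputed this way can differ from the reported F₁ scores by up to about one percentage point.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from the
# three-sentence context window columns of Table 4.
rows = {
    "Role labelling": (61, 54, 57),
    "Relation classification": (99, 89, 93),
    "Pretrained role labelling": (64, 59, 62),
    "Pretrained relation classification": (98, 91, 95),
}

if __name__ == "__main__":
    for name, (p, r, reported) in rows.items():
        # Assumption: recomputing from rounded inputs, so small
        # differences from the reported value are expected.
        print(f"{name}: recomputed {f1_score(p, r):.1f}, reported {reported}")
```

The same function applies unchanged to the five-sentence context window columns.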

ISSN: 2046-4053