Fig. 1 | Systematic Reviews

Fig. 1

From: Screening PubMed abstracts: is class imbalance always a challenge to machine learning?

Building process of the training dataset. Positive citations are papers included in a systematic review; negative citations are papers randomly selected from those that are completely off-topic. To identify positive citations, we recreate the input string in the PubMed database, using the keywords and filters proposed in the original systematic review. Among the retrieved records (region delimited by the dashed green line), we retain only the papers finally included in the original systematic review (region delimited by the solid green line). Conversely, we randomly select the negative citations (region delimited by the solid blue line) from records of the Clinical Trial article type, according to the PubMed filter, that are completely off-topic, i.e., retrieved by adding the Boolean operator NOT to the input string (region between the green and blue dashed lines).
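
The selection logic in the caption can be illustrated with a minimal sketch. It is not the authors' code: the function names, the integer record identifiers, the `seed` parameter, and the exact filter text `Clinical Trial[Publication Type]` are assumptions made for illustration, and record retrieval from PubMed is replaced by plain Python collections.

```python
import random


def negative_query(original_query, article_filter="Clinical Trial[Publication Type]"):
    """Build an off-topic query by negating the original input string.

    The filter text is an assumed rendering of the PubMed article-type
    filter; the caption only says the Clinical Trial filter was used.
    """
    return f"({article_filter}) NOT ({original_query})"


def build_training_set(retrieved_ids, included_ids, off_topic_ids, n_negatives, seed=0):
    """Return (positives, negatives) following the caption's description.

    retrieved_ids: records returned by the recreated input string
    included_ids:  records finally included in the original review
    off_topic_ids: records returned by the negated (NOT) query
    """
    retrieved = set(retrieved_ids)

    # Positives: retrieved records that the original review finally included.
    positives = sorted(retrieved & set(included_ids))

    # Negatives: a random sample of off-topic records. Anything also returned
    # by the original query is dropped as a safeguard.
    pool = sorted(set(off_topic_ids) - retrieved)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    negatives = rng.sample(pool, min(n_negatives, len(pool)))

    return positives, negatives


if __name__ == "__main__":
    pos, neg = build_training_set(
        retrieved_ids=[1, 2, 3, 4],
        included_ids=[2, 4, 9],
        off_topic_ids=[3, 10, 11, 12],
        n_negatives=2,
    )
    print(pos, neg)
    print(negative_query("keyword A AND keyword B"))
```

The number of negatives is a free parameter here, so the ratio of positive to negative citations in the resulting training set can be chosen by the caller.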
