Table 2. The estimated number of unknown duplicates, based on a random sample from each title similarity score range. Confidence intervals for the percentage of hidden duplicates are based on the exact binomial confidence interval for the proportion of duplicates in the sample.

From: Previously unidentified duplicate registrations of clinical trials: an exploratory analysis of registry data worldwide

Score range    Duplicates in sample    Known duplicates    Unknown duplicates (est.)    % hidden
0.7<x≤0.8      7 / 125 (5.6 %)         2194                1957                         47 (26–64)
0.8<x≤0.9      13 / 100 (13 %)         3489                2265                         39 (26–51)
0.9<x≤1.0      89 / 209 (43 %)         5805                5393                         48 (44–52)
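
The arithmetic behind the "% hidden" column can be sketched as follows. This is an illustrative reconstruction, not the authors' code. It assumes a 95 % confidence level, which the caption does not state. It also assumes that the estimated number of unknown duplicates scales linearly with the sample proportion, so the interval limits of the proportion can be carried over by rescaling the point estimate. All function names are hypothetical. The percentage hidden is taken as unknown / (known + unknown), which matches the point estimates in the table (for example 1957 / (2194 + 1957) ≈ 47 %).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05, iters=60):
    """Exact (Clopper-Pearson) interval for a proportion, found by bisection."""

    def bisect(too_low):
        lo, hi = 0.0, 1.0
        for _ in range(iters):
            mid = (lo + hi) / 2
            if too_low(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: the p at which P(X >= k | p) reaches alpha / 2.
    lower = 0.0 if k == 0 else bisect(lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper limit: the p at which P(X <= k | p) drops to alpha / 2.
    upper = 1.0 if k == n else bisect(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper


def hidden_percent(known, unknown):
    """Share of all duplicates (known + unknown) that are unknown, in percent."""
    return 100 * unknown / (known + unknown)


def hidden_interval(k, n, known, unknown_est, alpha=0.05):
    """Point estimate and interval for % hidden (requires k > 0).

    Assumption: the unknown-duplicate estimate is proportional to the sample
    proportion k / n, so each limit is unknown_est * (limit / (k / n)).
    """
    p_hat = k / n
    p_lo, p_hi = clopper_pearson(k, n, alpha)
    return (
        hidden_percent(known, unknown_est),
        hidden_percent(known, unknown_est * p_lo / p_hat),
        hidden_percent(known, unknown_est * p_hi / p_hat),
    )


# Rows of the table: (score range, duplicates in sample, sample size, known, unknown est.)
rows = [
    ("0.7<x<=0.8", 7, 125, 2194, 1957),
    ("0.8<x<=0.9", 13, 100, 3489, 2265),
    ("0.9<x<=1.0", 89, 209, 5805, 5393),
]
for label, k, n, known, unknown in rows:
    point, low, high = hidden_interval(k, n, known, unknown)
    print(f"{label}: {point:.0f} ({low:.0f}-{high:.0f})")
```

Under these assumptions the results should land close to the tabulated intervals. Small differences in the last digit are possible because the table reports rounded inputs.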