Academic and commercial sequencing facilities alike use a common quality rating to describe the accuracy of sequencing data. It is generally called the Phred score, after the Phred base-calling program developed by Phil Green and Brent Ewing during the 1990s. Scores are often listed as Q values, where a score of Q20 is considered an acceptably accurate base call. But what exactly does a Q20 value mean?
Q is derived from the formula Q = -10 log10(P), where P is the estimated probability that the base call is incorrect. To estimate P, the algorithm uses the peak quality of the individual base as well as that of the bases before and after it.
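As a minimal sketch of the formula, assuming the logarithm is base 10 and that the probability is the chance of an incorrect call, the conversion can be written in both directions. The function names here are illustrative only and are not part of Phred itself.

```python
import math


def error_prob_to_q(p_error: float) -> float:
    """Convert an error probability to a Q value: Q = -10 * log10(P)."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("p_error must be greater than 0 and at most 1")
    return -10.0 * math.log10(p_error)


def q_to_error_prob(q: float) -> float:
    """Invert the formula: P = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


if __name__ == "__main__":
    # A 1-in-100 chance of a wrong call corresponds to Q20.
    print(error_prob_to_q(0.01))
    # Q30 corresponds to a 1-in-1,000 chance of a wrong call.
    print(q_to_error_prob(30))
```

Because the scale is logarithmic, every 10-point increase in Q corresponds to a tenfold drop in the error probability.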
A Q value expresses, on a logarithmic scale, how likely it is that a base has been called correctly. Values typically range from 10 to 60, as shown…
Q10 = 90% certainty (1/10 chance of an incorrect base call)
Q20 = 99% certainty (1/100 chance of an incorrect base call)
Q30 = 99.9% certainty (1/1,000 chance of an incorrect base call)
Q40 = 99.99% certainty (1/10,000 chance of an incorrect base call)
Q50 = 99.999% certainty (1/100,000 chance of an incorrect base call)
Q60 = 99.9999% certainty (1/1,000,000 chance of an incorrect base call)
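The values listed above follow directly from the error probability P = 10^(-Q/10). The short sketch below recomputes them; the helper name `phred_table` is hypothetical and chosen only for illustration.

```python
def phred_table(q_values=(10, 20, 30, 40, 50, 60)):
    """Return (Q, error probability, certainty) for each Q value."""
    rows = []
    for q in q_values:
        p_error = 10 ** (-q / 10)  # chance of an incorrect base call
        rows.append((q, p_error, 1 - p_error))
    return rows


if __name__ == "__main__":
    for q, p_error, certainty in phred_table():
        one_in = round(1 / p_error)  # express the error rate as "1 in N"
        print(f"Q{q} = {certainty:.4%} certainty (1/{one_in:,} chance of an incorrect base call)")
```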
Q20 is the acceptable score for most sequencing data. It indicates a 99% certainty that the base has been called correctly. This is considered high-quality data, and Q20 is the standard threshold commonly used by sequencing facilities.
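To put the 99% figure in practical terms, one can estimate how many miscalled bases to expect in a read by summing the per-base error probabilities. This is a rough sketch: the read length of 500 bases and the assumption that every base scores exactly Q20 are made up for illustration.

```python
def expected_errors(q_scores):
    """Sum the per-base error probabilities, P = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in q_scores)


if __name__ == "__main__":
    read_length = 500  # hypothetical read length
    uniform_q20 = [20] * read_length  # assume every base is called at Q20
    # At Q20, each base has a 1/100 chance of being wrong,
    # so a 500-base read is expected to contain about 5 miscalls.
    print(expected_errors(uniform_q20))
```

Under the same assumptions, a read at Q30 throughout would be expected to contain about one tenth as many errors.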