Determining Sample Size for Demonstrating Zero Failures: Ensuring the Effectiveness of Corrective and Preventive Actions


Biopharmaceutical manufacturers often implement corrective and preventive actions (CAPAs) to address deviations based on root causes identified during investigations of process failures. CAPA implementation is frequently followed by effectiveness checks: The effectiveness of a CAPA is demonstrated by ensuring that a specific parameter can meet acceptable requirements.

If a parameter monitored to verify CAPA effectiveness does not follow a specific data distribution, or if its results are reported only as pass or fail, then selecting a statistically sound sample size for the effectiveness check is not straightforward. Herein, we detail an appropriate statistical approach to calculating sample size: “Reliability demonstration tests” (RDTs) can be used to verify whether a product or process has met a stated reliability requirement at a stated statistical confidence level when the underlying data distribution is unknown.

A number of RDT approaches are available to process engineers, including cumulative binomial, nonparametric binomial, exponential chi-squared, and nonparametric Bayesian methods. Here we focus on two relatively simple options that are easy to adopt: the nonparametric binomial and exponential chi-squared methods. Other approaches, such as Bayesian methods, require prior information based on assumptions; because there is no single “correct” way of deriving a prior, different choices can give different results. Finally, we recommend a risk-assessment methodology for selecting the reliability and statistical confidence levels used in the two highlighted RDT methods.


Equations 1–5

Equation 1: P(X ≤ c) = Σ (from x = 0 to c) (n choose x) p^x (1 − p)^(n − x)

Equation 2: P(X = 0) = (1 − p)^n

Equation 3: P(X = 0) = R^n, where reliability R = 1 − p

Equation 4: R^n = α = 1 − C

Equation 5: n = ln(1 − C) / ln(R)

Nonparametric Binomial Method

This method is also known as the classical reliability demonstration test. When x is the number of observed failures out of n trials, and each trial is an independent Bernoulli trial with failure probability p, then the probability of observing at most c failures can be calculated using a binomial distribution (Equation 1). For zero failures (c = 0), that formula reduces to Equation 2, which can be rewritten in terms of reliability R = 1 − p as Equation 3. The sample size can now be determined by equating Equation 3 to the significance value (α) for a type 1 error — in other words, the probability of falsely rejecting a true null hypothesis (Equation 4). Taking logarithms on both sides of that equation and solving for sample size n, you obtain Equation 5, in which the term C is confidence. Thus, to have C% confidence that the CAPA is R% reliable, a minimum of n samples must be tested with zero failures as the acceptance criterion.
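As a quick check of Equation 5, the calculation can be scripted in a few lines; this is a sketch (the function name is ours), rounding up to the next whole sample:

```python
import math

def binomial_zero_failure_n(confidence: float, reliability: float) -> int:
    """Minimum sample size n, with zero failures allowed, to claim
    `reliability` at `confidence` (nonparametric binomial, Equation 5:
    n = ln(1 - C) / ln(R)), rounded up to a whole sample."""
    if not (0 < confidence < 1 and 0 < reliability < 1):
        raise ValueError("confidence and reliability must be in (0, 1)")
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

# Example: 95% confidence that the CAPA is at least 90% reliable
print(binomial_zero_failure_n(0.95, 0.90))  # 29 samples, all must pass
```

Note that the result is rounded up, because testing fewer samples than the exact solution of Equation 5 would fall short of the stated confidence.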


Exponential Chi-Squared Method

In the exponential chi-squared method, you can also calculate the sample size for zero failures as a function of reliability and confidence, as shown in Equation 6:

Equation 6: n = χ²(1 − C, 2) / [2(1 − R)]

where χ²(1 − C, 2) represents the inverse of the right-tailed probability of a chi-squared distribution at 1 − C probability and two degrees of freedom. In a Microsoft Excel spreadsheet, that value can be calculated using this function:

=CHISQ.INV.RT(probability, degrees of freedom).
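The chi-squared value here needs no statistics library: at two degrees of freedom, the inverse right-tailed chi-squared probability has the closed form χ²(q, 2) = −2 ln(q). A sketch in Python, assuming the zero-failure form n = χ²(1 − C, 2) / [2(1 − R)]:

```python
import math

def chi_squared_zero_failure_n(confidence: float, reliability: float) -> int:
    """Zero-failure sample size via the exponential chi-squared method:
    n = chi2(1 - C, 2) / (2 * (1 - R)).  At two degrees of freedom,
    the inverse right-tailed chi-squared value chi2(q, 2) equals
    -2 * ln(q), which matches Excel's CHISQ.INV.RT(q, 2)."""
    chi2 = -2.0 * math.log(1 - confidence)  # = CHISQ.INV.RT(1 - C, 2)
    return math.ceil(chi2 / (2 * (1 - reliability)))

# Example: 95% confidence, 90% reliability
print(chi_squared_zero_failure_n(0.95, 0.90))  # 30 samples
```

For 95% confidence and 90% reliability this gives 30 samples versus 29 from the nonparametric binomial method, illustrating how closely the two approaches track each other.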

Risk Assessment: The American Society for Testing and Materials (ASTM) outlines three criteria for use in creating a risk-assessment matrix for the attributes or parameters to be monitored: the severity of the consequences if a problem or failure occurs (S), the likelihood of occurrence (O), and the likelihood that a problem or failure will be detected (D) (1). Each criterion can be scored on a scale from 1 to 5, as shown in Table 1 (2).

The product of those three scores — (S) × (O) × (D) — gives a risk priority number (RPN). Based on the RPN, the risk associated with a given attribute or parameter is classified as high, medium, or low (Table 2). For each risk classification, Table 2 also lists recommendations for confidence (C) and reliability (R). The latter is based on rules of thumb for Cronbach’s alpha coefficient, which measures the reliability or internal consistency of a measurement method (1).
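The RPN arithmetic is simple, but scripting it keeps the classification auditable. In this sketch, the cutoff values for high, medium, and low risk are illustrative placeholders only; the actual thresholds belong in each firm's version of Table 2:

```python
def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """RPN = S x O x D, with each criterion scored 1-5 (so RPN spans 1-125)."""
    for score in (severity, occurrence, detection):
        if not 1 <= score <= 5:
            raise ValueError("each score must be on the 1-5 scale")
    return severity * occurrence * detection

def classify(rpn: int, high: int = 45, medium: int = 15) -> str:
    """Map an RPN to a risk class.  The default cutoffs (45, 15) are
    hypothetical placeholders, not the values from Table 2."""
    if rpn >= high:
        return "high"
    if rpn >= medium:
        return "medium"
    return "low"

rpn = risk_priority_number(severity=4, occurrence=3, detection=2)
print(rpn, classify(rpn))  # 24 medium
```

Once an attribute is classified, the corresponding confidence and reliability values from Table 2 feed directly into either sample-size equation.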

Confidence and Reliability

We recommend documenting the statistical methods and rationale for selecting sample sizes to ensure that a CAPA is effective. For determining sample sizes, both nonparametric binomial and exponential chi-squared methods are widely accepted in industries regulated by the US Food and Drug Administration (FDA). As Table 2 shows, calculated sample sizes do not differ significantly between the two approaches. Thus, either method can be used.

Note that confidence and reliability values herein are for illustration purposes only. Based on the nature of a given process, batch run rate, and associated costs, we recommend that drug manufacturers use the most suitable confidence and reliability values for adherence to their own quality policies.


References

1 Muralidharan N. Process Validation: Calculating the Necessary Number of Process Performance Qualification Runs. BioProcess Int. 21(5) 2023: 37–43.

2 Salkind NJ, Ed. Encyclopedia of Measurement and Statistics. Sage Publications: London, UK, 2007.

Further Reading

Gorski A. Chi-Square Probabilities Are Poisson Probabilities in Disguise. IEEE Transactions on Reliability R-34(3) 1985: 209–211.

Romeu JL. Determining the Experimental Sample Size. Quality, Reliability and Continuous Improvement Institute: Syracuse, NY.

Corresponding author Naveenganesh Muralidharan is senior manager of manufacturing science and technology (MSAT), and Dan Larson is manager of investigations, both at AGC Biologics, 5550 Airport Boulevard, Boulder, CO 80301; [email protected].
