Estimating Sample Sizes for a Binomial Distribution.

Imagine you have a critical component that you know fails, on average, in 1 in N "uses" (for some suitable definition of "use"). You may want to schedule routine replacement of the component so that its chance of failure between routine replacements is less than P%. If the failures follow a binomial distribution (each time the component is "used" it either fails or does not), then the static member function binomial_distribution<>::find_maximum_number_of_trials can be used to estimate the maximum number of "uses" of that component for some acceptable risk level alpha (here alpha = P/100).
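Written as a formula, the quantity being estimated is, in effect, the largest number of trials for which the chance of seeing no more than the permitted number of failures stays at or above 1 - alpha. This is a restatement rather than a description of how the library computes it; the symbols X (number of failures in n uses), k (failures permitted) and p (failure probability per use, 1/N) are introduced here for illustration:

```latex
\[
n_{\max} \;=\; \max\bigl\{\, n : \Pr(X \le k) \ge 1 - \alpha \,\bigr\},
\qquad
\Pr(X \le k) \;=\; \sum_{i=0}^{k} \binom{n}{i}\, p^{i} (1-p)^{n-i}.
\]
```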

The example program binomial_sample_sizes.cpp demonstrates the use of find_maximum_number_of_trials. It centres on a routine that prints out a table of maximum sample sizes for various probability thresholds:

void find_max_sample_size(
   double p,              // success ratio.
   unsigned successes)    // Total number of observed successes permitted.
{

The routine then declares a table of probability thresholds: each one is the maximum acceptable risk that more than successes events will be observed, so 1 - alpha is the confidence that successes or fewer events occur. In our example, successes will always be zero, since we want no component failures, but in other situations non-zero values may well make sense.

double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
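To make the meaning of a non-zero successes value concrete, here is a minimal brute-force sketch that does not use the library. It assumes the definition above (the largest number of trials for which the chance of successes or fewer events is at least 1 - alpha). The helper names binomial_cdf and max_trials_brute_force are hypothetical, and the linear scan is for illustration only: it is far too slow to be a practical substitute for find_maximum_number_of_trials.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of Boost.Math):
// P(X <= k) for X ~ Binomial(n, p), by direct summation of the mass function.
double binomial_cdf(unsigned k, unsigned n, double p)
{
   assert(p > 0.0 && p < 1.0);
   double term = std::pow(1.0 - p, static_cast<double>(n)); // P(X = 0)
   double sum = term;
   for (unsigned i = 0; i < k && i < n; ++i)
   {
      // P(X = i+1) = P(X = i) * (n - i) / (i + 1) * p / (1 - p)
      term *= static_cast<double>(n - i) / (i + 1) * p / (1.0 - p);
      sum += term;
   }
   return sum;
}

// Hypothetical helper: largest n with P(X <= k) >= 1 - alpha, by linear scan.
unsigned max_trials_brute_force(unsigned k, double p, double alpha)
{
   assert(alpha > 0.0 && alpha < 1.0);
   unsigned n = k; // with n <= k trials, more than k events is impossible
   while (binomial_cdf(k, n + 1, p) >= 1.0 - alpha)
      ++n;
   return n;
}
```

Calling max_trials_brute_force(1, 0.001, 0.05) allows one event instead of none, and so permits noticeably more trials than the zero-event case at the same risk level.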

Much of the rest of the program is pretty-printing; the important part is the calculation of the maximum number of permitted trials for each value of alpha:

for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
{
   // Confidence value:
   cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
   // calculate trials:
   double t = binomial::find_maximum_number_of_trials(
                  successes, p, alpha[i]);
   t = floor(t);
   // Print Trials:
   cout << fixed << setprecision(5) << setw(15) << right << t << endl;
}

Note that since we're calculating the maximum number of trials permitted, we'll err on the safe side and take the floor of the result. Had we been calculating the minimum number of trials required to observe a certain number of successes using find_minimum_number_of_trials, we would have taken the ceiling instead.
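For the special case used here, where no failures at all are permitted, the calculation has a simple closed form that makes the role of the floor easy to see: the chance of no failure in n uses is (1 - p)^n, and requiring this to be at least 1 - alpha gives n <= log(1 - alpha) / log(1 - p). The sketch below assumes exactly that special case. The name max_trials_zero_failures is hypothetical, and this is not how the library arrives at its result.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of Boost.Math): the largest whole n for which
// (1 - p)^n >= 1 - alpha, i.e. the chance of zero failures in n uses
// stays at or above the confidence level 1 - alpha.
double max_trials_zero_failures(double p, double alpha)
{
   assert(p > 0.0 && p < 1.0);
   assert(alpha > 0.0 && alpha < 1.0);
   // log1p(-x) evaluates log(1 - x) accurately when x is small.
   double n = std::log1p(-alpha) / std::log1p(-p);
   // Round down: rounding up would push the risk above alpha.
   return std::floor(n);
}
```

For example, max_trials_zero_failures(0.001, 0.05) gives 51. Inputs where the ratio lands on or very near a whole number, such as p equal to alpha, are sensitive to floating-point rounding, so the floored value there may differ by one between methods.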

We'll finish off by looking at some sample output, firstly for a 1 in 1000 chance of component failure with each use:

________________________
Maximum Number of Trials
________________________

Success ratio                           =  0.001
Maximum Number of "successes" permitted =  0


____________________________
Confidence        Max Number
 Value (%)        Of Trials
____________________________
    50.000            692
    75.000            287
    90.000            105
    95.000             51
    99.000             10
    99.900              0
    99.990              0
    99.999              0

So 51 "uses" of the component would yield a 95% chance that no component failures would be observed.
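This figure is easy to check by hand, because with no failures permitted the chance of a clean run of n uses is just (1 - p)^n. A minimal sketch, with a hypothetical helper name prob_no_failure:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: chance of seeing no failure in n independent uses,
// each of which fails with probability p.
double prob_no_failure(double p, unsigned n)
{
   assert(p >= 0.0 && p <= 1.0);
   return std::pow(1.0 - p, static_cast<double>(n));
}
```

prob_no_failure(0.001, 51) is roughly 0.9503, while prob_no_failure(0.001, 52) is roughly 0.9493, so 51 is the last whole number of uses that keeps the chance of no failure at or above 95%.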

Compare that with a 1 in 1 million chance of component failure:

________________________
Maximum Number of Trials
________________________

Success ratio                           =  0.0000010
Maximum Number of "successes" permitted =  0


____________________________
Confidence        Max Number
 Value (%)        Of Trials
____________________________
    50.000         693146
    75.000         287681
    90.000         105360
    95.000          51293
    99.000          10050
    99.900           1000
    99.990            100
    99.999             10

In this case, even 1000 uses of the component would still yield a less than 1 in 1000 chance of observing a component failure (i.e. a 99.9% chance of no failure).
