Example of Shelf Life Calculation with No Variation


Based on the requirement that the three batches exhibit similarity (no significant difference), the stability data can be combined (pooled) to determine a single, unified shelf life.

The FDA guideline specifies that the expiration dating period (shelf life, $\xi$) is determined as the time point at which the $95\%$ one-sided lower confidence limit for the mean degradation curve intersects the acceptable lower specification limit ($\eta$).

Here is a simulated example demonstrating this process for three similar batches ($K=3$).

1. Simulated Stability Study Data and Parameters

  • Objective: Determine the shelf life ($\xi$) for a drug product using three validation batches.
  • Acceptable Lower Specification Limit ($\eta$): $90\%$ of label claim.
  • Model: Linear degradation ($Y = \alpha + \beta X + \epsilon$).
  • Time Points ($X_j$): 0, 3, 6, 9, and 12 months ($n=5$ time points).
  • Total Observations ($N$): $K \times n = 3 \times 5 = 15$.

The observed potency (% label claim) data are simulated to be consistent with a common degradation rate of approximately $-0.5\%$ per month, indicating high similarity across batches:

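| Batch ($i$) | Time $X_j$ (months) | Potency $Y_{i,j}$ (%) |
|---|---|---|
| 1 | 0 | 100.2 |
| 1 | 3 | 98.6 |
| 1 | 6 | 97.1 |
| 1 | 9 | 95.3 |
| 1 | 12 | 94.1 |
| 2 | 0 | 99.9 |
| 2 | 3 | 98.3 |
| 2 | 6 | 96.9 |
| 2 | 9 | 95.6 |
| 2 | 12 | 93.8 |
| 3 | 0 | 100.0 |
| 3 | 3 | 98.5 |
| 3 | 6 | 97.0 |
| 3 | 9 | 95.4 |
| 3 | 12 | 94.2 |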

2. Preliminary Test for Batch Similarity

A preliminary statistical test for batch similarity (equality of slopes and intercepts) is conducted at a significance level of $0.25$.

Assumption: The statistical test demonstrates that the three batches are statistically similar (the null hypothesis of no difference in slopes and intercepts is not rejected). This justifies pooling the $N=15$ data points into one overall analysis.
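One common way to set up such a preliminary test is an F-test comparing a single common line against a model with a separate intercept and slope for each batch. The sketch below assumes that formulation (one joint test via `anova`; a guideline procedure may instead test slopes and intercepts in sequence) and uses the data from Section 1. The variable names are illustrative.

```r
# Stability data from Section 1 (potency, % label claim)
time    <- rep(c(0, 3, 6, 9, 12), times = 3)
batch   <- factor(rep(1:3, each = 5))
potency <- c(100.2, 98.6, 97.1, 95.3, 94.1,
              99.9, 98.3, 96.9, 95.6, 93.8,
             100.0, 98.5, 97.0, 95.4, 94.2)

# Reduced model: one common intercept and slope for all batches
reduced <- lm(potency ~ time)

# Full model: separate intercept and slope for each batch
full <- lm(potency ~ time * batch)

# F-test of the reduced model against the full model
cmp     <- anova(reduced, full)
p_value <- cmp[["Pr(>F)"]][2]

# Pooling is considered acceptable when the test is NOT significant at 0.25
poolable <- p_value > 0.25
print(cmp)
print(poolable)
```

The 0.25 level is deliberately larger than the usual 0.05, which makes it harder to pool batches that actually differ.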

3. Statistical Calculation (Pooled Data)

The Ordinary Least Squares (OLS) method is applied to the combined data set to estimate the common intercept ($\hat{\alpha}$) and common slope ($\hat{\beta}$).
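As a rough sketch of this step in R, the pooled fit and the summary quantities needed for the shelf-life formula can be obtained with `lm`. It assumes the data from Section 1 are entered as below; the variable names are illustrative. Running it is a simple way to check the entries of the summary table against the raw data.

```r
# Stability data from Section 1 (potency, % label claim)
time    <- rep(c(0, 3, 6, 9, 12), times = 3)
potency <- c(100.2, 98.6, 97.1, 95.3, 94.1,
              99.9, 98.3, 96.9, 95.6, 93.8,
             100.0, 98.5, 97.0, 95.4, 94.2)

# Pooled fit: one common intercept and slope for all 15 observations
fit <- lm(potency ~ time)

alpha_hat <- unname(coef(fit)["(Intercept)"])    # common intercept
beta_hat  <- unname(coef(fit)["time"])           # common slope (% per month)
mse       <- sum(resid(fit)^2) / df.residual(fit)  # residual mean square
sxx       <- sum((time - mean(time))^2)          # pooled sum of squares of X
t_crit    <- qt(0.95, df = df.residual(fit))     # one-sided 95% t quantile

print(c(alpha = alpha_hat, beta = beta_hat, mse = mse,
        sxx = sxx, t = t_crit))
```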

| Parameter | Result (pooled data) |
|---|---|
| Mean time ($\overline{X}$) | 6.0 months |
| Pooled sum of squares of $X$ ($K\sum_{j=1}^{n}(x_{j}-\overline{x})^{2}$) | $3 \times 90 = 270$ |
| Estimated intercept ($\hat{\alpha}$) | $100.00$ (% label claim) |
| Estimated slope ($\hat{\beta}$) | $-0.501$ (% per month) |
| Mean squared error (MSE) | $0.0207$ |
| Degrees of freedom ($N-2$) | 13 |
| $t$-value ($t(0.95, 13)$) | $\approx 1.771$ |

The pooled mean degradation curve is: $\hat{Y}(X) = 100.00 - 0.501 X$

4. Determination of Tentative Shelf Life ($\xi$)

The tentative shelf life ($\xi$) is the time at which the one-sided lower $95\%$ confidence bound intersects the lower specification limit ($\eta=90$):

$$ \eta = \hat{\alpha} + \hat{\beta}\xi - t(0.95, N-2)\,S(\xi) $$

where $S(\xi)$ is the standard error of the estimated mean degradation curve at time $\xi$:

$$S^{2}(\xi) = \text{MSE} \left\{ \frac{1}{N} + \frac{(\xi-\overline{X})^{2}}{K\sum_{j=1}^{n}(x_{j}-\overline{x})^{2}} \right\}$$

Substituting the pooled values:

$$ 90 = 100.00 - 0.501\xi - 1.771 \sqrt{0.0207 \left( \frac{1}{15} + \frac{(\xi-6)^{2}}{270} \right)} $$

Solving this equation for $\xi$ yields the estimated shelf life:

$$ \hat{\xi} \approx 19.5 \text{ months} $$

5. Conclusion

The estimated tentative shelf life is $\mathbf{19.5}$ months.

Since the batches were determined to be similar, pooling the data was justified. Pooling gives a narrower confidence limit because of the larger degrees of freedom ($N-2=13$) and the improved precision. The resulting shelf life of $19.5$ months is the time point where the lower $95\%$ confidence bound for the mean degradation profile of the combined batches meets the $90\%$ specification limit.
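The crossing point from Section 4 can also be found numerically instead of by hand. The sketch below is one possible way to do it, assuming the Section 1 data and a search interval of 12 to 36 months (the interval is my choice for illustration, not part of the guideline). `predict(..., se.fit = TRUE)` returns the standard error of the fitted mean, which corresponds to $S(\xi)$.

```r
# Stability data from Section 1 (potency, % label claim)
time    <- rep(c(0, 3, 6, 9, 12), times = 3)
potency <- c(100.2, 98.6, 97.1, 95.3, 94.1,
              99.9, 98.3, 96.9, 95.6, 93.8,
             100.0, 98.5, 97.0, 95.4, 94.2)

fit <- lm(potency ~ time)  # pooled regression line
eta <- 90                  # lower specification limit (% label claim)

# One-sided 95% lower confidence limit for the mean potency at time x
lower_limit <- function(x) {
  pred <- predict(fit, newdata = data.frame(time = x), se.fit = TRUE)
  unname(pred$fit - qt(0.95, df = fit$df.residual) * pred$se.fit)
}

# Shelf life: the time at which the lower limit crosses eta.
# The interval c(12, 36) is an assumed bracket around the crossing.
shelf_life <- uniroot(function(x) lower_limit(x) - eta,
                      interval = c(12, 36))$root
print(shelf_life)
```

Because the estimate extrapolates beyond the 12 months of observed data, it is a tentative value that depends on the linear model continuing to hold.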


Ps: I am using NotebookLM to create this blog.

How to calculate drug shelf life

Shelf life, or the expiration dating period, is defined as the interval during which a drug product is expected to remain within the approved specifications after manufacture. Calculating the shelf life is the primary objective of a stability study.

The general method for determining the shelf life, as recommended by the FDA and ICH guidelines, involves statistical analysis of stability data:

Primary Calculation Method (Long-Term Stability)

The shelf life is determined as the time point at which the 95% one-sided lower confidence limit for the mean degradation curve intersects the acceptable lower specification limit ($\tau_{\eta}$).

  1. Modeling Degradation: The stability data, typically using percent of label claim as the primary variable, are fitted to a mathematical relationship.

    • The degradation relationship can usually be represented by a linear, quadratic, or cubic function on an arithmetic or logarithmic scale.
    • For characteristics expected to decrease (e.g., strength), the 95% one-sided lower confidence limit is used.
    • For characteristics expected to increase (e.g., degradation products), the 95% one-sided upper confidence limit is used.
  2. Statistical Calculation (Linear Model): Assuming the strength decreases linearly over time (a zero-order reaction), the expected degradation is modeled by linear regression, $E(Y_{j}) = \alpha + \beta x_{j}$, where $x_{j}$ is the $j$-th sampling time.

    • The shelf life ($x_{L}$) is calculated by solving the quadratic equation that results from setting the 95% lower confidence limit for the mean degradation line, $L(x)$, equal to the lower specification limit, $\tau_{\eta}$. $x_{L}$ is the smaller root of this equation.
    • It is not acceptable to determine the expiration dating period by simply finding where the fitted least-squares line intersects the specification limit (which would only provide a 50% confidence level).
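To make the "smaller root" statement concrete, here is a sketch that squares the equation $L(x) = \tau_{\eta}$ and solves the resulting quadratic directly. It reuses the three-batch data from the worked example earlier in this document with a limit of 90%; the coefficient names `A`, `B`, `C` and the other variable names are my own. It also prints the naive crossing of the fitted line for comparison.

```r
# Three-batch data from the worked example (potency, % label claim)
time    <- rep(c(0, 3, 6, 9, 12), times = 3)
potency <- c(100.2, 98.6, 97.1, 95.3, 94.1,
              99.9, 98.3, 96.9, 95.6, 93.8,
             100.0, 98.5, 97.0, 95.4, 94.2)
fit <- lm(potency ~ time)

a     <- unname(coef(fit)[1])          # intercept estimate
b     <- unname(coef(fit)[2])          # slope estimate
n_obs <- length(time)
mse   <- sum(resid(fit)^2) / (n_obs - 2)
x_bar <- mean(time)
sxx   <- sum((time - x_bar)^2)
eta   <- 90                            # lower specification limit
c0    <- qt(0.95, df = n_obs - 2)^2 * mse

# Squaring (a + b*x - eta) = t * S(x) gives A*x^2 + B*x + C = 0
A <- b^2 - c0 / sxx
B <- 2 * b * (a - eta) + 2 * c0 * x_bar / sxx
C <- (a - eta)^2 - c0 / n_obs - c0 * x_bar^2 / sxx
disc <- B^2 - 4 * A * C

# If A <= 0 the slope is not clearly different from zero and the
# confidence bound has no finite crossing; stop in that case.
stopifnot(A > 0, disc >= 0)

roots   <- (-B + c(-1, 1) * sqrt(disc)) / (2 * A)
x_L     <- min(roots)       # crossing of the LOWER confidence limit
x_naive <- (eta - a) / b    # crossing of the fitted line itself

print(c(x_L = x_L, x_naive = x_naive))
```

The larger root of the quadratic is where the upper confidence limit crosses the specification, which is why only the smaller root is used for a decreasing characteristic.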

Handling Multiple Batches

When multiple batches (a minimum of three) are tested, the approach depends on batch-to-batch variability:

  • Pooling Data: If analysis shows that the batch-to-batch variability is small (e.g., slopes and intercepts are sufficiently similar, sometimes assessed using a significance level of 0.25), the data from different batches may be combined into one overall estimate to establish a single, more precise shelf life.
  • Minimum Approach (Fixed Effects): If it is inappropriate to combine data due to significant batch-to-batch variability, the overall expiration dating period may be based on the minimum of the individual shelf lives estimated from each batch. This is considered a conservative estimate.
  • Random Batch Effects (Advanced Methods): For establishing a shelf life applicable to all future production batches, statistical methods incorporating random batch effects are used (e.g., Chow and Shao's approach or the HLC method). These methods include the between-batch variability when constructing the confidence limit for the mean degradation curve.
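As an illustration of the minimum approach in the second bullet, the sketch below fits each batch separately and takes the smallest of the individual shelf lives. It reuses the three-batch data from the worked example earlier in this document purely for demonstration (those batches were actually judged poolable); the helper name `batch_shelf_life` and the 12 to 36 month search interval are assumptions of mine.

```r
# Three-batch data from the worked example (potency, % label claim)
stab <- data.frame(
  batch   = factor(rep(1:3, each = 5)),
  time    = rep(c(0, 3, 6, 9, 12), times = 3),
  potency = c(100.2, 98.6, 97.1, 95.3, 94.1,
               99.9, 98.3, 96.9, 95.6, 93.8,
              100.0, 98.5, 97.0, 95.4, 94.2)
)

# Shelf life for ONE batch: time at which the one-sided 95% lower
# confidence limit of that batch's own regression line reaches eta.
batch_shelf_life <- function(d, eta = 90, level = 0.95) {
  fit <- lm(potency ~ time, data = d)
  gap <- function(x) {
    p <- predict(fit, newdata = data.frame(time = x), se.fit = TRUE)
    unname(p$fit - qt(level, df = fit$df.residual) * p$se.fit) - eta
  }
  uniroot(gap, interval = c(12, 36))$root  # assumed search bracket
}

# One estimate per batch, then the conservative overall value
per_batch <- sapply(split(stab, stab$batch), batch_shelf_life)
overall   <- min(per_batch)

print(per_batch)
print(overall)
```

Each single-batch fit has only 3 degrees of freedom here, so its confidence limit is wider than the pooled one. That is part of why the minimum approach tends to give a shorter shelf life.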

Tentative Shelf Life (Accelerated Testing)

Accelerated stability testing (or stress testing) is used primarily to predict a tentative expiration dating period in a shorter timeframe by increasing the rate of chemical or physical degradation under exaggerated conditions.

The prediction relies on kinetic models:

  1. Reaction Order: The analysis involves empirically determining the order of the reaction (e.g., zero-order for linear degradation or first-order for logarithmic degradation).
  2. Arrhenius Equation: The relationship between the degradation rate and temperature is characterized using the Arrhenius equation.
  3. Extrapolation: The tentative shelf life is obtained by extrapolating the relationship to ambient (marketing) storage conditions.
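The three steps above can be sketched as follows. The temperatures and rate constants are hypothetical numbers invented only to show the mechanics, and the sketch assumes a zero-order reaction starting at 100% with a 90% limit. It gives a point estimate without any confidence limit, so it is an illustration rather than a regulatory calculation.

```r
# HYPOTHETICAL zero-order rate constants (% label claim lost per month)
# measured at three elevated temperatures; not real study data.
temp_c <- c(40, 50, 60)
k_obs  <- c(2.5, 6.7, 17.0)

inv_temp <- 1 / (temp_c + 273.15)  # 1/T with T in kelvin

# Arrhenius equation: log(k) = log(A) - Ea / (R * T), linear in 1/T
arr_fit <- lm(log(k_obs) ~ inv_temp)

gas_const <- 8.314                 # J / (mol K)
ea_kj <- -unname(coef(arr_fit)["inv_temp"]) * gas_const / 1000

# Extrapolate the rate constant to an assumed storage temperature of 25 C
new_point <- data.frame(inv_temp = 1 / (25 + 273.15))
k_25 <- unname(exp(predict(arr_fit, newdata = new_point)))

# Zero-order kinetics: potency(t) = 100 - k * t, so the time to reach 90 is
tentative_shelf_life <- (100 - 90) / k_25

print(c(Ea_kJ_per_mol = ea_kj, k_25 = k_25,
        shelf_life_months = tentative_shelf_life))
```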

"lower.tail" confusion in R.

 "lower.tail" in R 

I often get confused about how to use the lower.tail argument in the "pt" function and similar functions. Here I will focus on the t-distribution, and I will use Minitab for the graphical presentation.

A:- lower.tail is FALSE 

    In R

The code is 
> pt(q = -2.262, df = 9, lower.tail = FALSE)
The output 
[1] 0.9749936

    In Minitab

This is shown as in the graph from Minitab.



So when FALSE is chosen, R returns the area to the right of the given value q, that is P(T > q).

B:- "lower.tail" is TRUE

    In R 

The code is 
> pt(q = -2.262, df = 9, lower.tail = TRUE)
The output 
[1] 0.02500642

    In Minitab


When TRUE is used (this is the default), R computes the area to the left of the given value q, that is P(T <= q).
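A small check that helped me keep the two cases straight: the two areas are complements, so they add up to 1, and by symmetry of the t-distribution the upper tail at q equals the lower tail at -q. The variable names below are just for illustration.

```r
q   <- -2.262
dof <- 9

below <- pt(q, df = dof, lower.tail = TRUE)   # area to the left:  P(T <= q)
above <- pt(q, df = dof, lower.tail = FALSE)  # area to the right: P(T > q)

# The two areas are complements of each other
print(below + above)
print(all.equal(above, 1 - below))

# Symmetry: the area to the right of q equals the area to the left of -q
print(all.equal(above, pt(-q, df = dof)))
```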

How to compute the probability between two values using t-distribution in R.

To compute the probability "P" between two cut-offs (two points) of the t-distribution in R.

The example uses d.f. = 9 (i.e. n = 10), with the first quantile at -2.262 and the second quantile at 2.262.

I used Minitab to give a graphical representation of that as below:-



The R code to use is as follows:

pt(q = 2.262, df = 9, lower.tail = TRUE) - pt(q = -2.262, df = 9, lower.tail = TRUE)

I recommend playing a little with the code above to see how the "lower.tail" argument behaves when it is TRUE and when it is FALSE.

The output from R is :-

[1] 0.9499872

When rounded, this is 0.95, the same value Minitab gives.
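The cut-off 2.262 itself is the upper 2.5% point of the t-distribution with 9 degrees of freedom, so it can be obtained from `qt` rather than typed in. The sketch below uses illustrative variable names and computes the same central area in two equivalent ways.

```r
dof <- 9

# Upper 2.5% point of the t-distribution with 9 d.f. (about 2.262)
crit <- qt(0.975, df = dof)

# Area between -crit and +crit as a difference of two lower-tail areas
central <- pt(crit, df = dof) - pt(-crit, df = dof)

# Same area using symmetry: 1 minus the two equal tails
central_sym <- 1 - 2 * pt(-crit, df = dof)

print(crit)
print(central)
print(all.equal(central, central_sym))
```

Using the unrounded quantile from `qt` removes the small rounding gap seen with the hand-typed 2.262.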

P.S. 

d.f.: degrees of freedom.
