daspi.statistics.confidence
Confidence interval functions for common statistical measures.
This module provides two-sided confidence interval calculations for a
range of statistics, covering both single-sample and two-sample
scenarios. All interval functions return a three-element tuple in the
form (point_estimate, lower_bound, upper_bound), making them
straightforward to use in tables and plots.
Available functions
Single-sample intervals:
- mean_ci – confidence interval for the sample mean (t-distribution)
- median_ci – confidence interval for the sample median
- variance_ci – confidence interval for the variance (χ²-distribution)
- stdev_ci – confidence interval for the standard deviation
- proportion_ci – confidence interval for a binomial proportion

Process-capability intervals:
- cp_ci – confidence interval for the Cp process-capability index
- cpk_ci – confidence interval for the Cpk process-capability index

Two-sample / difference intervals:
- delta_mean_ci – confidence interval for the difference of two means
- delta_variance_ci – confidence interval for the ratio of two variances
- delta_stdev_ci – confidence interval for the ratio of two standard deviations
- delta_proportions_ci – confidence interval for the difference of two proportions

Regression / model intervals:
- fit_ci – confidence band around a fitted OLS regression line
- prediction_ci – prediction band for individual future observations

Helpers / utilities:
- sem – standard error of the mean
- bonferroni_ci – group-wise confidence intervals with Bonferroni correction
- confidence_to_alpha – convert a confidence level to the corresponding α
Notes
The Bonferroni correction adjusts each individual confidence level so that the family-wise error rate across n simultaneous intervals does not exceed the nominal α.
References
Comprehensive confidence intervals for Python developers: https://aegis4048.github.io/comprehensive_confidence_intervals_for_python_developers
mean_ci(sample, level=0.95, n_groups=1)
Two-sided confidence interval for the mean of the data.
| PARAMETER | DESCRIPTION |
|---|---|
| sample | A one-dimensional array-like object containing the samples. |
| level | Confidence level, by default 0.95. |
| n_groups | Number of groups for the Bonferroni correction. The alpha risk within each group is adjusted so that the total risk is not exceeded. By default 1. |

| RETURNS | DESCRIPTION |
|---|---|
| x_bar | Sample mean, the point estimate of the expected value. |
| lower | Lower confidence limit. |
| upper | Upper confidence limit. |
Notes
The underlying t.interval function assumes that the standardized sample
mean follows a t-distribution, which holds when the data are approximately
normally distributed. Additionally, this method assumes that the sample is
representative of the population and that the data are independent
and identically distributed.
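The t-based interval described in the note can be sketched directly with SciPy. This is an illustration only, not the daspi implementation: the helper name `t_mean_interval` is hypothetical, and dividing the alpha risk by `n_groups` is an assumption about how the Bonferroni adjustment enters the calculation.

```python
import numpy as np
from scipy import stats


def t_mean_interval(sample, level=0.95, n_groups=1):
    """Hypothetical helper: t-based two-sided interval for the mean."""
    x = np.asarray(sample, dtype=float)
    # Assumption: Bonferroni splits the total alpha risk evenly over the groups.
    alpha = (1.0 - level) / n_groups
    x_bar = x.mean()
    # Standard error of the mean with the unbiased (ddof=1) standard deviation.
    se = x.std(ddof=1) / np.sqrt(x.size)
    lower, upper = stats.t.interval(1.0 - alpha, x.size - 1, loc=x_bar, scale=se)
    return float(x_bar), float(lower), float(upper)


data = [9.8, 10.1, 10.4, 9.9, 10.3, 10.0]
x_bar, lower, upper = t_mean_interval(data)
# With more groups, each interval uses a smaller alpha and becomes wider.
_, lower3, upper3 = t_mean_interval(data, n_groups=3)
```

The return order (point estimate, lower, upper) mirrors the tuple layout documented for this module.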
median_ci(sample, level=0.95, n_groups=1)
Two-sided confidence interval for the median of the data.
| PARAMETER | DESCRIPTION |
|---|---|
| sample | A one-dimensional array-like object containing the samples. |
| level | Confidence level, by default 0.95. |
| n_groups | Number of groups for the Bonferroni correction. The alpha risk within each group is adjusted so that the total risk is not exceeded. By default 1. |

| RETURNS | DESCRIPTION |
|---|---|
| median | Median of the data. |
| lower | Lower confidence limit. |
| upper | Upper confidence limit. |
Notes
The underlying t.interval function assumes that the data follows a
t-distribution. Additionally, this method assumes that the sample is
representative of the population and that the data is independent
and identically distributed.
variance_ci(sample, level=0.95, n_groups=1)
Two-sided confidence interval for the variance of the data.
| PARAMETER | DESCRIPTION |
|---|---|
| sample | A one-dimensional array-like object containing the samples. |
| level | Confidence level, by default 0.95. |
| n_groups | Number of groups for the Bonferroni correction. The alpha risk within each group is adjusted so that the total risk is not exceeded. By default 1. |

| RETURNS | DESCRIPTION |
|---|---|
| s2 | Variance of the data. |
| lower | Lower confidence limit. |
| upper | Upper confidence limit. |
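The module overview states that the variance interval is based on the χ²-distribution. A minimal sketch of that textbook construction follows; the helper name `chi2_variance_interval` is hypothetical, the Bonferroni adjustment is left out for brevity, and the daspi source may differ in detail.

```python
import numpy as np
from scipy import stats


def chi2_variance_interval(sample, level=0.95):
    """Hypothetical helper: chi-squared two-sided interval for the variance."""
    x = np.asarray(sample, dtype=float)
    dof = x.size - 1
    s2 = x.var(ddof=1)  # unbiased sample variance
    alpha = 1.0 - level
    # Dividing by the upper chi-squared quantile gives the lower limit,
    # dividing by the lower quantile gives the upper limit.
    lower = dof * s2 / stats.chi2.ppf(1.0 - alpha / 2.0, dof)
    upper = dof * s2 / stats.chi2.ppf(alpha / 2.0, dof)
    return float(s2), float(lower), float(upper)


data = [9.8, 10.1, 10.4, 9.9, 10.3, 10.0]
s2, lower, upper = chi2_variance_interval(data)
# Square roots of all three values give the matching interval
# for the standard deviation.
s, s_lower, s_upper = (float(np.sqrt(v)) for v in (s2, lower, upper))
```

Because the χ²-distribution is skewed, the interval is not symmetric around the point estimate.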
stdev_ci(sample, level=0.95, n_groups=1)
Two-sided confidence interval for the standard deviation of the data.
| PARAMETER | DESCRIPTION |
|---|---|
| sample | A one-dimensional array-like object containing the samples. |
| level | Confidence level, by default 0.95. |
| n_groups | Number of groups for the Bonferroni correction. The alpha risk within each group is adjusted so that the total risk is not exceeded. By default 1. |

| RETURNS | DESCRIPTION |
|---|---|
| s | Standard deviation of the data. |
| lower | Lower confidence limit. |
| upper | Upper confidence limit. |
proportion_ci(events, observations, level=0.95, n_groups=1)
Confidence interval for a binomial proportion using an asymptotic normal approximation.
| PARAMETER | DESCRIPTION |
|---|---|
| events | Counted number of events. |
| observations | Total number of observations. |
| level | Confidence level, by default 0.95. |
| n_groups | Number of groups for the Bonferroni correction. The alpha risk within each group is adjusted so that the total risk is not exceeded. By default 1. |

| RETURNS | DESCRIPTION |
|---|---|
| portion | Proportion as the ratio events/observations. |
| lower, upper | The lower and upper confidence limits, with coverage approximately equal to the confidence level. TYPE: float |
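The asymptotic normal approximation mentioned above is commonly known as the Wald interval. The sketch below uses only the standard library; the helper name `wald_proportion_interval` is hypothetical, and the Bonferroni adjustment is omitted.

```python
from math import sqrt
from statistics import NormalDist


def wald_proportion_interval(events, observations, level=0.95):
    """Hypothetical helper: normal-approximation interval for a proportion."""
    p_hat = events / observations
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    margin = z * sqrt(p_hat * (1.0 - p_hat) / observations)
    return p_hat, p_hat - margin, p_hat + margin


portion, lower, upper = wald_proportion_interval(events=50, observations=100)
print(f"{portion:.3f} [{lower:.3f}, {upper:.3f}]")  # 0.500 [0.402, 0.598]
```

The approximation degrades for small samples and for proportions close to 0 or 1, where the computed limits can even fall outside the range [0, 1].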
bonferroni_ci(data, target, feature, level=0.95, ci_func=stdev_ci, n_groups=None, name='midpoint')
Calculate group-wise confidence intervals after a Bonferroni correction. The Bonferroni correction is a method to adjust the significance level alpha.
| PARAMETER | DESCRIPTION |
|---|---|
| data | Data frame containing the sample and feature data. |
| target | Name of the column holding the target sample data. |
| feature | Name of the categorical feature. The confidence intervals are calculated separately for each of its groups. |
| level | Confidence level, by default 0.95. |
| ci_func | Function used to calculate the confidence interval. It must return its values in the order midpoint, lower limit, upper limit. |
| n_groups | Number of groups for the Bonferroni correction. The alpha risk within each group is adjusted so that the total risk is not exceeded. If None, the number is taken from the given groups (ngroups attribute of the groupby object). By default None. |
| name | Name of the midpoints, by default 'midpoint'. |

| RETURNS | DESCRIPTION |
|---|---|
| data | Data containing the groups, midpoints and confidence limits. |
Notes
The Bonferroni correction is needed whenever several tests are carried out together. In that case the probability of at least one type I error across all tests is no longer 5% (or 1%) but considerably higher. The risk of obtaining at least one significant result, even though there is no effect at all, therefore grows with the number of tests. This is also referred to as alpha error accumulation or alpha inflation.
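The size of this alpha inflation is easy to quantify. The short calculation below assumes independent tests, which is a simplification, and the function name `familywise_error_rate` is chosen here for illustration only.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


alpha, n_tests = 0.05, 10
uncorrected = familywise_error_rate(alpha, n_tests)
# Bonferroni: run each individual test at alpha / n_tests instead.
corrected = familywise_error_rate(alpha / n_tests, n_tests)
print(f"uncorrected: {uncorrected:.3f}")  # uncorrected: 0.401
print(f"Bonferroni:  {corrected:.3f}")    # Bonferroni:  0.049
```

With ten tests at 5% each, the chance of at least one false positive is about 40%; after the correction it stays just below the nominal 5%.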
delta_mean_ci(sample1, sample2, level=0.95)
Two-sided confidence interval for the difference in means of two independent variables.
| PARAMETER | DESCRIPTION |
|---|---|
| sample1 | A one-dimensional array-like object containing the first sample. |
| sample2 | A one-dimensional array-like object containing the second sample. |
| level | Confidence level between 0 and 1, by default 0.95. |

| RETURNS | DESCRIPTION |
|---|---|
| delta | Difference of the means of the data. |
| lower | Lower confidence limit. |
| upper | Upper confidence limit. |
delta_variance_ci(sample1, sample2, level=0.95)
Two-sided confidence interval for the variance difference of two independent variables.
| PARAMETER | DESCRIPTION |
|---|---|
| sample1 | A one-dimensional array-like object containing the first sample. |
| sample2 | A one-dimensional array-like object containing the second sample. |
| level | Confidence level between 0 and 1, by default 0.95. |

| RETURNS | DESCRIPTION |
|---|---|
| delta | Difference of the variances of the data. |
| lower | Lower confidence limit. |
| upper | Upper confidence limit. |
Notes
This function was generated with ChatGPT, so its correctness is not guaranteed.
delta_proportions_ci(events1, observations1, events2, observations2, level=0.95)
Confidence interval for comparing two independent proportions. This assumes that we have two independent binomial samples.
| PARAMETER | DESCRIPTION |
|---|---|
| events1 | Counted number of events of sample 1. |
| observations1 | Total number of observations of sample 1. |
| events2 | Counted number of events of sample 2. |
| observations2 | Total number of observations of sample 2. |
| level | Confidence level, by default 0.95. |

| RETURNS | DESCRIPTION |
|---|---|
| delta | Difference of the two proportions. |
| lower | Lower confidence limit. |
| upper | Upper confidence limit. |
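One common way to build an interval for the difference of two independent binomial proportions is the unpooled normal (Wald) approximation, sketched below. The documentation does not say which method daspi uses, so both the method and the sign convention (sample 1 minus sample 2) are assumptions, and the helper name `wald_delta_proportions_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_delta_proportions_interval(events1, observations1,
                                    events2, observations2, level=0.95):
    """Hypothetical helper: Wald interval for p1 - p2."""
    p1 = events1 / observations1
    p2 = events2 / observations2
    # Assumption: the difference is taken as sample 1 minus sample 2.
    delta = p1 - p2
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    # Unpooled standard error: the variances of both proportions add up.
    se = sqrt(p1 * (1.0 - p1) / observations1
              + p2 * (1.0 - p2) / observations2)
    return delta, delta - z * se, delta + z * se


delta, lower, upper = wald_delta_proportions_interval(45, 100, 30, 100)
# An interval that excludes zero suggests the proportions differ.
excludes_zero = lower > 0.0 or upper < 0.0
```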
fit_ci(model, level=0.95)
Calculate the confidence interval of the fitted line. Applies to fitted WLS and OLS models, not to general GLS.
| PARAMETER | DESCRIPTION |
|---|---|
| model | Fitted OLS or WLS model. |
| level | Confidence level, by default 0.95. |

| RETURNS | DESCRIPTION |
|---|---|
| fitted | Fitted target values as a one-dimensional numpy array, returned first for consistency with the other functions. |
| lower | Lower confidence limits of the fitted line as a one-dimensional numpy array. |
| upper | Upper confidence limits of the fitted line as a one-dimensional numpy array. |
Notes
Using hat_matrix to calculate fit_se only works for the fitted values.
This function is based on the summary_table function from the statsmodels.stats.outliers_influence module, see: https://www.statsmodels.org/dev/_modules/statsmodels/stats/outliers_influence.html
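The role of the hat matrix can be illustrated with plain NumPy. The sketch below is not the daspi or statsmodels code; it fits a simple line to synthetic data and derives the standard error of the fitted values from the leverage (the diagonal of the hat matrix). All variable names and the data are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 20)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)

# Design matrix with an intercept column, as in a simple OLS fit.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta

# Leverage values: the diagonal of the hat matrix H = X (X'X)^-1 X'.
leverage = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)

# Residual variance with n - p degrees of freedom.
n_obs, n_params = X.shape
mse = np.sum((y - fitted) ** 2) / (n_obs - n_params)

# Standard error of the fitted line and of a single new observation.
fit_se = np.sqrt(mse * leverage)
pred_se = np.sqrt(mse * (1.0 + leverage))
```

The hat matrix is defined only for the rows of the design matrix used in the fit, which is why this route yields standard errors for the fitted values and not for arbitrary new points. Multiplying `fit_se` or `pred_se` by a t quantile with n - p degrees of freedom gives the half-width of the confidence band or prediction band, respectively.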
prediction_ci(model, level=0.95)
Calculate the prediction interval for individual observations, which can also be used to spot outliers. Applies to fitted WLS and OLS models, not to general GLS.
| PARAMETER | DESCRIPTION |
|---|---|
| model | Fitted OLS or WLS model. |
| level | Confidence level, by default 0.95. |

| RETURNS | DESCRIPTION |
|---|---|
| fitted | Fitted target values as a one-dimensional numpy array, returned first for consistency with the other functions. |
| lower | Lower prediction limits as a one-dimensional numpy array. |
| upper | Upper prediction limits as a one-dimensional numpy array. |
confidence_to_alpha(confidence_level, two_sided=True, n_groups=1)
Calculate the significance level (alpha risk) for a given confidence level.
| PARAMETER | DESCRIPTION |
|---|---|
| confidence_level | Level of the confidence interval. |
| two_sided | True if alpha is to be calculated for a two-sided confidence interval, by default True. |
| n_groups | Number of groups for the Bonferroni correction. The alpha risk within each group is adjusted so that the total risk is not exceeded. By default 1. |

| RETURNS | DESCRIPTION |
|---|---|
| alpha | Significance level as alpha risk. |
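The conversion can be sketched in a few lines. This is a hypothetical re-implementation under stated assumptions, not the daspi source: it assumes that a two-sided interval places half of the risk in each tail and that the Bonferroni adjustment divides the remaining risk by `n_groups`.

```python
def alpha_from_confidence(confidence_level, two_sided=True, n_groups=1):
    """Hypothetical re-implementation, not the daspi source."""
    alpha = 1.0 - confidence_level
    if two_sided:
        # Assumption: a two-sided interval puts half of the risk in each tail.
        alpha /= 2.0
    # Bonferroni: share the remaining risk equally among the groups.
    return alpha / n_groups


print(round(alpha_from_confidence(0.95), 6))                   # 0.025
print(round(alpha_from_confidence(0.95, two_sided=False), 6))  # 0.05
print(round(alpha_from_confidence(0.95, n_groups=5), 6))       # 0.005
```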