`daspi.statistics.confidence` ¶

Confidence interval functions for common statistical measures.

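This module provides two-sided confidence interval calculations for a range of statistics, covering both single-sample and two-sample scenarios. The interval functions return a three-element tuple in the form (point_estimate, lower_bound, upper_bound), making them straightforward to use in tables and plots. The exceptions are bonferroni_ci, which returns a DataFrame containing these values for each group, and confidence_to_alpha, which returns a single float.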

Available functions

Single-sample intervals:

mean_ci – confidence interval for the sample mean (t-distribution)
median_ci – confidence interval for the sample median
variance_ci – confidence interval for the variance (χ²-distribution)
stdev_ci – confidence interval for the standard deviation
proportion_ci – confidence interval for a binomial proportion

Process-capability intervals:

cp_ci – confidence interval for the Cp process-capability index
cpk_ci – confidence interval for the Cpk process-capability index

Two-sample / difference intervals:

delta_mean_ci – confidence interval for the difference of two means
delta_variance_ci – confidence interval for the ratio of two variances
delta_stdev_ci – confidence interval for the ratio of two standard deviations
delta_proportions_ci – confidence interval for the difference of two proportions

Regression / model intervals:

fit_ci – confidence band around a fitted OLS regression line
prediction_ci – prediction band for individual future observations

Helpers / utilities:

sem – standard error of the mean
bonferroni_ci – group-wise confidence intervals with Bonferroni correction
confidence_to_alpha – convert a confidence level to the corresponding α

Notes

The Bonferroni correction adjusts each individual confidence level so that the family-wise error rate across n simultaneous intervals does not exceed the nominal α.

References

Comprehensive confidence intervals for Python developers: https://aegis4048.github.io/comprehensive_confidence_intervals_for_python_developers

`mean_ci(sample, level=0.95, n_groups=1)` ¶

Two-sided confidence interval for the mean of the data.

PARAMETER	DESCRIPTION
`sample`	A one-dimensional array-like object containing the samples. TYPE: `NumericSample1D`
`level`	Confidence level, by default 0.95. TYPE: `float in (0, 1)` DEFAULT: `0.95`
`n_groups`	Number of groups for the Bonferroni method. The alpha risk is adjusted within each group so that the total risk is not exceeded, by default 1. TYPE: `int` DEFAULT: `1`

RETURNS	DESCRIPTION
`x_bar`	Sample mean, used as the estimate of the expected value. TYPE: `float`
`lower`	Lower confidence limit. TYPE: `float`
`upper`	Upper confidence limit. TYPE: `float`

Notes

The underlying t.interval function is based on the t-distribution, which is exact when the data are normally distributed and approximately valid for larger samples. Additionally, this method assumes that the sample is representative of the population and that the data are independent and identically distributed.
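The t-based interval described in the note can be sketched as follows. This is a minimal illustration and not the daspi implementation: the helper name `mean_ci_sketch` is hypothetical, and dividing the total alpha by `n_groups` is an assumption about how the Bonferroni adjustment is applied.

```python
import numpy as np
from scipy import stats


def mean_ci_sketch(sample, level=0.95, n_groups=1):
    """Illustrative t-based interval for the mean (hypothetical helper)."""
    x = np.asarray(sample, dtype=float)
    # Assumption: the Bonferroni adjustment splits the total alpha evenly.
    alpha = (1.0 - level) / n_groups
    x_bar = x.mean()
    # Standard error of the mean, using the sample standard deviation.
    se = x.std(ddof=1) / np.sqrt(x.size)
    lower, upper = stats.t.interval(1.0 - alpha, x.size - 1, loc=x_bar, scale=se)
    return float(x_bar), float(lower), float(upper)


sample = [9.8, 10.1, 10.0, 10.3, 9.9, 10.2, 10.0, 9.7]
x_bar, lower, upper = mean_ci_sketch(sample)
# With three groups each interval gets a smaller alpha and becomes wider.
wide = mean_ci_sketch(sample, n_groups=3)
print(x_bar, lower, upper)
print(wide)
```

The interval is symmetric around the sample mean, and raising `n_groups` widens it because each group is given a smaller share of the alpha risk.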

`median_ci(sample, level=0.95, n_groups=1)` ¶

Two-sided confidence interval for the median of the data.

PARAMETER	DESCRIPTION
`sample`	A one-dimensional array-like object containing the samples. TYPE: `NumericSample1D`
`level`	Confidence level, by default 0.95. TYPE: `float in (0, 1)` DEFAULT: `0.95`
`n_groups`	Number of groups for the Bonferroni method. The alpha risk is adjusted within each group so that the total risk is not exceeded, by default 1. TYPE: `int` DEFAULT: `1`

RETURNS	DESCRIPTION
`median`	Median of the data. TYPE: `float`
`lower`	Lower confidence limit. TYPE: `float`
`upper`	Upper confidence limit. TYPE: `float`

Notes

The underlying t.interval function is based on the t-distribution, which presupposes approximately normally distributed data. Additionally, this method assumes that the sample is representative of the population and that the data are independent and identically distributed.

`variance_ci(sample, level=0.95, n_groups=1)` ¶

Two-sided confidence interval for the variance of the data.

PARAMETER	DESCRIPTION
`sample`	A one-dimensional array-like object containing the samples. TYPE: `NumericSample1D`
`level`	Confidence level, by default 0.95. TYPE: `float in (0, 1)` DEFAULT: `0.95`
`n_groups`	Number of groups for the Bonferroni method. The alpha risk is adjusted within each group so that the total risk is not exceeded, by default 1. TYPE: `int` DEFAULT: `1`

RETURNS	DESCRIPTION
`s2`	Variance of the data. TYPE: `float`
`lower`	Lower confidence limit. TYPE: `float`
`upper`	Upper confidence limit. TYPE: `float`
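The module overview states that the variance interval is based on the χ²-distribution. A minimal sketch of the textbook form of that interval is given below, assuming normally distributed data. The helper name `variance_ci_sketch` is hypothetical and the Bonferroni handling via `n_groups` is left out for brevity, so this is not necessarily what daspi does internally.

```python
import numpy as np
from scipy import stats


def variance_ci_sketch(sample, level=0.95):
    """Illustrative chi-square interval for the variance (hypothetical helper)."""
    x = np.asarray(sample, dtype=float)
    dof = x.size - 1
    s2 = x.var(ddof=1)  # unbiased sample variance
    alpha = 1.0 - level
    # The upper chi-square quantile yields the lower limit and vice versa.
    lower = dof * s2 / stats.chi2.ppf(1.0 - alpha / 2.0, dof)
    upper = dof * s2 / stats.chi2.ppf(alpha / 2.0, dof)
    return float(s2), float(lower), float(upper)


sample = [9.8, 10.1, 10.0, 10.3, 9.9, 10.2, 10.0, 9.7]
s2, lower, upper = variance_ci_sketch(sample)
# Taking square roots gives the matching interval for the standard deviation.
s, s_lower, s_upper = (float(np.sqrt(v)) for v in (s2, lower, upper))
print(s2, lower, upper)
print(s, s_lower, s_upper)
```

Unlike the interval for the mean, this interval is not symmetric around the point estimate, because the χ²-distribution is skewed.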

`stdev_ci(sample, level=0.95, n_groups=1)` ¶

Two-sided confidence interval for the standard deviation of the data.

PARAMETER	DESCRIPTION
`sample`	A one-dimensional array-like object containing the samples. TYPE: `NumericSample1D`
`level`	Confidence level, by default 0.95. TYPE: `float in (0, 1)` DEFAULT: `0.95`
`n_groups`	Number of groups for the Bonferroni method. The alpha risk is adjusted within each group so that the total risk is not exceeded, by default 1. TYPE: `int` DEFAULT: `1`

RETURNS	DESCRIPTION
`s`	Standard deviation of the data. TYPE: `float`
`lower`	Lower confidence limit. TYPE: `float`
`upper`	Upper confidence limit. TYPE: `float`

`proportion_ci(events, observations, level=0.95, n_groups=1)` ¶

Confidence interval for a binomial proportion using an asymptotic normal approximation.

PARAMETER	DESCRIPTION
`events`	Counted number of events. TYPE: `int`
`observations`	Total number of observations. TYPE: `int`
`level`	Confidence level, by default 0.95. TYPE: `float in (0, 1)` DEFAULT: `0.95`
`n_groups`	Number of groups for the Bonferroni method. The alpha risk is adjusted within each group so that the total risk is not exceeded, by default 1. TYPE: `int` DEFAULT: `1`

RETURNS	DESCRIPTION
`portion`	Proportion as the ratio events/observations. TYPE: `float`
`lower`	Lower confidence limit, with coverage approximately equal to `level`. TYPE: `float`
`upper`	Upper confidence limit, with coverage approximately equal to `level`. TYPE: `float`
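The asymptotic normal approximation mentioned above is commonly known as the Wald interval. The sketch below shows that formula using only the standard library. The helper name `proportion_ci_sketch` is hypothetical, and clipping the limits to [0, 1] is a choice made for this illustration, not something the source confirms for daspi.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci_sketch(events, observations, level=0.95):
    """Illustrative Wald interval for a binomial proportion (hypothetical helper)."""
    p = events / observations
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = z * sqrt(p * (1.0 - p) / observations)
    # Clipping to [0, 1] is an assumption of this sketch.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


portion, lower, upper = proportion_ci_sketch(events=12, observations=80)
print(portion, lower, upper)
```

The approximation tends to be unreliable for small samples or for proportions close to 0 or 1, where the normal curve is a poor stand-in for the binomial distribution.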

`bonferroni_ci(data, target, feature, level=0.95, ci_func=stdev_ci, n_groups=None, name='midpoint')` ¶

Calculate confidence intervals after Bonferroni correction. The Bonferroni correction is a method to adjust the significance level alpha.

PARAMETER	DESCRIPTION
`data`	Data frame containing the sample and feature data. TYPE: `DataFrame`
`target`	Name of the column containing the target sample data. TYPE: `str`
`feature`	Name of the categorical feature, or a list of such names. The confidence intervals are calculated separately for each of these groups. TYPE: `str | List[str]`
`level`	Confidence level, by default 0.95. TYPE: `float in (0, 1)` DEFAULT: `0.95`
`ci_func`	Function used to calculate the confidence interval. It must return the values in the order: midpoint, lower limit, upper limit. TYPE: `(mean_ci, stdev_ci, variance_ci)` DEFAULT: `stdev_ci`
`n_groups`	Number of groups for the Bonferroni correction. The alpha risk is adjusted within each group so that the total risk is not exceeded. If None is given, the number is derived from the given groups (ngroups attribute of the groupby object), by default None. TYPE: `int` DEFAULT: `None`
`name`	Name of the midpoints, by default 'midpoint'. TYPE: `str` DEFAULT: `'midpoint'`

RETURNS	DESCRIPTION
`data`	Data containing the groups, midpoints and confidence limits. TYPE: `DataFrame`

Notes

The Bonferroni correction is needed whenever several tests are carried out together (multiple testing). In that case, the probability of a type I error across all tests is no longer 5% (or 1%), but considerably higher. In other words, the risk of obtaining at least one significant result even though there is no effect at all increases with the number of tests. This is also referred to as alpha error accumulation or alpha inflation.
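The alpha inflation described here can be made concrete with a short calculation. The sketch below assumes independent tests and uses two hypothetical helper functions; it only illustrates the arithmetic behind the correction and is not part of daspi.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one type I error in n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test alpha under the Bonferroni correction."""
    return alpha / n_tests


alpha = 0.05
n_tests = 4

uncorrected = familywise_error_rate(alpha, n_tests)
per_test = bonferroni_alpha(alpha, n_tests)
corrected = familywise_error_rate(per_test, n_tests)

print(f"uncorrected family-wise error rate: {uncorrected:.4f}")
print(f"per-test alpha after correction: {per_test:.4f}")
print(f"corrected family-wise error rate: {corrected:.4f}")
```

With four tests at alpha = 0.05, the uncorrected family-wise error rate is about 18.5%, whereas testing each at alpha / 4 keeps it just below the nominal 5%.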

`delta_mean_ci(sample1, sample2, level=0.95)` ¶

Two-sided confidence interval for the difference in means of two independent samples.

PARAMETER	DESCRIPTION
`sample1`	A one-dimensional array-like object containing the first sample. TYPE: `NumericSample1D`
`sample2`	A one-dimensional array-like object containing the second sample. TYPE: `NumericSample1D`
`level`	Confidence level between 0 and 1, by default 0.95. TYPE: `float in (0, 1)` DEFAULT: `0.95`

RETURNS	DESCRIPTION
`delta`	Difference of the means of the data. TYPE: `float`
`lower`	Lower confidence limit. TYPE: `float`
`upper`	Upper confidence limit. TYPE: `float`

`delta_variance_ci(sample1, sample2, level=0.95)` ¶

Two-sided confidence interval for the variance difference of two independent samples.

PARAMETER	DESCRIPTION
`sample1`	A one-dimensional array-like object containing the first sample. TYPE: `NumericSample1D`
`sample2`	A one-dimensional array-like object containing the second sample. TYPE: `NumericSample1D`
`level`	Confidence level between 0 and 1, by default 0.95. TYPE: `float in (0, 1)` DEFAULT: `0.95`

RETURNS	DESCRIPTION
`delta`	Difference of the variances of the data. TYPE: `float`
`lower`	Lower confidence limit. TYPE: `float`
`upper`	Upper confidence limit. TYPE: `float`

Notes

This function is based on a solution generated by ChatGPT, so its correctness is not guaranteed.

`delta_proportions_ci(events1, observations1, events2, observations2, level=0.95)` ¶

Confidence interval for comparing two independent proportions. This assumes that we have two independent binomial samples.

PARAMETER	DESCRIPTION
`events1`	Counted number of events of sample 1. TYPE: `int`
`observations1`	Total number of observations of sample 1. TYPE: `int`
`events2`	Counted number of events of sample 2. TYPE: `int`
`observations2`	Total number of observations of sample 2. TYPE: `int`
`level`	Confidence level, by default 0.95. TYPE: `float in (0, 1)` DEFAULT: `0.95`

RETURNS	DESCRIPTION
`delta`	Difference of the two proportions. TYPE: `float`
`lower`	Lower confidence limit. TYPE: `float`
`upper`	Upper confidence limit. TYPE: `float`
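The source does not state which method is used for the difference of two proportions. As an illustration of the general idea only, the sketch below uses the simple Wald interval for two independent binomial samples. The helper name `delta_proportions_ci_sketch` is hypothetical and the actual daspi function may rely on a different, more accurate method.

```python
from math import sqrt
from statistics import NormalDist


def delta_proportions_ci_sketch(events1, observations1,
                                events2, observations2, level=0.95):
    """Illustrative Wald interval for p1 - p2 (hypothetical helper)."""
    p1 = events1 / observations1
    p2 = events2 / observations2
    delta = p1 - p2
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    # Independent samples: the variances of the two estimates add up.
    se = sqrt(p1 * (1.0 - p1) / observations1
              + p2 * (1.0 - p2) / observations2)
    return delta, delta - z * se, delta + z * se


delta, lower, upper = delta_proportions_ci_sketch(30, 100, 18, 100)
print(delta, lower, upper)
```

If the resulting interval excludes zero, the two proportions differ at the chosen confidence level under this approximation.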

`fit_ci(model, level=0.95)` ¶

Calculate the confidence interval of the fitted line. Applies to fitted WLS and OLS models, not to general GLS.

PARAMETER	DESCRIPTION
`model`	Fitted OLS or WLS model. TYPE: `statsmodels RegressionResults`
`level`	Confidence level, by default 0.95. TYPE: `float in (0, 1)` DEFAULT: `0.95`

RETURNS	DESCRIPTION
`fitted`	For coherence with the other functions, the fitted target values are returned as a one-dimensional numpy array. TYPE: `NDArray`
`lower`	Lower confidence limits of the fitted line as a one-dimensional numpy array. TYPE: `NDArray`
`upper`	Upper confidence limits of the fitted line as a one-dimensional numpy array. TYPE: `NDArray`

Notes

Using the hat matrix to calculate fit_se only works for the fitted values, i.e. for the observations the model was fitted on.

This function is based on the summary_table function from the statsmodels.stats.outliers_influence module, see: https://www.statsmodels.org/dev/_modules/statsmodels/stats/outliers_influence.html
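The hat-matrix approach mentioned in the notes can be sketched with plain NumPy for an ordinary least squares fit. This is an illustration under stated assumptions and not the daspi or statsmodels code: the data are synthetic, all variable names are hypothetical, and the prediction band is included only to show how it differs from the confidence band of the fitted line.

```python
import numpy as np
from scipy import stats

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 25)
y = 1.5 + 0.8 * x + rng.normal(scale=0.5, size=x.size)

# Ordinary least squares fit with an intercept column.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
resid = y - fitted
dof = X.shape[0] - X.shape[1]
mse = resid @ resid / dof  # residual mean square

# Diagonal of the hat matrix H = X (X'X)^-1 X', i.e. the leverage values.
h = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)

level = 0.95
t_crit = stats.t.ppf(1.0 - (1.0 - level) / 2.0, dof)

# Standard error of the fitted mean and of a single new observation.
fit_se = np.sqrt(mse * h)
pred_se = np.sqrt(mse * (1.0 + h))

fit_lower, fit_upper = fitted - t_crit * fit_se, fitted + t_crit * fit_se
pred_lower, pred_upper = fitted - t_crit * pred_se, fitted + t_crit * pred_se
```

Because the leverage values exist only for rows of the design matrix used in the fit, this calculation yields limits at the fitted points and not at arbitrary new x values. The prediction band is always wider than the confidence band, since it adds the residual variance of a single observation.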

`prediction_ci(model, level=0.95)` ¶

Calculate the prediction interval for individual observations, which can also be used to spot outliers. Applies to fitted WLS and OLS models, not to general GLS.

PARAMETER	DESCRIPTION
`model`	Fitted OLS or WLS model. TYPE: `statsmodels RegressionResults`
`level`	Confidence level, by default 0.95. TYPE: `float in (0, 1)` DEFAULT: `0.95`

RETURNS	DESCRIPTION
`fitted`	For coherence with the other functions, the fitted target values are returned as a one-dimensional numpy array. TYPE: `NDArray`
`lower`	Lower prediction limits as a one-dimensional numpy array. TYPE: `NDArray`
`upper`	Upper prediction limits as a one-dimensional numpy array. TYPE: `NDArray`

`confidence_to_alpha(confidence_level, two_sided=True, n_groups=1)` ¶

Calculate the significance level (alpha risk) for a given confidence level.

PARAMETER	DESCRIPTION
`confidence_level`	Level of the confidence interval. TYPE: `float in (0, 1)`
`two_sided`	True if alpha is to be calculated for a two-sided confidence interval, by default True. TYPE: `bool` DEFAULT: `True`
`n_groups`	Number of groups for the Bonferroni method. The alpha risk is adjusted within each group so that the total risk is not exceeded, by default 1. TYPE: `int` DEFAULT: `1`

RETURNS	DESCRIPTION
`alpha`	Significance level as alpha risk. TYPE: `float`
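One plausible reading of this conversion is sketched below. It is an assumption, not the confirmed daspi behaviour, that the two-sided case returns the alpha of a single tail (half of the total) and that the Bonferroni adjustment divides by `n_groups`. The name `confidence_to_alpha_sketch` is hypothetical.

```python
def confidence_to_alpha_sketch(confidence_level, two_sided=True, n_groups=1):
    """Illustrative conversion from confidence level to alpha (assumed logic)."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must lie in (0, 1)")
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    alpha = 1.0 - confidence_level
    if two_sided:
        # Assumption: the risk is split evenly between both tails.
        alpha /= 2.0
    # Assumption: Bonferroni adjustment by simple division.
    return alpha / n_groups


print(confidence_to_alpha_sketch(0.95))
print(confidence_to_alpha_sketch(0.95, two_sided=False))
print(confidence_to_alpha_sketch(0.95, n_groups=5))
```

Under these assumptions, a 95% two-sided interval leaves 2.5% in each tail, and spreading that over five groups leaves 0.5% per tail and group.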
