`daspi.statistics.estimation` ¶

Statistical estimation classes and functions.

This module provides higher-level estimators that combine confidence intervals, hypothesis tests, and distribution fitting into coherent analysis objects. It also contains utility functions for non-parametric smoothing and kernel density estimation.

Estimator classes

All estimator classes share a common interface: they accept a sample (and optionally a specification or reference distribution), run a battery of statistical checks internally, and expose their results as plain attributes.

BaseEstimator – abstract base class defining the common interface.
LocationDispersionEstimator – estimates mean and standard deviation, computes confidence intervals for both, and performs normality, stability, and shape tests on the sample.
DistributionEstimator – fits a parametric SciPy distribution to the data via maximum-likelihood and performs a Kolmogorov-Smirnov goodness-of-fit test.
ProcessEstimator – extends LocationDispersionEstimator with process-capability indices (Cp, Cpk, Cpm) and their confidence intervals, given a Specification.
GageEstimator – measurement system analysis; combines multiple ProcessEstimator instances to quantify measurement uncertainty relative to process variation and tolerance (GUM / Gage R&R style).

Standalone functions

root_sum_squares – root sum of squares of scalar values; used in combined measurement uncertainty calculations.
estimate_distribution – fits a parametric distribution to a sample and returns the frozen distribution together with fit diagnostics.
estimate_kernel_density – univariate kernel density estimate over a grid.
estimate_kernel_density_2d – bivariate kernel density estimate on a 2-D grid.
estimate_capability_confidence – confidence interval for a process-capability index (Cp or Cpk) of a ProcessEstimator, with optional Bonferroni correction.
estimate_resolution – estimates the effective measurement resolution from a sample.

Smoothing

Loess – locally weighted polynomial regression (LOESS) for univariate data.
Lowess – locally weighted scatterplot smoothing (LOWESS) using the statsmodels implementation.

Measurement uncertainty

MeasurementUncertainty – GUM-compliant representation of a single uncertainty contribution; supports rectangular, triangular, and normal distributions and can be combined with other instances via root-sum-of-squares.

Notes

The ProcessEstimator and GageEstimator classes depend on Specification / SpecLimits from the montecarlo module, on the hypothesis-testing functions from the hypothesis module, and on the confidence interval functions from the confidence module. These dependencies are imported from their modules rather than being reimplemented here.

`MeasurementUncertainty(*, standard=None, expanded=None, error_limit=None, distribution_factor=None, k=2, confidence_level=None, distribution='rectangular')` ¶

A class to represent and calculate measurement uncertainty.

This class provides multiple ways to define measurement uncertainty:

1. From error limit and distribution factor
2. From expanded uncertainty and coverage factor k
3. From standard uncertainty directly

PARAMETER	DESCRIPTION
`standard`	The standard uncertainty (u). If provided the parameters expanded and error_limit are ignored. To initialize a non-significant measurement uncertainty, set standard to 0. TYPE: `float` DEFAULT: `None`
`error_limit`	The maximum allowable deviation from the true value, also known as the tolerance range. This parameter represents the worst-case scenario for measurement error, indicating how much the measured value can differ from the actual value. It is used to calculate the standard uncertainty based on the specified distribution factor. The value must be positive, as a negative error limit does not have a physical meaning in the context of measurement uncertainty. TYPE: `float` DEFAULT: `None`
`distribution_factor`	The distribution factor based on the assumed distribution. Common values: √3 ≈ 1.732 for a rectangular (uniform) distribution; 2 for a triangular distribution; 1 for a normal distribution (if error_limit is already 1σ). TYPE: `float` DEFAULT: `None`
`expanded`	The expanded uncertainty (U). TYPE: `float` DEFAULT: `None`
`k`	The coverage factor for expanded uncertainty. It is the multiplier that converts the standard uncertainty into the expanded uncertainty (U = k × u) and is typically chosen to reflect the desired confidence level. Default is 2. Typical values: k=2 corresponds to a coverage of about 95.45%; k=3 corresponds to about 99.73% (both assuming a normal distribution). TYPE: `int \| float` DEFAULT: `2`
`confidence_level`	The confidence level (0 to 1) to calculate coverage factor for normal distribution. Default is 0.95 (95% confidence). TYPE: `float` DEFAULT: `None`
`distribution`	The assumed probability distribution for calculating distribution factor. Only used if distribution_factor is not explicitly provided. Default is 'rectangular'. TYPE: `(rectangular, triangular, normal)` DEFAULT: `'rectangular'`

Notes

To initialize a non-significant measurement uncertainty, set standard to 0. This uncertainty can then be used for further calculations and combined with others, but it does not affect the "addition" of uncertainties.
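The three ways of defining an uncertainty reduce to simple arithmetic. The sketch below uses plain Python rather than the class itself; the helper names are hypothetical, and the conversions are simply the inverses of the relations stated for the `expanded` and `error_limit` properties (U = k × u and error_limit = u × distribution_factor).

```python
import math


def standard_from_error_limit(error_limit: float,
                              distribution_factor: float = math.sqrt(3)) -> float:
    """Hypothetical helper: u = error_limit / distribution_factor."""
    return error_limit / distribution_factor


def standard_from_expanded(expanded: float, k: float = 2) -> float:
    """Hypothetical helper: u = U / k."""
    return expanded / k


# Rectangular distribution: 0.1 / sqrt(3)
u_from_limit = standard_from_error_limit(0.1)
# Expanded uncertainty 0.2 with k = 2: 0.2 / 2
u_from_expanded = standard_from_expanded(0.2, k=2)

print(f"{u_from_limit:.4f}  {u_from_expanded:.4f}")  # prints: 0.0577  0.1000
```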

Examples:

Create uncertainty from error limit (rectangular distribution):

import daspi as dsp

# Error limit ±0.1, rectangular distribution
u_1 = dsp.MeasurementUncertainty(error_limit=0.1)
print(f"Standard uncertainty: {u_1.standard:.4f}")

Create uncertainty from expanded uncertainty:

# Expanded uncertainty U = 0.2 with k = 2
u_2 = dsp.MeasurementUncertainty(
    expanded=0.2, k=2)
print(f"Standard uncertainty: {u_2.standard:.4f}")

Create uncertainty directly:

# Direct standard uncertainty
u_3 = dsp.MeasurementUncertainty(standard=0.05)
# expanded is a read-only property (U = k × u, with the default k=2)
print(f"Expanded uncertainty (k=2): {u_3.expanded:.4f}")

RAISES	DESCRIPTION
`ValueError`	If insufficient or conflicting parameters are provided.
`AssertionError`	If parameter values are invalid (negative, out of range, etc.).

`standard` `property` ¶

Get the standard uncertainty (u) (read-only).

`confidence_level` `property` ¶

Get the confidence level used for calculations (read-only).

`k` `property` ¶

Get the coverage factor k used in uncertainty calculations (read-only).

This property returns the coverage factor, which is a multiplier used to determine the expanded uncertainty based on the standard uncertainty. The value of k is typically set to reflect the desired confidence level in the measurement results.
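As an illustration of how a coverage factor relates to a confidence level under a normal distribution, the sketch below computes the two-sided quantile with the standard library. Whether the class derives k in exactly this way is an assumption; `coverage_factor` is a hypothetical name.

```python
from statistics import NormalDist


def coverage_factor(confidence_level: float) -> float:
    """Two-sided normal quantile for the given confidence level (0 to 1)."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must lie strictly between 0 and 1")
    # A two-sided interval leaves (1 - level) / 2 in each tail.
    return NormalDist().inv_cdf((1 + confidence_level) / 2)


print(f"{coverage_factor(0.95):.3f}")    # roughly 1.96
print(f"{coverage_factor(0.9545):.3f}")  # roughly 2.00
```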

`expanded` `property` ¶

Get expanded uncertainty. If it was not provided during initialization, it will be calculated from the standard uncertainty and coverage factor k (U = k × u) (read-only).

`error_limit` `property` ¶

Get the error limit associated with the measurement uncertainty.

This property returns the maximum allowable deviation from the true value, which is also known as the tolerance range. If the error limit was not provided during initialization, it will be calculated from the standard uncertainty and the distribution factor. The calculation is based on the assumption that the error follows the specified probability distribution. (error_limit = u × distribution_factor) (read-only).

`distribution` `property` ¶

Get the assumed probability distribution (read-only).

`distribution_factor` `property` ¶

Get the distribution factor (read-only).

`quality_indicator(tolerance)` ¶

Calculate the quality indicator Q.

Q serves as a quality indicator for the measurement process, reflecting how well the measurement system performs in relation to the specified requirements and tolerances.

\[ U = k \cdot u \]

\[ Q_{MP} = \frac{2 \cdot U}{T} \]
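The two formulas above can be checked by hand. The following is a minimal sketch in plain Python (the function name is hypothetical and it does not call the class); it returns Q as a fraction of the tolerance T.

```python
def quality_indicator_sketch(standard: float, tolerance: float, k: float = 2) -> float:
    """Q = 2 * U / T with U = k * u, returned as a fraction of the tolerance."""
    expanded = k * standard
    return 2 * expanded / tolerance


# u = 0.01, k = 2, T = 0.2  ->  U = 0.02 and Q is about 0.2 (20 % of the tolerance)
q = quality_indicator_sketch(standard=0.01, tolerance=0.2)
print(f"Q = {q:.1%}")
```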

`relative(measured_value)` ¶

Calculate the relative standard uncertainty as a percentage.

PARAMETER	DESCRIPTION
`measured_value`	The measured value to calculate relative uncertainty for. TYPE: `float`

RETURNS	DESCRIPTION
`float`	The relative uncertainty as a percentage.

RAISES	DESCRIPTION
`AssertionError`	If measured_value is zero.
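A sketch of the underlying arithmetic, assuming the relative uncertainty is the standard uncertainty divided by the magnitude of the measured value (the use of the absolute value and the function name are assumptions, not taken from the library):

```python
def relative_uncertainty_percent(standard: float, measured_value: float) -> float:
    """Relative standard uncertainty in percent: u / |x| * 100."""
    assert measured_value != 0, "measured_value must not be zero"
    return standard / abs(measured_value) * 100


# u = 0.05 on a reading of 10.0 is 0.5 % of the reading
rel = relative_uncertainty_percent(0.05, 10.0)
print(f"{rel:.2f} %")
```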

`combine_with(*others, method='rss')` ¶

Combine this uncertainty with other uncertainties.

PARAMETER	DESCRIPTION
`*others`	Other uncertainty instances to combine with. TYPE: `MeasurementUncertainty \| float` DEFAULT: `()`
`method`	Combination method: 'rss' for root sum of squares (independent uncertainties) or 'linear' for linear addition (fully correlated uncertainties). Default is 'rss'. TYPE: `(rss, linear)` DEFAULT: `'rss'`

RETURNS	DESCRIPTION
`MeasurementUncertainty`	A new instance with the combined uncertainty.

Examples:

u_1 = dsp.MeasurementUncertainty(standard=0.1)
u_2 = dsp.MeasurementUncertainty(error_limit=0.05)
u_3 = dsp.MeasurementUncertainty(expanded=0.2, k=2)

# Combine using root sum of squares (default)
combined_rss = u_1.combine_with(u_2, u_3)

# Combine using linear addition
combined_linear = u_1.combine_with(u_2, u_3, method='linear')
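To see what the two methods do numerically, the same three contributions can be combined by hand. This sketch works on plain floats and assumes each contribution is first reduced to its standard uncertainty (error limit divided by √3 for the rectangular default, expanded uncertainty divided by k); the variable names are illustrative.

```python
import math

s_1 = 0.1                   # given directly as standard uncertainty
s_2 = 0.05 / math.sqrt(3)   # error limit 0.05, rectangular distribution
s_3 = 0.2 / 2               # expanded uncertainty 0.2 with k = 2

# Root sum of squares: appropriate for independent contributions
rss = math.sqrt(s_1**2 + s_2**2 + s_3**2)

# Linear addition: appropriate for fully correlated contributions
linear = s_1 + s_2 + s_3

print(f"rss = {rss:.4f}, linear = {linear:.4f}")
```

The linear sum is never smaller than the root sum of squares, and a contribution of 0 leaves either result unchanged.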

`summary()` ¶

Get a summary of uncertainty values.

RETURNS	DESCRIPTION
`Dict[str, float \| str]`	Dictionary containing various uncertainty representations.

`root_sum_squares(*args)` ¶

Calculate the root sum of squares of the given arguments.

PARAMETER	DESCRIPTION
`*args`	Values to be summed up TYPE: `float or int` DEFAULT: `()`

RETURNS	DESCRIPTION
`float`	The root sum of squares of the given arguments.

Notes

The root sum of squares is calculated as follows:

$$ \sqrt{x_1^2 + x_2^2 + \dots + x_n^2} $$

If only one argument is provided, it returns the argument itself.

RAISES	DESCRIPTION
`AssertionError`	If no arguments are provided or if any argument is not of type int or float.
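A minimal sketch of the behaviour described above, written from the documented contract rather than from the library source (the name `root_sum_squares_sketch` is hypothetical):

```python
import math


def root_sum_squares_sketch(*args: float) -> float:
    """Square root of the sum of squared arguments."""
    assert args, "at least one value is required"
    assert all(isinstance(a, (int, float)) for a in args), "values must be int or float"
    if len(args) == 1:
        # A single argument is returned unchanged, as documented.
        return args[0]
    return math.sqrt(sum(a * a for a in args))


print(root_sum_squares_sketch(3, 4))  # prints: 5.0
```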

`estimate_distribution(data, dists=DIST.COMMON)` ¶

Each candidate distribution is fitted to the data and a Kolmogorov-Smirnov test is performed to determine how well it fits. The distribution with the highest p-value is returned as the best fit, because a higher p-value means the data provide less evidence against the hypothesis that they follow that distribution.

PARAMETER	DESCRIPTION
`data`	1d array of data for which a distribution is to be searched TYPE: `NumericSample1D`
`dists`	Candidate distributions to test the data against. Only continuous distributions of scipy.stats are allowed, by default DIST.COMMON TYPE: `tuple of strings or rv_continuous` DEFAULT: `COMMON`

RETURNS	DESCRIPTION
`dist`	A generic continuous distribution class of best fit TYPE: `scipy.stats rv_continuous`
`p`	The two-tailed p-value for the best fit TYPE: `float`
`shape_params`	Estimates for any shape parameters (if applicable), followed by those for location and scale. For most random variables, shape parameters will be returned, but there are exceptions (e.g. norm). Can be used together with the returned dist to generate values TYPE: `Tuple[float, ...]`
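The selection procedure can be sketched directly with SciPy. This is an illustration of the idea, not the library's implementation: the candidate tuple stands in for DIST.COMMON (whose contents are not listed here), and `best_fit_sketch` is a hypothetical name.

```python
import numpy as np
from scipy import stats


def best_fit_sketch(data, dist_names=("norm", "expon", "uniform")):
    """Fit each candidate by maximum likelihood and keep the highest KS p-value."""
    best = None
    for name in dist_names:
        dist = getattr(stats, name)
        params = dist.fit(data)  # shape parameters (if any), then loc and scale
        p = stats.kstest(data, name, args=params).pvalue
        if best is None or p > best[1]:
            best = (dist, p, params)
    return best


rng = np.random.default_rng(42)
sample = rng.normal(loc=10.0, scale=2.0, size=200)
dist, p, params = best_fit_sketch(sample)
print(dist.name, round(p, 3), params)

# The returned parameters can be used to freeze the distribution
frozen = dist(*params)
```

Because the parameters are estimated from the same data that is tested, the resulting p-values are optimistic; they are useful for ranking candidates rather than as a strict significance test.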

`estimate_kernel_density(data, *, stretch=1, height=None, base=0, n_points=DEFAULT.KD_SEQUENCE_LEN, margin=0.5)` ¶

Estimates the kernel density of data and returns values that are useful for a plot. If those values are plotted in combination with a histogram, set height to the maximum value of the histogram.

Kernel density estimation is a way to estimate the probability density function (PDF) of a random variable in a non-parametric way. The used gaussian_kde function of scipy.stats works for both uni-variate and multi-variate data. It includes automatic bandwidth determination. The estimation works best for a unimodal distribution; bimodal or multi-modal distributions tend to be oversmoothed.

PARAMETER	DESCRIPTION
`data`	1-D array of datapoints to estimate from. TYPE: `NumericSample1D`
`stretch`	Stretch the distribution estimate by the given factor, is only considered if "height" is None, by default 1 TYPE: `float` DEFAULT: `1`
`height`	If the KDE curve is plotted in combination with other data (e.g. a histogram), you can use height to specify the height at the maximum point of the KDE curve. If this value is specified, the area under the curve will not be normalized, by default None TYPE: `float or None` DEFAULT: `None`
`base`	The curve is shifted in the estimated direction by the given amount. This is useful for violin plots, by default 0 TYPE: `float` DEFAULT: `0`
`n_points`	Number of points the estimation and sequence should have, by default KD_SEQUENCE_LEN (defined in constants.py) TYPE: `int` DEFAULT: `KD_SEQUENCE_LEN`
`margin`	Margin for the sequence as a factor of the data range (max - min). If margin is 0, the two ends of the estimated density curve correspond to the minimum and maximum values of the data. Default is 0.5. TYPE: `float` DEFAULT: `0.5`

RETURNS	DESCRIPTION
`sequence`	Data points at regular intervals from input data minimum to maximum TYPE: `1D array`
`estimation`	Data points of kernel density estimation TYPE: `1D array`
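How the parameters interact can be sketched with `scipy.stats.gaussian_kde`. The details below are assumptions made for illustration: the margin is applied symmetrically to both ends, `height` rescales the curve so its peak equals that value, `stretch` is a plain multiplier, and 300 is a placeholder for KD_SEQUENCE_LEN.

```python
import numpy as np
from scipy.stats import gaussian_kde


def kde_curve_sketch(data, *, stretch=1.0, height=None, base=0.0,
                     n_points=300, margin=0.5):
    """Return an evenly spaced sequence and the KDE evaluated on it."""
    data = np.asarray(data, dtype=float)
    span = data.max() - data.min()
    sequence = np.linspace(data.min() - margin * span,
                           data.max() + margin * span, n_points)
    estimation = gaussian_kde(data)(sequence)
    if height is not None:
        # Rescale so the peak matches e.g. the tallest histogram bar
        estimation = estimation * (height / estimation.max())
    else:
        estimation = estimation * stretch
    return sequence, estimation + base


rng = np.random.default_rng(0)
values = rng.normal(size=100)
seq, est = kde_curve_sketch(values, height=5.0)
```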

`estimate_kernel_density_2d(feature, target, *, n_points=DEFAULT.KD_SEQUENCE_LEN, margin=0.5)` ¶

Estimates the kernel density of 2 dimensional data and returns values that are useful for a contour plot.

Kernel density estimation is a way to estimate the probability density function (PDF) of a random variable in a non-parametric way. The used gaussian_kde function of scipy.stats works for both uni-variate and multi-variate data. It includes automatic bandwidth determination. The estimation works best for a unimodal distribution; bimodal or multi-modal distributions tend to be oversmoothed.

PARAMETER	DESCRIPTION
`feature`	A one-dimensional array-like object containing the exogenous samples. TYPE: `NumericSample1D`
`target`	A one-dimensional array-like object containing the endogenous samples. TYPE: `NumericSample1D`
`n_points`	Number of points the estimation and sequence should have, by default KD_SEQUENCE_LEN (defined in constants.py) TYPE: `int` DEFAULT: `KD_SEQUENCE_LEN`
`margin`	Margin for the sequence as factor of data range, by default 0.5. TYPE: `float` DEFAULT: `0.5`

RETURNS	DESCRIPTION
`feature_seq`	Data points at regular intervals from input data minimum to maximum used for feature data TYPE: `2D array`
`target_seq`	Data points at regular intervals from input data minimum to maximum used for target data TYPE: `2D array`
`estimation`	Data points of kernel density estimation TYPE: `2D array`

RAISES	DESCRIPTION
`AssertionError`	If the provided data is empty, contains only zeros or all values are identical.
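A bivariate estimate on a grid can be sketched in the same way. The grid construction with `numpy.meshgrid` and the placeholder grid size of 50 are assumptions for illustration; the function name is hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde


def kde_grid_sketch(feature, target, *, n_points=50, margin=0.5):
    """Evaluate a 2-D Gaussian KDE on a regular grid around the data."""
    x = np.asarray(feature, dtype=float)
    y = np.asarray(target, dtype=float)

    def axis(values):
        span = values.max() - values.min()
        return np.linspace(values.min() - margin * span,
                           values.max() + margin * span, n_points)

    feature_seq, target_seq = np.meshgrid(axis(x), axis(y))
    kde = gaussian_kde(np.vstack([x, y]))
    grid = np.vstack([feature_seq.ravel(), target_seq.ravel()])
    estimation = kde(grid).reshape(feature_seq.shape)
    return feature_seq, target_seq, estimation


rng = np.random.default_rng(1)
xs = rng.normal(size=200)
ys = 0.5 * xs + rng.normal(scale=0.5, size=200)
f_seq, t_seq, dens = kde_grid_sketch(xs, ys)
```

The three 2-D arrays have matching shapes, so they can be passed directly to a contour plotting routine such as Matplotlib's `contourf`.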

`estimate_capability_confidence(process, *, kind='cpk', level=0.95, n_groups=1)` ¶

Calculates the confidence interval for the process capability index (Cp or Cpk) of a process.

This function is an extension of the cp_ci and cpk_ci functions. It takes a ProcessEstimator instance and determines the confidence interval from the Cp or Cpk value of that estimator.

PARAMETER	DESCRIPTION
`process`	Process Estimator instance, is required to get the necessary process information such as capability indices and number of samples. TYPE: `ProcessEstimator`
`kind`	Specifies whether to calculate the confidence interval for Cp or Cpk ('cp' or 'cpk'). Default is 'cpk'. TYPE: `Literal['cp', 'cpk']` DEFAULT: `'cpk'`
`level`	The desired confidence level for the interval, expressed as a decimal. Default is 0.95 (95% confidence). TYPE: `float` DEFAULT: `0.95`
`n_groups`	The number of groups for Bonferroni correction to adjust for multiple comparisons. Default is 1, indicating no correction TYPE: `int` DEFAULT: `1`

RETURNS	DESCRIPTION
`Tuple[float, float, float]`	A tuple containing the estimate, lower bound, and upper bound of the confidence interval for the specified process capability index.

RAISES	DESCRIPTION
`AssertionError`	If provided kind is not 'cp' or 'cpk'.
`ValueError`	If no limit is provided or if only one limit is provided and kind is set to 'cp'.
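For orientation, a Cpk interval with Bonferroni correction can be sketched using the widely used normal approximation (Bissell). Whether cpk_ci uses this exact formula is an assumption; the function name and the way n_groups divides the significance level are illustrative.

```python
import math
from statistics import NormalDist


def cpk_interval_sketch(cpk: float, n_samples: int, *,
                        level: float = 0.95, n_groups: int = 1):
    """Return (estimate, lower, upper) for Cpk via a normal approximation."""
    # Bonferroni: split the significance level across the groups
    alpha = (1 - level) / n_groups
    z = NormalDist().inv_cdf(1 - alpha / 2)
    std_err = math.sqrt(1 / (9 * n_samples) + cpk**2 / (2 * (n_samples - 1)))
    return cpk, cpk - z * std_err, cpk + z * std_err


estimate, lower, upper = cpk_interval_sketch(1.33, 50)
print(f"Cpk = {estimate:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

Increasing n_groups widens the interval, while a larger sample narrows it.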

`estimate_resolution(data)` ¶

Estimate the resolution based on the number of digits in the sample values.

PARAMETER	DESCRIPTION
`data`	1-D array of datapoints to estimate from. TYPE: `NumericSample1D`

RETURNS	DESCRIPTION
`float`	The estimated resolution.
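One way to derive a resolution from digits is to count the decimal places of each value and take the finest one. The sketch below follows that idea; it is an assumption about the approach, not the library's implementation, and the function name is hypothetical.

```python
from decimal import Decimal


def resolution_sketch(data) -> float:
    """Smallest decimal step implied by the printed digits of the values."""
    def decimals(value) -> int:
        # normalize() drops trailing zeros, so 2.0 counts as zero decimals
        exponent = Decimal(str(value)).normalize().as_tuple().exponent
        return max(0, -exponent)

    return 10.0 ** -max(decimals(v) for v in data)


print(resolution_sketch([1.25, 3.1, 2.0]))  # two decimal places in 1.25
```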
