Location dispersion estimator

`daspi.statistics.estimation.LocationDispersionEstimator(samples, strategy='norm', agreement=6, possible_dists=DIST.COMMON, evaluate=None, nan_policy='omit')`

Bases: DistributionEstimator

An object that provides various statistical estimators.

The attributes are calculated lazily. After the class is instantiated, all attributes are set to None. As soon as an attribute (actually a property) is accessed, its value is calculated and stored, so the calculation is performed only once.
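The lazy pattern described above can be illustrated with a minimal sketch. `LazyStats` and its call counter are hypothetical names used only for this illustration; they are not part of daspi, and the library's internal implementation may differ.

```python
import statistics


class LazyStats:
    """Hypothetical stand-in that mimics the lazy pattern described above."""

    def __init__(self, samples):
        self.samples = list(samples)
        self._mean = None        # nothing is computed at construction time
        self.n_mean_calls = 0    # counter, only here to make the caching visible

    @property
    def mean(self):
        # Compute on first access, then reuse the stored value.
        if self._mean is None:
            self.n_mean_calls += 1
            self._mean = statistics.fmean(self.samples)
        return self._mean


stats = LazyStats([1.0, 2.0, 3.0, 6.0])
first = stats.mean    # triggers the calculation
second = stats.mean   # served from the stored value
```

After both accesses the calculation has run exactly once, which is the behavior the paragraph above describes.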

PARAMETER	DESCRIPTION
`samples`	sample data TYPE: `NumericSample1D`
`strategy`	Which strategy should be used to determine the control limits (process spread): - `eval`: The strategy is determined by the given `evaluate` function. If none is given, the internal `evaluate` method is used. - `fit`: The distribution that best represents the process data is searched for first, and the agreed process spread is then calculated from it. - `norm`: The data is assumed to follow a normal distribution. The variation tolerance is then calculated as agreement * standard deviation. - `data`: The quantiles for the process variation tolerance are read directly from the data. Default is 'norm'. TYPE: `(eval, fit, norm, data)` DEFAULT: `'norm'`
`agreement`	Specify the tolerated process variation for which the control limits are to be calculated. - If int, the spread is determined as agreement * σ of the normal distribution, e.g. agreement = 6 -> 6σ, which covers ~ 99.73 % of the data. The upper and lower permissible quantiles are then calculated from this. - If float, the value must be between 0 and 1. This value is then interpreted as the acceptable proportion for the spread, e.g. 0.9973 (which corresponds to ~ 6σ). Default is 6 because Six Sigma ;-) TYPE: `int or float` DEFAULT: `6`
`possible_dists`	Distributions to which the data may be subject. Only continuous distributions of scipy.stats are allowed, by default `DIST.COMMON` TYPE: `tuple of strings or rv_continuous` DEFAULT: `COMMON`
`evaluate`	Provide a function that evaluates the `strategy`. If `strategy` is set to 'eval', this function is used to determine the strategy. The function should take the `samples` as input and return a string that corresponds to a valid strategy 'fit', 'norm', or 'data'. If not provided, the internal `evaluate` method is used. For more information on the `evaluate` method, see the class documentation. Default is None. TYPE: `Callable \| None` DEFAULT: `None`
`nan_policy`	How to handle NaN values in the samples. - 'propagate': NaN values are preserved in the analysis. - 'raise': Raises an error if NaN values are found. - 'omit': Omits NaN values from the analysis. Default is 'omit'. TYPE: `(propagate, raise, omit)` DEFAULT: `'omit'`
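The three `nan_policy` options can be sketched in plain Python as follows. `apply_nan_policy` is a hypothetical helper written for this illustration and is not the library's code; the warning messages are likewise assumptions.

```python
import math
import warnings


def apply_nan_policy(samples, nan_policy="omit"):
    """Hypothetical helper illustrating the three documented policies."""
    has_nan = any(math.isnan(x) for x in samples)
    if not has_nan:
        return list(samples)
    if nan_policy == "raise":
        raise ValueError("samples contain NaN values")
    if nan_policy == "omit":
        warnings.warn("NaN values are omitted from the analysis", UserWarning)
        return [x for x in samples if not math.isnan(x)]
    if nan_policy == "propagate":
        warnings.warn("NaN values are kept and may affect results", UserWarning)
        return list(samples)
    raise ValueError(f"unknown nan_policy: {nan_policy!r}")


raw = [1.2, float("nan"), 0.8, 1.0]
with warnings.catch_warnings():
    warnings.simplefilter("ignore")   # silence the warning for this demo
    filtered = apply_nan_policy(raw, "omit")
```

With 'omit', only the three finite values remain; with 'raise', the same input produces a `ValueError`.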

Examples:

import numpy as np
import daspi as dsp

np.random.seed(1)
samples = data = np.random.weibull(a=1.5, size=100)
estimation = dsp.LocationDispersionEstimator(
    samples=samples,
    strategy='fit',
    agreement=6,
    possible_dists=dsp.DIST.COMMON_NOT_NORM)
print(estimation.describe())

This produces the following output:

                  None
n_samples          100
n_missing            0
min           0.002356
max           2.724595
R             2.722239
mean           0.86943
median         0.74006
std           0.593666
sem           0.059367
dist       weibull_min
p_ks          0.968613
p_ad          0.000455
excess        0.163836
p_excess      0.599041
skew          0.802202
p_skew        0.001918
strategy           fit
lcl          -0.004917
ucl            3.43952

Notes

A special case occurs when the agreement is 1. To use it as a multiple of the standard deviation, enter 1 as an integer. To request percentiles covering the entire range, enter it as a floating-point number (1.0) or as float('inf'). In that case, lcl and ucl correspond to min and max if strategy is 'data'; otherwise they are -inf and inf.

RAISES	DESCRIPTION
`ValueError`	If NaN values are found in the samples and `nan_policy` is set to 'raise'.
`UserWarning`	If NaN values are found in the samples and `nan_policy` is set to 'omit' or 'propagate'. The warning indicates that NaN values will be omitted from the analysis or may lead to unexpected results.

`dof` `property`

Get the degrees of freedom for the filtered samples (read-only).

`min` `property`

Get the minimum value of filtered samples (read-only).

`max` `property`

Get the maximum value of filtered samples (read-only).

`R` `property`

Get range of filtered samples (read-only).

`mean` `property`

Get mean of filtered samples (read-only).

`median` `property`

Get median of filtered samples (read-only).

`std` `property`

Get standard deviation of filtered samples (read-only).

`sem` `property`

Get the standard error of the mean of the filtered samples (read-only).

`lcl` `property`

Get lower control limit according to given strategy and agreement (read-only).

`ucl` `property`

Get upper control limit according to given strategy and agreement (read-only).

`q_low` `property`

Get the quantile for the lower control limit according to the given agreement. If the samples follow a normal distribution and the agreement is 6, this value corresponds to the 0.135 % quantile: 6σ ~ 99.73 % of the samples (read-only).

`q_upp` `property`

Get the quantile for the upper control limit according to the given agreement. If the samples follow a normal distribution and the agreement is 6, this value corresponds to Q_0.99865, i.e. the 0.99865 quantile or 99.865th percentile (read-only).
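The relationship between an integer agreement and the two quantiles can be checked with the standard normal distribution. The sketch below uses only the Python standard library; `normal_quantiles` is a hypothetical helper name, not a daspi function.

```python
from statistics import NormalDist


def normal_quantiles(agreement):
    """Return (q_low, q_upp) for a spread of `agreement` standard deviations."""
    half_width = agreement / 2          # 6 sigma -> +/- 3 sigma around the mean
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    q_low = std_normal.cdf(-half_width)
    q_upp = std_normal.cdf(half_width)
    return q_low, q_upp


q_low, q_upp = normal_quantiles(6)
coverage = q_upp - q_low                # proportion of data inside the limits
```

For an agreement of 6, this gives approximately 0.00135 and 0.99865, so about 99.73 % of normally distributed data falls between the two quantiles.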

`strategy` `property` `writable`

Strategy used to determine the control limits. The control limits can also be interpreted as the process range.

Set the strategy to one of {'eval', 'fit', 'norm', 'data'}:
- eval: If no evaluate function is given, the strategy is determined according to the internal evaluate method.
- fit: The distribution that best represents the process data is searched for first, and the process variation tolerance is then calculated from it.
- norm: The data is assumed to follow a normal distribution. The variation tolerance is then calculated as agreement * standard deviation.
- data: The quantiles for the process variation tolerance are read directly from the samples.

`agreement` `property` `writable`

Get the agreement multiplier for the σ (standard deviation) used in calculating Cp and Cpk values.

The agreement is defined as twice the coverage factor k. Setting this value will reset the Cp and Cpk values to None, reflecting that the underlying uncertainty parameters have changed.

When setting the agreement as a proportion, provide the acceptable proportion for the spread, such as 0.9973, which corresponds to approximately 6σ (six standard deviations). The agreement value must be specified as either:
- A proportion (0.0 < agreement <= 1.0) indicating the acceptable proportion for the spread.
- A multiple of the standard deviation (agreement >= 1).

Special case:
- If the agreement is set to 1 as a standard deviation multiplier, enter it as an integer (1).
- For percentiles or the entire range, use a floating-point representation (1.0), or float('inf') for an infinite range.

RAISES	DESCRIPTION
`AssertionError`	If the provided agreement value is not in the valid range (0.0 < agreement <= 1.0 for percentiles or agreement >= 1 for standard deviation multipliers).
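The two ways of specifying the agreement, including the special case of 1 versus 1.0, can be sketched as follows. `interpret_agreement` is a hypothetical helper, and the conversion via the standard normal distribution is an assumption made for illustration; the library's own conversion may differ in detail.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def interpret_agreement(agreement):
    """Hypothetical helper: return (sigma_multiple, proportion)."""
    is_proportion = isinstance(agreement, float) and 0.0 < agreement <= 1.0
    if is_proportion:
        if agreement == 1.0:
            return math.inf, 1.0        # entire range
        # Two-sided spread: the remaining probability is split over both tails.
        half_width = _STD_NORMAL.inv_cdf((1.0 + agreement) / 2.0)
        return 2.0 * half_width, agreement
    assert agreement >= 1, "agreement must be in (0, 1] or >= 1"
    if math.isinf(agreement):
        return math.inf, 1.0            # entire range
    half_width = agreement / 2.0
    proportion = _STD_NORMAL.cdf(half_width) - _STD_NORMAL.cdf(-half_width)
    return float(agreement), proportion


as_multiple = interpret_agreement(1)      # integer 1 -> one standard deviation
as_range = interpret_agreement(1.0)       # float 1.0 -> entire range
six_sigma = interpret_agreement(0.9973)   # proportion -> roughly 6 sigma
```

The type of the value decides the interpretation, which is why 1 and 1.0 lead to different results.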

`k` `property`

Get the coverage factor k used in uncertainty calculations (read-only).

This property returns the coverage factor, which is a multiplier used to determine the expanded uncertainty based on the standard uncertainty. The value of k is typically set to reflect the desired confidence level in the measurement results.
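Because the agreement is defined as twice the coverage factor k, control limits under the 'norm' strategy can be sketched as mean ± k * std. `normal_control_limits` is a hypothetical helper, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption.

```python
import statistics


def normal_control_limits(samples, agreement=6):
    """Hypothetical helper: control limits assuming normally distributed data."""
    k = agreement / 2                     # coverage factor, e.g. 6 -> 3
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)       # sample standard deviation (n - 1)
    return mean - k * std, mean + k * std


samples = [9.8, 10.1, 10.0, 9.9, 10.2]
lcl, ucl = normal_control_limits(samples, agreement=6)
spread = ucl - lcl                        # equals agreement * std
```

The distance between the two limits equals agreement * standard deviation, matching the description of the 'norm' strategy.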

`z_transform(x)`

Transform value to z-score.

This method produces a value from a distribution with a mean of 0 and a standard deviation of 1. The value indicates how many standard deviations the value is from the mean.

PARAMETER	DESCRIPTION
`x`	value to be transformed TYPE: `float`

RETURNS	DESCRIPTION
`z`	z-score TYPE: `float`
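As a worked illustration of the z-score, the sketch below computes (x - mean) / std for a small data set. The free function and its `samples` argument are assumptions made for this example; the documented method takes only `x` and uses the estimator's own samples.

```python
import statistics


def z_transform(x, samples):
    """Sketch of a z-score: distance of x from the mean in standard deviations."""
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)       # sample standard deviation (assumed)
    return (x - mean) / std


samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = z_transform(7.0, samples)             # 7.0 lies above the mean of 5.0
z_at_mean = z_transform(5.0, samples)     # the mean itself maps to 0
```

A positive z-score means the value lies above the mean, a negative one below it.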

`mean_ci(level=0.95)`

Two-sided confidence interval for the mean of the filtered data.

PARAMETER	DESCRIPTION
`level`	confidence level, by default 0.95 TYPE: `float in (0, 1)` DEFAULT: `0.95`

RETURNS	DESCRIPTION
`ci_low, ci_upp : float`	lower and upper confidence limits
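To show what a two-sided interval for the mean looks like, here is a minimal sketch based on the large-sample normal approximation, mean ± z * sem. This approximation is an assumption chosen to keep the example within the standard library; it is not necessarily the method the library uses, and `mean_ci_normal_approx` is a hypothetical name.

```python
import math
import statistics
from statistics import NormalDist


def mean_ci_normal_approx(samples, level=0.95):
    """Approximate two-sided confidence interval for the mean."""
    n = len(samples)
    mean = statistics.fmean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)   # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)        # e.g. about 1.96 for 0.95
    return mean - z * sem, mean + z * sem


samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
ci_low, ci_upp = mean_ci_normal_approx(samples, level=0.95)
wide_low, wide_upp = mean_ci_normal_approx(samples, level=0.99)
```

A higher confidence level widens the interval, while the interval stays centered on the sample mean.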

`median_ci(level=0.95)`

Two-sided confidence interval for the median of the filtered data.

PARAMETER	DESCRIPTION
`level`	confidence level, by default 0.95 TYPE: `float in (0, 1)` DEFAULT: `0.95`

RETURNS	DESCRIPTION
`ci_low, ci_upp : float`	lower and upper confidence limits

`stdev_ci(level=0.95)`

Two-sided confidence interval for the standard deviation of the filtered data.

PARAMETER	DESCRIPTION
`level`	confidence level, by default 0.95 TYPE: `float in (0, 1)` DEFAULT: `0.95`

RETURNS	DESCRIPTION
`ci_low, ci_upp : float`	lower and upper confidence limits

`evaluate()`

Evaluate the strategy used to calculate the control limits. If no evaluate function is given, the strategy is evaluated as follows:

If the variance is not stable within the samples -> strategy = 'data'
If the variance and mean are stable and the samples follow a normal curve -> strategy = 'norm'
If the variance and mean are stable but the samples don't follow a normal curve -> strategy = 'fit'
If the variance is stable but the mean is not and the samples follow a normal curve -> strategy = 'norm'
If the variance is stable but the mean is not and the samples don't follow a normal curve -> strategy = 'data'

RETURNS	DESCRIPTION
`strategy`	Evaluated strategy to calculate control limits TYPE: `{fit, norm, data}`
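The decision rules listed above can be condensed into a small function. `choose_strategy` and its three boolean arguments are hypothetical; how the library actually tests for stable variance, stable mean, and normality is not shown here.

```python
def choose_strategy(variance_stable, mean_stable, follows_normal):
    """Hypothetical mapping of the documented rules to a strategy name."""
    if not variance_stable:
        return "data"                 # unstable variance always means 'data'
    if follows_normal:
        return "norm"                 # normal data: 'norm', stable mean or not
    # Stable variance, not normal: fit a distribution only if the mean is stable.
    return "fit" if mean_stable else "data"


strategy = choose_strategy(variance_stable=True, mean_stable=True,
                           follows_normal=False)
```

With stable variance and mean but non-normal data, the result is 'fit', in line with the third rule.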
