Location dispersion estimator
daspi.statistics.estimation.LocationDispersionEstimator(samples, strategy='norm', agreement=6, possible_dists=DIST.COMMON, evaluate=None, nan_policy='omit')
Bases: DistributionEstimator
An object for various statistical estimators
The attributes are calculated lazily. After the class is instantiated, all attributes are set to None. As soon as an attribute (actually a property) is accessed, its value is calculated and stored, so the calculation is performed only once.
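The lazy calculation described above can be illustrated with a minimal sketch. The class `LazyStats` and its counter `n_computations` are hypothetical names used only for illustration; they are not part of daspi and do not claim to mirror its internals.

```python
import statistics


class LazyStats:
    """Hypothetical stand-in that caches a statistic on first access."""

    def __init__(self, samples):
        self.samples = list(samples)
        self._mean = None         # nothing is computed at instantiation
        self.n_computations = 0   # counts how often the mean is calculated

    @property
    def mean(self):
        # Compute on first access only, then reuse the stored value.
        if self._mean is None:
            self._mean = statistics.fmean(self.samples)
            self.n_computations += 1
        return self._mean


stats = LazyStats([1.0, 2.0, 3.0, 6.0])
first = stats.mean    # triggers the calculation
second = stats.mean   # served from the stored value
```

After both accesses, `stats.n_computations` is still 1, which is the behaviour the description above promises.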
| PARAMETER | DESCRIPTION |
|---|---|
samples
|
sample data
TYPE:
|
strategy
|
Which strategy should be used to determine the control
limits (process spread):
- Default is 'norm'.
TYPE:
|
agreement
|
Specify the tolerated process variation for which the control limits are to be calculated.
- If int, the spread is determined using the normal distribution as agreement * σ, e.g. agreement = 6 -> 6σ covers ~99.73 % of the data. The upper and lower permissible quantiles are then calculated from this.
- If float, the value must be between 0 and 1. This value is then interpreted as the acceptable proportion for the spread, e.g. 0.9973 (which corresponds to ~6σ).
Default is 6 because SixSigma ;-)
TYPE:
|
possible_dists
|
Distributions to which the data may be subject. Only
continuous distributions of scipy.stats are allowed,
by default DIST.COMMON.
TYPE:
|
evaluate
|
Provide a function that evaluates the
TYPE:
|
nan_policy
|
How to handle NaN values in the samples.
- 'propagate': NaN values are preserved in the analysis.
- 'raise': Raises an error if NaN values are found.
- 'omit': Omits NaN values from the analysis.
Default is 'omit'.
TYPE:
|
Examples:
import numpy as np
import daspi as dsp
np.random.seed(1)
samples = np.random.weibull(a=1.5, size=100)
estimation = dsp.LocationDispersionEstimator(
    samples=samples,
    strategy='fit',
    agreement=6,
    possible_dists=dsp.DIST.COMMON_NOT_NORM)
print(estimation.describe())
This prints the following output:
None
n_samples 100
n_missing 0
min 0.002356
max 2.724595
R 2.722239
mean 0.86943
median 0.74006
std 0.593666
sem 0.059367
dist weibull_min
p_ks 0.968613
p_ad 0.000455
excess 0.163836
p_excess 0.599041
skew 0.802202
p_skew 0.001918
strategy fit
lcl -0.004917
ucl 3.43952
Notes
A special case occurs when the agreement is 1. To get a spread of one
standard deviation, enter 1 as an integer. If you want percentiles
or the entire range, enter it as a floating-point number (1.0) or as
float('inf'). In that case, if strategy is 'data', lcl and ucl
correspond to min and max; otherwise they are -inf and inf.
| RAISES | DESCRIPTION |
|---|---|
ValueError
|
If NaN values are found in the samples and |
UserWarning
|
If NaN values are found in the samples and |
dof
property
Get degrees of freedom for filtered samples (read-only).
min
property
Get the minimum value of filtered samples (read-only).
max
property
Get the maximum value of filtered samples (read-only).
R
property
Get range of filtered samples (read-only).
mean
property
Get mean of filtered samples (read-only).
median
property
Get median of filtered samples (read-only).
std
property
Get standard deviation of filtered samples (read-only).
sem
property
Get standard error of the mean of filtered samples (read-only).
lcl
property
Get lower control limit according to given strategy and agreement (read-only).
ucl
property
Get upper control limit according to given strategy and agreement (read-only).
q_low
property
Get the quantile for the lower control limit according to the given agreement. If the samples follow a normal distribution and the agreement is 6, this value corresponds to the 0.135 % quantile, since 6σ covers ~99.73 % of the samples (read-only).
q_upp
property
Get the quantile for the upper control limit according to the given agreement. If the samples follow a normal distribution and the agreement is 6, this value corresponds to Q_0.99865, i.e. the 0.99865-quantile or 99.865th percentile (read-only).
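The relation between an integer agreement and the two quantiles can be checked with the standard library alone. This is a sketch under the assumption that the spread is centred on the mean, so it reaches agreement / 2 standard deviations to each side; `quantiles_from_sigma_agreement` is a hypothetical helper, not a daspi function.

```python
from statistics import NormalDist


def quantiles_from_sigma_agreement(agreement):
    """Return (q_low, q_upp) for a total spread of `agreement` sigma.

    Hypothetical helper: the spread is assumed to be centred on the
    mean, reaching k = agreement / 2 standard deviations to each side.
    """
    k = agreement / 2
    q_upp = NormalDist().cdf(k)   # cumulative probability at +k sigma
    q_low = 1.0 - q_upp           # symmetric counterpart at -k sigma
    return q_low, q_upp


q_low, q_upp = quantiles_from_sigma_agreement(6)
coverage = q_upp - q_low
print(f"q_low={q_low:.5f}, q_upp={q_upp:.5f}, coverage={coverage:.4f}")
# prints: q_low=0.00135, q_upp=0.99865, coverage=0.9973
```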
strategy
property
writable
Strategy used to determine the control limits. The control limits can also be interpreted as the process range.
Set strategy as one of {'eval', 'fit', 'norm', 'data'}:
- eval: If no evaluate function is given, the strategy is determined according to the internal evaluate method.
- fit: First, the distribution that best represents the process data is searched for, and then the process variation tolerance is calculated.
- norm: It is assumed that the data follows a normal distribution. The variation tolerance is then calculated as agreement * standard deviation.
- data: The quantiles for the process variation tolerance are read directly from the samples.
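To make the difference between the 'norm' and 'data' strategies concrete, here is a rough sketch using only the standard library. The helpers `empirical_quantile` and `control_limits` are hypothetical, and the linear interpolation used for the empirical quantiles is an assumption; daspi may compute its quantiles differently. The 'fit' strategy is left out because it requires fitting a distribution.

```python
import statistics


def empirical_quantile(sorted_samples, q):
    """Linearly interpolated quantile of already sorted samples."""
    pos = q * (len(sorted_samples) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_samples) - 1)
    frac = pos - lower
    return sorted_samples[lower] + frac * (
        sorted_samples[upper] - sorted_samples[lower])


def control_limits(samples, strategy="norm", agreement=6):
    """Hypothetical helper returning (lcl, ucl) for two strategies."""
    if strategy == "norm":
        # Normal assumption: total spread is agreement * std,
        # centred on the mean.
        mean = statistics.fmean(samples)
        half_width = agreement / 2 * statistics.stdev(samples)
        return mean - half_width, mean + half_width
    if strategy == "data":
        # Read the limits from the samples at the quantiles that a
        # normal distribution assigns to the given agreement.
        q_upp = statistics.NormalDist().cdf(agreement / 2)
        ordered = sorted(samples)
        return (empirical_quantile(ordered, 1.0 - q_upp),
                empirical_quantile(ordered, q_upp))
    raise ValueError(f"strategy {strategy!r} is not covered by this sketch")


samples = [1.0, 2.0, 3.0, 4.0, 5.0]
norm_limits = control_limits(samples, strategy="norm")
data_limits = control_limits(samples, strategy="data")
```

With these five values, the 'norm' limits extend well beyond the observed range, while the 'data' limits stay just inside the minimum and maximum.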
agreement
property
writable
Get the agreement multiplier for the σ (standard deviation) used in calculating Cp and Cpk values.
The agreement is defined as twice the coverage factor k.
Setting this value will reset the Cp and Cpk values to None,
reflecting that the underlying uncertainty parameters have
changed.
When setting the agreement as a proportion, provide the acceptable proportion for the spread, such as 0.9973, which corresponds to approximately 6σ (six standard deviations). The agreement value must be specified as either:
- A proportion (0.0 < agreement <= 1.0) indicating the acceptable proportion for the spread.
- A multiple of the standard deviation (agreement >= 1).
Special Case:
- If the agreement is set to 1 (indicating a standard deviation
multiplier), enter it as an integer (1).
- For percentiles or a broader range, use a floating-point
representation (e.g., 1.0) or float('inf') for an infinite
range.
| RAISES | DESCRIPTION |
|---|---|
AssertionError
|
If the provided agreement value is not in the valid range (0.0 < agreement <= 1.0 for percentiles or agreement >= 1 for standard deviation multipliers). |
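The int-versus-float distinction at the value 1 is easy to get wrong, so a small sketch may help. The function `coverage_from_agreement` is hypothetical and only mimics the rules stated above (floats up to 1.0 are proportions, everything else is a σ multiplier, invalid values fail an assertion); it is not daspi's implementation.

```python
import math
from statistics import NormalDist


def coverage_from_agreement(agreement):
    """Return the proportion of a normal distribution that is covered.

    Hypothetical helper that follows the documented rules only.
    """
    if isinstance(agreement, float) and agreement <= 1.0:
        # Floats up to 1.0 are read as a proportion.
        assert agreement > 0.0, "proportion must satisfy 0.0 < agreement <= 1.0"
        return agreement
    if math.isinf(agreement):
        # float('inf') requests the entire range.
        return 1.0
    # Everything else is a multiple of the standard deviation.
    assert agreement >= 1, "sigma multiplier must satisfy agreement >= 1"
    k = agreement / 2
    return NormalDist().cdf(k) - NormalDist().cdf(-k)


one_sigma = coverage_from_agreement(1)        # integer 1: spread of 1 sigma
full_range = coverage_from_agreement(1.0)     # float 1.0: entire range
six_sigma = coverage_from_agreement(6)
```

Here `one_sigma` is about 0.383, whereas `full_range` is exactly 1.0, which is why the type of the literal matters.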
k
property
Get the coverage factor k used in uncertainty
calculations (read-only).
This property returns the coverage factor, which is a multiplier
used to determine the expanded uncertainty based on the standard
uncertainty. The value of k is typically set to reflect the
desired confidence level in the measurement results.
z_transform(x)
Transform value to z-score.
This method produces a value from a distribution with a mean of 0 and a standard deviation of 1. The value indicates how many standard deviations the value is from the mean.
| PARAMETER | DESCRIPTION |
|---|---|
x
|
value to be transformed
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
z
|
z-score
TYPE:
|
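As a quick illustration of the z-score, the following sketch standardizes a value against a list of samples. It is a free-function stand-in, not the method itself, and it assumes the sample standard deviation (n - 1 in the denominator); the source does not state which estimator the method uses.

```python
import statistics


def z_transform(x, samples):
    """Hypothetical free-function version of the z-score transform.

    Assumes the sample mean and the sample standard deviation (ddof=1).
    """
    return (x - statistics.fmean(samples)) / statistics.stdev(samples)


samples = [1.0, 2.0, 3.0, 4.0, 5.0]
z_at_mean = z_transform(3.0, samples)   # the mean itself maps to 0
z_one_std = z_transform(3.0 + statistics.stdev(samples), samples)
```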
mean_ci(level=0.95)
Two-sided confidence interval for the mean of the filtered data.
| PARAMETER | DESCRIPTION |
|---|---|
level
|
confidence level, by default 0.95
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
ci_low, ci_upp : float
|
lower and upper confidence limits |
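A two-sided interval for the mean is commonly built from Student's t distribution. The sketch below assumes that approach; whether daspi uses exactly this formula is not confirmed by the text above, and the free function `mean_ci` is only a stand-in for the method.

```python
import numpy as np
from scipy import stats


def mean_ci(samples, level=0.95):
    """Two-sided t-based confidence interval for the mean (sketch)."""
    samples = np.asarray(samples, dtype=float)
    dof = samples.size - 1
    # Standard error of the mean from the sample standard deviation.
    sem = samples.std(ddof=1) / np.sqrt(samples.size)
    # Critical value that leaves (1 - level) / 2 in each tail.
    t_crit = stats.t.ppf(0.5 + level / 2, dof)
    mean = samples.mean()
    return mean - t_crit * sem, mean + t_crit * sem


ci_low, ci_upp = mean_ci([1.0, 2.0, 3.0, 4.0, 5.0])
```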
median_ci(level=0.95)
Two-sided confidence interval for the median of the filtered data.
| PARAMETER | DESCRIPTION |
|---|---|
level
|
confidence level, by default 0.95
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
ci_low, ci_upp : float
|
lower and upper confidence limits |
stdev_ci(level=0.95)
Two-sided confidence interval for the standard deviation of the filtered data.
| PARAMETER | DESCRIPTION |
|---|---|
level
|
confidence level, by default 0.95
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
ci_low, ci_upp : float
|
lower and upper confidence limits |
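For normally distributed data, an interval for the standard deviation is usually derived from the chi-square distribution of the scaled sample variance. The sketch below assumes that textbook construction; it is not confirmed to be what daspi computes, and `stdev_ci` here is a free-function stand-in.

```python
import numpy as np
from scipy import stats


def stdev_ci(samples, level=0.95):
    """Two-sided chi-square confidence interval for the std (sketch)."""
    samples = np.asarray(samples, dtype=float)
    dof = samples.size - 1
    # (n - 1) * s^2 follows sigma^2 * chi2(dof) for normal data.
    scaled_var = dof * samples.var(ddof=1)
    alpha = 1.0 - level
    # The larger chi-square quantile gives the lower limit and vice versa.
    ci_low = np.sqrt(scaled_var / stats.chi2.ppf(1.0 - alpha / 2, dof))
    ci_upp = np.sqrt(scaled_var / stats.chi2.ppf(alpha / 2, dof))
    return ci_low, ci_upp


ci_low, ci_upp = stdev_ci([1.0, 2.0, 3.0, 4.0, 5.0])
```

Note that the interval is not symmetric around the sample standard deviation, because the chi-square distribution is skewed.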
evaluate()
Evaluate strategy to calculate control limits. If no evaluate function is given the strategy is evaluated as follows:
- If the variance is not stable within the samples -> strategy = 'data'
- If variance and mean are stable and the samples follow a normal curve -> strategy = 'norm'
- If variance and mean are stable but the samples don't follow a normal curve -> strategy = 'fit'
- If the variance is stable but the mean is not and the samples follow a normal curve -> strategy = 'norm'
- If the variance is stable but the mean is not and the samples don't follow a normal curve -> strategy = 'data'
| RETURNS | DESCRIPTION |
|---|---|
strategy
|
Evaluated strategy to calculate control limits
TYPE:
|
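The decision rules listed for evaluate() reduce to three yes/no questions. The sketch below encodes just that mapping; the function `evaluate_strategy` and its boolean arguments are hypothetical, and how stability and normality are actually tested is not covered here.

```python
def evaluate_strategy(variance_stable, mean_stable, is_normal):
    """Map the three checks to a strategy name (hypothetical helper)."""
    if not variance_stable:
        # Unstable variance always falls back to the raw data.
        return "data"
    if is_normal:
        # With stable variance, normal data uses 'norm' whether or not
        # the mean is stable.
        return "norm"
    # Non-normal data: fit a distribution only if the mean is stable too.
    return "fit" if mean_stable else "data"


chosen = evaluate_strategy(variance_stable=True, mean_stable=True,
                           is_normal=False)
```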