Capability confidence interval

daspi.plotlib.plotter.CapabilityConfidenceInterval(source, target, spec_limits, kind, n_groups=1, feature='', show_center=True, bars_same_color=False, skip_na=None, target_on_y=True, confidence_level=0.95, show_feature_axis=None, color=None, marker=None, ax=None, visible_spines=None, hide_axis=None, kw_estim={}, **kwds)

Bases: ConfidenceInterval

Class for creating plotters that draw error bars representing the confidence interval of the process capability index Cp or Cpk.

PARAMETER DESCRIPTION
source

Pandas long format DataFrame containing the data source for the plot.

TYPE: pandas DataFrame

target

Column name of the target variable for the plot.

TYPE: str

spec_limits

Specification limits for the target variable. This can be created using the SpecLimits class.

TYPE: SpecLimits

kind

The capability index to be calculated. Cp can be used to compare the process variability to a specification width, while Cpk also considers the process mean. The Cp can only be calculated if both specification limits are given.

TYPE: Literal['cp', 'cpk'] DEFAULT: 'cp'
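As a rough illustration of the difference between the two indices, the sketch below computes Cp and Cpk from their textbook definitions for normally distributed data. The helper functions cp and cpk and the sample data are hypothetical and not part of DaSPi; the ProcessEstimator may compute the indices differently depending on the strategy and agreement settings.

```python
import statistics


def cp(data, lower, upper):
    """Textbook Cp: specification width relative to the 6-sigma spread."""
    sigma = statistics.stdev(data)
    return (upper - lower) / (6 * sigma)


def cpk(data, lower=None, upper=None):
    """Textbook Cpk: distance from the mean to the nearest limit,
    expressed in units of 3 sigma."""
    mean = statistics.mean(data)
    sigma = statistics.stdev(data)
    distances = []
    if upper is not None:
        distances.append(upper - mean)
    if lower is not None:
        distances.append(mean - lower)
    if not distances:
        raise ValueError('At least one specification limit is required.')
    return min(distances) / (3 * sigma)


data = [9.0, 10.0, 11.0, 10.0, 10.0]  # hypothetical measurements
print(cp(data, lower=7.0, upper=13.0))  # needs both limits
print(cpk(data, upper=13.0))            # works with a single limit
```

Cp ignores where the process is centered, so it needs both limits to define a width. Cpk only needs the distance to the nearest limit, which is why it can still be evaluated with a one-sided specification.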

n_groups

Number of groups (variable combinations) for the Bonferroni adjustment. A good way to do this is to pass df.groupby(list_of_variates).ngroups, where list_of_variates is a list containing all the categorical columns in the source that will be used for the chart to split the data into groups (hue, categorical features, etc.). Specify 1 to skip the Bonferroni adjustment, by default 1.

TYPE: int DEFAULT: 1
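A minimal sketch of how n_groups could be determined and what a generic Bonferroni adjustment does to the confidence level of each individual interval. The column names and values are illustrative, and the adjustment shown is the textbook formula; how DaSPi applies it internally is an assumption here.

```python
import pandas as pd

df = pd.DataFrame({
    'x': ['first', 'first', 'second', 'second', 'third', 'third'],
    'hue': ['a', 'b', 'a', 'b', 'a', 'b'],
    'y': [3.1, 2.9, 4.2, 3.8, 2.1, 1.9],
})

# All categorical columns that split the data into groups in the chart.
list_of_variates = ['x', 'hue']
n_groups = df.groupby(list_of_variates).ngroups

# Generic Bonferroni adjustment: divide the error probability by the
# number of simultaneous intervals to hold the family-wise level.
confidence_level = 0.95
alpha = 1 - confidence_level
adjusted_level = 1 - alpha / n_groups

print(n_groups, adjusted_level)
```

With more groups, each individual interval is computed at a higher confidence level and therefore becomes wider.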

feature

Column name of the feature variable for the plot, by default ''.

TYPE: str DEFAULT: ''

show_center

Flag indicating whether to show the center points, by default True.

TYPE: bool DEFAULT: True

bars_same_color

Flag indicating whether to use the same color for the error bars as for the center markers. If False, the error bars are black, by default False.

TYPE: bool DEFAULT: False

skip_na

Flag indicating whether to skip missing values in the feature grouped data, by default None.

- None: no missing values are skipped
- 'all': grouped data is skipped if all values are missing
- 'any': grouped data is skipped if any value is missing

TYPE: Literal['none', 'all', 'any'] DEFAULT: None
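The three options can be read as the following filter rule for one group of target values. This is only a sketch of the documented semantics; the function keep_group is hypothetical and not part of DaSPi.

```python
import math


def keep_group(values, skip_na=None):
    """Return True if a group of target values should be kept."""
    # Treat both None and float NaN as missing.
    missing = [v is None or (isinstance(v, float) and math.isnan(v))
               for v in values]
    if skip_na == 'all':
        return not all(missing)
    if skip_na == 'any':
        return not any(missing)
    return True  # None: nothing is skipped


group = [1.2, float('nan'), 0.9]  # hypothetical group with one gap
print(keep_group(group))
print(keep_group(group, 'all'))
print(keep_group(group, 'any'))
```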

target_on_y

Flag indicating whether the target variable is plotted on the y-axis, by default True.

TYPE: bool DEFAULT: True

confidence_level

Confidence level for the confidence intervals, by default 0.95.

TYPE: float DEFAULT: 0.95

color

Color to be used to draw the artists. If None, the first color is taken from the color cycle, by default None.

TYPE: str | None DEFAULT: None

marker

The marker style for the center points, by default None. For the available markers see: https://matplotlib.org/stable/api/markers_api.html

TYPE: str | None DEFAULT: None

ax

The axes object for the plot. If None, the current axes is fetched using plt.gca(). If no axes are available, a new one is created. Defaults to None.

TYPE: Axes | None DEFAULT: None

visible_spines

Specifies which spines are visible, the others are hidden. If 'none', no spines are visible. If None, the spines are drawn according to the stylesheet. Defaults to None.

TYPE: Literal['target', 'feature', 'none'] | None DEFAULT: None

hide_axis

Specifies which axes should be hidden. If None, both axes are displayed. Defaults to None.

TYPE: Literal['target', 'feature', 'both'] | None DEFAULT: None

kw_estim

Additional keyword arguments that are passed to the ProcessEstimator class. Possible keyword arguments are:

- error_values: Tuple[float, ...] = ()
- strategy: Literal['eval', 'fit', 'norm', 'data'] = 'norm'
- agreement: float | int = 6
- possible_dists: Tuple[str | rv_continuous, ...] = DIST.COMMON

TYPE: Dict[str, Any] DEFAULT: {}

**kwds

Additional keyword arguments that have no effect and are only used to catch further arguments that have no use here (occurs when this class is used within chart objects).

DEFAULT: {}

Examples:

Apply to an existing Axes object:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from daspi import CapabilityConfidenceInterval, SpecLimits

fig, ax = plt.subplots()
df = pd.DataFrame(dict(
    x = ['first'] * 100 + ['second'] * 100 + ['third'] * 100,
    y = (list(np.random.normal(loc=3, scale=1, size=100))
        + list(np.random.normal(loc=4, scale=1, size=100))
        + list(np.random.normal(loc=2, scale=1, size=100)))))

test = CapabilityConfidenceInterval(
    source=df, target='y', feature='x', spec_limits=SpecLimits(upper=4.3), 
    kind='cpk', show_center=True, n_groups=df.x.nunique(), 
    confidence_level=0.95, bars_same_color=True, ax=ax)
test(kw_center=dict(s=30, marker='_'))
test.label_feature_ticks()

# If you are interested in the calculated values, you can get them like this:
print(test.source)

Apply using the plot method of a DaSPi Chart object:

import numpy as np
import daspi as dsp
import pandas as pd

df = pd.DataFrame(dict(
    x = ['first'] * 100 + ['second'] * 100 + ['third'] * 100,
    y = (list(np.random.normal(loc=3, scale=1, size=100))
        + list(np.random.normal(loc=4, scale=1, size=100))
        + list(np.random.normal(loc=2, scale=1, size=100)))))

chart = dsp.SingleChart(
        source=df,
        target='y',
        feature='x',
        categorical_feature=True, # needed to label the feature tick labels
    ).plot(
        dsp.CapabilityConfidenceInterval,
        kind='cpk',
        spec_limits=dsp.SpecLimits(upper=4.3),
        show_center=True,
        confidence_level=0.95,
        n_groups=df.x.nunique(),
        bars_same_color=True,
    ).label() # needed to label the feature tick labels

# If you are interested in the calculated values, you can get them like this:
df_cpk = chart.plots[0].source.copy()
df_cpk.index = chart.dodging.pos_to_ticklabels(df_cpk['x'])
print(df_cpk)

processes = {} instance-attribute

ProcessEstimator classes used to calculate the cp and cpk values. One for each feature level.

- key: feature level as str
- value: ProcessEstimator instance

spec_limits = spec_limits instance-attribute

Spec limits used for calculating the capability values.

kind = kind instance-attribute

Whether to calculate the confidence interval for Cp or Cpk ('cp' or 'cpk').

kw_estim = kw_estim instance-attribute

Additional keyword arguments that are passed to the ProcessEstimator classes.

hide_feature_axis()

Hide the feature axis (spine, ticks and labels).

transform(feature_data, target_data)

Perform the transformation on the target data by using the given function 'ci_func' and return the transformed data.

PARAMETER DESCRIPTION
feature_data

Base location (offset) of feature axis coming from feature_grouped generator.

TYPE: float | int

target_data

Feature grouped target data used for transformation, coming from feature_grouped generator.

TYPE: pandas Series

RETURNS DESCRIPTION
data

The transformed data source for the plot.

TYPE: pandas DataFrame