Confidence interval

daspi.plotlib.plotter.ConfidenceInterval(source, target, n_groups=1, feature='', show_center=True, bars_same_color=False, skip_na=None, target_on_y=True, confidence_level=0.95, ci_func=mean_ci, color=None, marker=None, ax=None, visible_spines=None, hide_axis=None, **kwds)

Bases: Errorbar

Class for creating plotters whose error bars serve as a visual test for differences between groups.

This class is useful for visually testing whether there is a statistically significant difference between groups or conditions. By plotting confidence intervals around the actual value, it provides a visual representation of the uncertainty in the estimate and allows for a quick assessment of whether the intervals overlap or not.

PARAMETER DESCRIPTION
source

Pandas long format DataFrame containing the data source for the plot.

TYPE: pandas DataFrame

target

Column name of the target variable for the plot.

TYPE: str

n_groups

Number of groups (variable combinations) for the Bonferroni adjustment. A convenient way to obtain this number is df.groupby(list_of_variates).ngroups, where list_of_variates is a list of all categorical columns in the source that the chart uses to split the data into groups (hue, categorical features, etc.). Specify 1 to skip the Bonferroni adjustment. Default is 1.

TYPE: int DEFAULT: 1
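
The following sketch illustrates how a value for n_groups can be obtained and what a standard Bonferroni correction does to the confidence level. It assumes the textbook form of the correction (the error probability is divided by the number of groups); the source does not confirm that DaSPi applies it in exactly this way. The helper bonferroni_level and the column names are hypothetical.

```python
import pandas as pd


def bonferroni_level(confidence_level, n_groups):
    """Per-interval confidence level after a standard Bonferroni
    correction (hypothetical helper, not part of DaSPi)."""
    alpha = 1 - confidence_level
    return 1 - alpha / n_groups


# Hypothetical long-format data with two categorical columns
df = pd.DataFrame({
    'machine': ['A', 'A', 'B', 'B', 'C', 'C'],
    'shift': ['day', 'night', 'day', 'night', 'day', 'night'],
    'y': [1.0, 1.2, 0.9, 1.1, 1.3, 1.4],
})
list_of_variates = ['machine', 'shift']

# Number of variable combinations that actually occur in the data
n_groups = df.groupby(list_of_variates).ngroups

# Each single interval is drawn at a stricter level than 0.95
adjusted = bonferroni_level(0.95, n_groups)
```

With n_groups=1 the helper returns the confidence level unchanged, which matches the documented meaning of "no adjustment".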

feature

Column name of the feature variable for the plot, by default ''.

TYPE: str DEFAULT: ''

show_center

Flag indicating whether to show the center points, by default True.

TYPE: bool DEFAULT: True

bars_same_color

Flag indicating whether the error bars use the same color as the center markers. If False, the error bars are black, by default False.

TYPE: bool DEFAULT: False

skip_na

Flag indicating whether to skip missing values in the feature grouped data, by default None.
- None: no missing values are skipped
- 'all': grouped data is skipped if all values are missing
- 'any': grouped data is skipped if any value is missing

TYPE: Literal['none', 'all', 'any'] DEFAULT: None
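
The three skip_na modes can be illustrated with plain pandas. This is only a sketch of the documented rules, not DaSPi's implementation; the helper keep_group and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def keep_group(values, skip_na=None):
    """Decide whether a group is kept under the documented skip_na
    rules (hypothetical helper, not part of DaSPi)."""
    if skip_na == 'all':
        return not values.isna().all()
    if skip_na == 'any':
        return not values.isna().any()
    return True  # None: nothing is skipped


df = pd.DataFrame({
    'x': ['a', 'a', 'b', 'b', 'c', 'c'],
    'y': [1.0, 2.0, np.nan, 3.0, np.nan, np.nan],
})

# Names of the groups that survive under each mode
kept = {
    mode: [name for name, grp in df.groupby('x')['y']
           if keep_group(grp, mode)]
    for mode in (None, 'all', 'any')
}
```

Group 'a' has no missing values, 'b' has one and 'c' has only missing values, so each mode keeps a different subset.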

target_on_y

Flag indicating whether the target variable is plotted on the y-axis, by default True.

TYPE: bool DEFAULT: True

confidence_level

Confidence level for the confidence intervals, by default 0.95.

TYPE: float DEFAULT: 0.95

ci_func

Function for calculating the confidence intervals. Two arguments are passed to the function: the sample data and the confidence level. It must return three floats in this order: center value, lower confidence limit and upper confidence limit. Default is daspi.statistics.confidence.mean_ci.

TYPE: Callable DEFAULT: mean_ci
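
A user-defined function can follow the same contract. The sketch below is a percentile bootstrap interval for the median, written with NumPy only. The name median_ci and its extra keyword arguments n_boot and seed are hypothetical, and it is an assumption that the sample and the confidence level are passed positionally in that order.

```python
import numpy as np


def median_ci(sample, level=0.95, n_boot=2000, seed=0):
    """Percentile bootstrap confidence interval for the median.

    Returns (center, lower, upper) as three floats, the order the
    ci_func parameter description asks for. Hypothetical example,
    not part of DaSPi.
    """
    data = np.asarray(sample, dtype=float)
    data = data[~np.isnan(data)]  # ignore missing values
    rng = np.random.default_rng(seed)
    # Draw n_boot resamples with replacement, one per row
    resamples = rng.choice(data, size=(n_boot, data.size), replace=True)
    medians = np.median(resamples, axis=1)
    alpha = 1 - level
    lower, upper = np.quantile(medians, [alpha / 2, 1 - alpha / 2])
    return float(np.median(data)), float(lower), float(upper)


sample = np.random.default_rng(1).normal(loc=3, scale=1, size=50)
center, lower, upper = median_ci(sample, 0.95)
```

Under these assumptions the function could be handed over as ci_func=median_ci. The fixed seed keeps the interval reproducible between redraws of the same chart.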

color

Color to be used to draw the artists. If None, the first color is taken from the color cycle, by default None.

TYPE: str | None DEFAULT: None

marker

The marker style for the center points, by default None. For the available markers see: https://matplotlib.org/stable/api/markers_api.html

TYPE: str | None DEFAULT: None

ax

The axes object for the plot. If None, the current axes is fetched using plt.gca(). If no axes are available, a new one is created. Defaults to None.

TYPE: Axes | None DEFAULT: None

visible_spines

Specifies which spines are visible, the others are hidden. If 'none', no spines are visible. If None, the spines are drawn according to the stylesheet. Defaults to None.

TYPE: Literal['target', 'feature', 'none'] | None DEFAULT: None

hide_axis

Specifies which axes should be hidden. If None, both axes are displayed. Defaults to None.

TYPE: Literal['target', 'feature', 'both'] | None DEFAULT: None

**kwds

Additional keyword arguments that have no effect and are only used to catch further arguments that have no use here (occurs when this class is used within chart objects).

DEFAULT: {}

Examples:

Apply to an existing Axes object:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from daspi import ConfidenceInterval, variance_ci

fig, ax = plt.subplots()
df = pd.DataFrame(dict(
    x = ['first'] * 50 + ['second'] * 50 + ['third'] * 50,
    y = (
        list(np.random.normal(loc=3, scale=1, size=50))
        + list(np.random.normal(loc=4, scale=1, size=50))
        + list(np.random.normal(loc=2, scale=1, size=50)))))
ci = ConfidenceInterval(
    source=df, target='y', feature='x', show_center=True, ci_func=variance_ci,
    n_groups=df.x.nunique(), confidence_level=0.95, bars_same_color=True,
    ax=ax)
ci(kw_center=dict(s=30, marker='_'))
ci.label_feature_ticks()

Apply using the plot method of a DaSPi Chart object:

import numpy as np
import daspi as dsp
import pandas as pd

df = pd.DataFrame(dict(
    x = ['first'] * 50 + ['second'] * 50 + ['third'] * 50,
    y = (
        list(np.random.normal(loc=1, scale=3, size=50))
        + list(np.random.normal(loc=1, scale=4, size=50))
        + list(np.random.normal(loc=1, scale=2, size=50)))))
chart = dsp.SingleChart(
        source=df,
        target='y',
        feature='x',
        categorical_feature=True, # needed to label the feature tick labels
    ).plot(
        dsp.ConfidenceInterval,
        show_center=True,
        ci_func=dsp.variance_ci,
        n_groups=df.x.nunique(),
        confidence_level=0.95,
        bars_same_color=True,
        kw_call=dict(kw_center=dict(s=30, marker='_'))
    ).label() # needed to label the feature tick labels

confidence_level = confidence_level instance-attribute

Confidence level for the confidence intervals.

ci_func = ci_func instance-attribute

Provided function for calculating the confidence intervals.

n_groups = n_groups instance-attribute

Number of groups (variable combinations) used for the Bonferroni adjustment.

transform(feature_data, target_data)

Perform the transformation on the target data by using the given function `ci_func` and return the transformed data.

PARAMETER DESCRIPTION
feature_data

Base location (offset) on the feature axis, coming from the feature_grouped generator.

TYPE: float | int

target_data

Feature grouped target data used for the transformation, coming from the feature_grouped generator.

TYPE: pandas Series

RETURNS DESCRIPTION
data

The transformed data source for the plot.

TYPE: pandas DataFrame
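
As a rough mental model, the transformation can be pictured as applying ci_func to one group of target values and collecting the result in a one-row DataFrame. The sketch below is not DaSPi's implementation: the function names, the column names and the normal-approximation interval are all assumptions made for illustration.

```python
from statistics import NormalDist

import numpy as np
import pandas as pd


def simple_mean_ci(sample, level):
    """Normal-approximation interval for the mean (illustrative only,
    not DaSPi's mean_ci)."""
    data = np.asarray(sample, dtype=float)
    center = data.mean()
    std_err = data.std(ddof=1) / np.sqrt(data.size)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return float(center), float(center - z * std_err), float(center + z * std_err)


def transform_sketch(feature_data, target_data, ci_func, level):
    """Hypothetical stand-in for transform: one row per group.
    The column names are invented for this example."""
    center, lower, upper = ci_func(target_data, level)
    return pd.DataFrame({
        'feature': [feature_data],
        'center': [center],
        'lower': [lower],
        'upper': [upper],
    })


target_data = pd.Series([2.9, 3.1, 3.4, 2.7, 3.0])
data = transform_sketch(0, target_data, simple_mean_ci, 0.95)
```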