Probability

daspi.plotlib.plotter.Probability(source, target, dist='norm', kind='sq', target_on_y=True, color=None, marker=None, show_scatter=True, show_fit_ci=True, show_pred_ci=True, ax=None, visible_spines=None, hide_axis=None, **kwds)

Bases: LinearRegressionLine

A Q-Q and P-P probability plotter that extends the LinearRegressionLine class.

PARAMETER DESCRIPTION
source

Pandas long format DataFrame containing the data source for the plot.

TYPE: pandas DataFrame

target

Column name of the target variable for the plot.

TYPE: str

dist

The probability distribution used for creating feature data (the theoretical values).

TYPE: scipy stats rv_continuous DEFAULT: 'norm'

kind

The type of probability plot to create. The first letter corresponds to the target, the second to the feature. Defaults to 'sq':

- qq: target = sample quantiles; feature = theoretical quantiles
- pp: target = sample percentiles; feature = theoretical percentiles
- sq: target = sample data; feature = theoretical quantiles
- sp: target = sample data; feature = theoretical percentiles

TYPE: Literal['qq', 'pp', 'sq', 'sp'] DEFAULT: 'sq'
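To make the difference between the quantile and percentile variants concrete, the following is a minimal sketch of the textbook construction of Q-Q and P-P pairs. It is not DaSPi's implementation: the Hazen plotting positions, the normal fit via `statistics.NormalDist`, and all variable names are assumptions chosen for illustration, and only the standard library is used so the snippet runs without extra dependencies.

```python
import random
from statistics import NormalDist

# Illustrative sample; sorted values serve as the sample quantiles.
rng = random.Random(0)
sample = sorted(rng.gauss(10.0, 2.0) for _ in range(50))
n = len(sample)

# Empirical plotting positions (Hazen's rule, an assumption of this sketch).
positions = [(i - 0.5) / n for i in range(1, n + 1)]

# Fit a normal distribution to the sample (stand-in for dist='norm').
fitted = NormalDist.from_samples(sample)

# Q-Q: theoretical quantiles are the inverse CDF at the plotting positions.
theoretical_quantiles = [fitted.inv_cdf(p) for p in positions]

# P-P: theoretical percentiles are the fitted CDF at the sample values.
theoretical_percentiles = [fitted.cdf(x) for x in sample]

# (target, feature) pairs in the textbook sense.
pairs = {
    "qq": (sample, theoretical_quantiles),
    "pp": (positions, theoretical_percentiles),
}
```

If the sample follows the fitted distribution, both pairs fall close to a straight line, which is what the regression line and its intervals are meant to help judge.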

target_on_y

Flag indicating whether the target variable is plotted on the y-axis, by default True.

TYPE: bool DEFAULT: True

color

Color to be used to draw the artists. If None, the first color is taken from the color cycle, by default None.

TYPE: str | None DEFAULT: None

marker

The marker style for the scatter plot. For available markers, see https://matplotlib.org/stable/api/markers_api.html. Defaults to None.

TYPE: str | None DEFAULT: None

show_scatter

Flag indicating whether to show the individual points, by default True.

TYPE: bool DEFAULT: True

show_fit_ci

Flag indicating whether to show the confidence interval for the fitted line as filled area, by default True.

TYPE: bool DEFAULT: True

show_pred_ci

Flag indicating whether to show the confidence interval for predictions as additional lines, by default True.

TYPE: bool DEFAULT: True
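The two interval flags refer to different quantities: the interval for the fitted line describes uncertainty in the mean response, while the interval for predictions also includes the scatter of individual observations and is therefore wider. The sketch below illustrates this for a simple least-squares line. It is an assumption-laden illustration, not DaSPi's code: the 95 % level, the synthetic data, the helper name `half_widths`, and the use of a normal quantile as a large-sample stand-in for the Student t quantile are all choices made here.

```python
import math
import random
from statistics import NormalDist, fmean

# Synthetic data for illustration only.
rng = random.Random(1)
x = [i / 10 for i in range(100)]
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 0.3) for xi in x]
n = len(x)

# Ordinary least-squares fit of y = intercept + slope * x.
x_mean, y_mean = fmean(x), fmean(y)
sxx = sum((xi - x_mean) ** 2 for xi in x)
slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
intercept = y_mean - slope * x_mean

# Residual standard error with n - 2 degrees of freedom.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

# Normal quantile as a large-sample approximation of the t quantile (95 %).
z = NormalDist().inv_cdf(0.975)


def half_widths(x0):
    """Return (fit, prediction) interval half-widths at x0 (hypothetical helper)."""
    leverage = 1 / n + (x0 - x_mean) ** 2 / sxx
    fit = z * s * math.sqrt(leverage)        # uncertainty of the fitted mean
    pred = z * s * math.sqrt(1 + leverage)   # adds the observation noise
    return fit, pred


fit_mid, pred_mid = half_widths(x_mean)
fit_edge, pred_edge = half_widths(max(x))
```

Both intervals are narrowest at the mean of the feature and widen toward the edges, and the prediction interval always encloses the interval for the fit.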

ax

The axes object for the plot. If None, the current axes is fetched using plt.gca(). If no axes are available, a new one is created. Defaults to None.

TYPE: Axes | None DEFAULT: None

visible_spines

Specifies which spines are visible, the others are hidden. If 'none', no spines are visible. If None, the spines are drawn according to the stylesheet. Defaults to None.

TYPE: Literal['target', 'feature', 'none'] | None DEFAULT: None

hide_axis

Specifies which axes should be hidden. If None, both axes are displayed. Defaults to None.

TYPE: Literal['target', 'feature', 'both'] | None DEFAULT: None

**kwds

These arguments have no effect. They only serve to catch additional arguments that are not used here (this occurs when the class is used within chart objects).

DEFAULT: {}

Examples:

Apply to an existing Axes object:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from daspi import Probability

fig, ax = plt.subplots()
df = pd.DataFrame(dict(
    y = np.random.weibull(a=1, size=100)))
prob_line = Probability(
    source=df, target='y', kind='pp', show_scatter=True, show_fit_ci=True,
    ax=ax)
prob_line(
    kw_scatter=dict(color='black', s=10, alpha=0.5),
    kw_fit_ci=dict(color='skyblue'),
    color='deepskyblue')

Apply using the plot method of a DaSPi Chart object:

import numpy as np
import daspi as dsp
import pandas as pd

df = pd.DataFrame(dict(
    y = np.random.weibull(a=1, size=100)))
chart = dsp.SingleChart(
        source=df,
        target='y',
        feature='x'
    ).plot(
        dsp.Probability,
        kind='pp',
        show_scatter=True,
        show_fit_ci=True,
        kw_call=dict(
            kw_scatter=dict(color='black', s=10, alpha=0.5),
            kw_fit_ci=dict(color='skyblue'),
            color='deepskyblue')
    )

RAISES DESCRIPTION
AssertionError

If the given kind is not one of 'qq', 'pp', 'sq' or 'sp'.

kind = kind instance-attribute

The type of probability plot to create.

dist = DistributionEstimator(source[target], dist) instance-attribute

The distribution estimator used for creating feature data.

sample_data property

Get fitted samples (target data) according to the given kind.

theoretical_data property

Get theoretical data (quantiles or percentiles) according to the given kind.

format_axis()

Format the x-axis and y-axis based on the probability plot type.