Pareto

`daspi.plotlib.plotter.Pareto(source, target, feature, highlight=None, highlight_color=COLOR.BAD, highlighted_as_last=True, no_percentage_line=False, width=CATEGORY.FEATURE_SPACE, method=None, kw_method={}, skip_na=None, target_on_y=True, color=None, ax=None, visible_spines=None, hide_axis=None, **kwds)`

Bases: Bar

A plotter that draws a Pareto chart. It extends the Bar plotter class.

A Pareto chart is a type of chart that combines a bar graph and a line graph. It is used to display and analyze data in order to prioritize and identify the most significant factors contributing to a particular phenomenon or problem. The line graph in a Pareto chart shows the cumulative percentage of the total, which helps identify the point at which a significant portion of the cumulative total is reached.
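The cumulative percentage line described above can be illustrated with plain Python. This is only a sketch of the arithmetic, not the code daspi runs; the defect names and counts are made-up sample data.

```python
from itertools import accumulate

# Made-up sample data: number of occurrences per defect type
counts = {'dent': 30, 'scratch': 45, 'crack': 10, 'stain': 15}

# Sort the categories by descending count, as the bars of a Pareto chart are
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Running total of the sorted counts, expressed as percent of the grand total
total = sum(value for _, value in ordered)
cumulative = list(accumulate(value for _, value in ordered))
cumulative_pct = [100 * running / total for running in cumulative]

for (name, value), pct in zip(ordered, cumulative_pct):
    print(f'{name:<8}{value:>4}{pct:>8.1f} %')
```

In this sample the first two categories already account for 75 % of all occurrences, which is the kind of cut-off the line graph makes visible.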

Pareto charts are commonly used in quality control, process improvement, and decision-making processes. They allow users to visually identify and focus on the most significant factors that contribute to a problem or outcome, enabling them to allocate resources and address the most critical issues first.

PARAMETER	DESCRIPTION
`source`	Pandas long format DataFrame containing the data source for the plot. TYPE: `pandas DataFrame`
`target`	Column name of the target variable for the plot. TYPE: `str`
`feature`	Column name of the feature variable for the plot. TYPE: `str`
`highlight`	The feature value whose bar should be highlighted in the diagram, by default None. TYPE: `Any` DEFAULT: `None`
`highlight_color`	The color to use for highlighting, by default `COLOR.BAD`. TYPE: `str` DEFAULT: `BAD`
`highlighted_as_last`	Whether the highlighted bar should be at the end, by default True. TYPE: `bool` DEFAULT: `True`
`no_percentage_line`	Whether to omit the cumulative percentage line, by default False. TYPE: `bool` DEFAULT: `False`
`width`	Width of the bars, by default `CATEGORY.FEATURE_SPACE`. TYPE: `float` DEFAULT: `FEATURE_SPACE`
`method`	A pandas Series method used to aggregate the target values within each feature level, such as 'sum', 'count' or any similar method that returns a scalar, by default None. TYPE: `str` DEFAULT: `None`
`kw_method`	Additional keyword arguments to be passed to the method, by default {}. TYPE: `dict` DEFAULT: `{}`
`skip_na`	Flag indicating whether to skip missing values in the feature grouped data, by default None. `None`: no missing values are skipped; `'all'`: grouped data is skipped if all values are missing; `'any'`: grouped data is skipped if any value is missing. TYPE: `Literal['none', 'all', 'any']` DEFAULT: `None`
`target_on_y`	Flag indicating whether the target variable is plotted on the y-axis, by default True. TYPE: `bool` DEFAULT: `True`
`color`	Color to be used to draw the artists. If None, the first color is taken from the color cycle, by default None. TYPE: `str \| None` DEFAULT: `None`
`ax`	The axes object for the plot. If None, the current axes is fetched using `plt.gca()`. If no axes are available, a new one is created. Defaults to None. TYPE: `Axes \| None` DEFAULT: `None`
`visible_spines`	Specifies which spines are visible, the others are hidden. If 'none', no spines are visible. If None, the spines are drawn according to the stylesheet. Defaults to None. TYPE: `Literal['target', 'feature', 'none'] \| None` DEFAULT: `None`
`hide_axis`	Specifies which axes should be hidden. If None, both axes are displayed. Defaults to None. TYPE: `Literal['target', 'feature', 'both'] \| None` DEFAULT: `None`
`**kwds`	These arguments have no effect. They only serve to catch further arguments that are not used here (this occurs when the class is used within chart objects). DEFAULT: `{}`
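The `method`, `kw_method` and `skip_na` parameters describe an aggregation of the target values per feature level. The following is a rough pandas sketch of that idea, under the assumption that the aggregation behaves like calling the named Series method on each group. The helper `aggregate_per_feature` and the sample data are hypothetical and not part of daspi.

```python
import pandas as pd

def aggregate_per_feature(
        source, target, feature, method, kw_method=None, skip_na=None):
    """Hypothetical helper (not part of daspi): aggregate the target
    column per feature level using the named pandas Series method."""
    kw_method = kw_method or {}
    results = {}
    # sort=False keeps the feature levels in order of first appearance
    for level, values in source.groupby(feature, sort=False)[target]:
        # 'all': drop the group only if every value is missing
        if skip_na == 'all' and values.isna().all():
            continue
        # 'any': drop the group as soon as one value is missing
        if skip_na == 'any' and values.isna().any():
            continue
        # Look up the Series method by name, e.g. values.sum(**kw_method)
        results[level] = getattr(values, method)(**kw_method)
    return pd.Series(results, name=target)

# Made-up long format data with one missing target value
df = pd.DataFrame({
    'defect': ['scratch', 'dent', 'scratch', 'stain', 'dent', 'scratch'],
    'cost': [4.0, 10.0, 6.0, None, 5.0, 2.0],
})
totals = aggregate_per_feature(
    df, target='cost', feature='defect', method='sum', skip_na='all')
print(totals)
```

Here the 'stain' group contains only a missing value, so `skip_na='all'` drops it, whereas `skip_na=None` would keep it in the result.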

Examples:

Apply to an existing Axes object:

import pandas as pd
import matplotlib.pyplot as plt
from daspi import Pareto

fig, ax = plt.subplots()
df = pd.DataFrame(dict(
    x = list('abcdefghijklmno'),
    y = list(100/x for x in range(1, 16))))
pareto = Pareto(
    source=df, target='y', feature='x', ax=ax)
pareto()

You can also combine and highlight small frequencies:

import pandas as pd
import matplotlib.pyplot as plt
from daspi import Pareto

fig, ax = plt.subplots()
df = pd.DataFrame(dict(
    x = list('abcdefghijklmno'),
    y = list(100/x for x in range(1, 16))))
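# Mark the categories with small values ...
low_values = df.y <= 10
# ... keep only the larger ones ...
df2 = df[~low_values].copy()
# ... and append the sum of the small ones as a single 'rest' category
df2.loc[len(df)-sum(low_values)] = ('rest', df[low_values].y.sum())
pareto = Pareto(
    source=df2, target='y', feature='x', highlight='rest',
    highlight_color='#ff000090', highlighted_as_last=True)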
pareto()

Apply using the plot method of a DaSPi Chart object:

import daspi as dsp
import pandas as pd

df = pd.DataFrame(dict(
    x = list('abcdefghijklmno'),
    y = list(100/x for x in range(1, 16))))
low_values = df.y <= 10
df2 = df[~low_values].copy()
df2.loc[len(df)-sum(low_values)] = ('rest', df[low_values].y.sum())
chart = dsp.SingleChart(
        source=df2,
        target='y',
        feature='x',
    ).plot(
        dsp.Pareto,
        highlight='rest',
        highlight_color=dsp.COLOR.BAD,
        no_percentage_line=False
    )

RAISES	DESCRIPTION
`AssertionError`	If 'categorical_feature' is True, coming from Chart objects.
`AssertionError`	If another Axes object in this Figure instance shares the feature axis.

`no_percentage_line = no_percentage_line` `instance-attribute`

Whether to omit the percentage line and the percentage texts.

`highlight = highlight` `instance-attribute`

The feature value whose bar should be highlighted in the chart.

`highlight_color = highlight_color` `instance-attribute`

The color to use for highlighting.

`highlighted_as_last = highlighted_as_last` `instance-attribute`

Whether the highlighted bar should be at the end.

`shared_feature_axes` `property`

True if any other Axes object in this figure shares the feature axis.

`indices` `property`

Get arranged index values to access the target data (from source data) in the order to be plotted.

`x` `property`

Get the values used for the x-axis so that the target is displayed in descending order and the highlighted bar is at the end (if so specified).

`y` `property`

Get the values used for the y-axis so that the target is displayed in descending order and the highlighted bar is at the end (if so specified).
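The ordering that the `x` and `y` properties describe (descending target values, with the highlighted bar moved to the end if so specified) can be sketched in plain Python. The function `pareto_order` and the sample values are hypothetical illustrations; they do not reproduce the class's actual implementation.

```python
def pareto_order(values, highlight=None, highlighted_as_last=True):
    """Hypothetical helper (not part of daspi): return the labels sorted
    by descending value, optionally moving the highlighted label last."""
    labels = sorted(values, key=values.get, reverse=True)
    if highlighted_as_last and highlight in labels:
        # Take the highlighted label out of the sorted sequence
        # and put it at the very end
        labels.remove(highlight)
        labels.append(highlight)
    return labels

# Made-up aggregated target values per feature level
values = {'a': 100.0, 'b': 50.0, 'rest': 60.0, 'c': 25.0}

order = pareto_order(values, highlight='rest')
print(order)                       # labels in plotting order
print([values[k] for k in order])  # matching target values
```

With `highlighted_as_last=False`, or without a `highlight`, the result is a plain descending sort.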

`add_percentage_texts()`

Add percentage texts on top of the major grid lines.
