Pareto
daspi.plotlib.plotter.Pareto(source, target, feature, highlight=None, highlight_color=COLOR.BAD, highlighted_as_last=True, no_percentage_line=False, width=CATEGORY.FEATURE_SPACE, method=None, kw_method={}, skip_na=None, target_on_y=True, color=None, ax=None, visible_spines=None, hide_axis=None, **kwds)
Bases: Bar
A plotter that draws a Pareto chart. It extends the Bar plotter class.
A Pareto chart is a type of chart that combines a bar graph and a line graph. It is used to display and analyze data in order to prioritize and identify the most significant factors contributing to a particular phenomenon or problem. The line graph in a Pareto chart shows the cumulative percentage of the total, which helps identify the point at which a significant portion of the cumulative total is reached.
Pareto charts are commonly used in quality control, process improvement, and decision-making processes. They allow users to visually identify and focus on the most significant factors that contribute to a problem or outcome, enabling them to allocate resources and address the most critical issues first.
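The cumulative percentage described above can be worked through by hand. The following is a minimal sketch in plain Python, not the code daspi runs internally; the category names and counts are made up for illustration.

```python
from itertools import accumulate

# Hypothetical defect counts per category (illustrative data only).
counts = {"scratch": 42, "dent": 18, "misprint": 25, "crack": 10, "stain": 5}

# Bars: categories sorted by their contribution, largest first.
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
total = sum(counts.values())

# Line: running total expressed as a percentage of the grand total.
cumulative = [
    100 * running / total
    for running in accumulate(value for _, value in ordered)
]

for (name, value), percent in zip(ordered, cumulative):
    print(f"{name:<10}{value:>4}{percent:>8.1f} %")
```

In this data the first three categories already account for 85 % of all defects, which is the kind of cut-off point the percentage line makes visible.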
| PARAMETER | DESCRIPTION |
|---|---|
| source | Pandas long format DataFrame containing the data source for the plot. |
| target | Column name of the target variable for the plot. |
| feature | Column name of the feature variable for the plot. |
| highlight | The feature value whose bar should be highlighted in the diagram, by default None. |
| highlight_color | The color to use for highlighting, by default COLOR.BAD. |
| highlighted_as_last | Whether the highlighted bar should be placed at the end, by default True. |
| no_percentage_line | If True, the cumulative percentage line is not drawn, by default False. |
| width | Width of the bars, by default CATEGORY.FEATURE_SPACE. |
| method | A pandas Series method used to aggregate the target values within each feature level, such as 'sum', 'count' or any other method that returns a scalar, by default None. |
| kw_method | Additional keyword arguments to be passed to the method, by default {}. |
| skip_na | Flag indicating whether to skip missing values in the feature grouped data, by default None. None: no missing values are skipped. 'all': grouped data is skipped if all values are missing. 'any': grouped data is skipped if any value is missing. |
| target_on_y | Flag indicating whether the target variable is plotted on the y-axis, by default True. |
| color | Color to be used to draw the artists. If None, the first color is taken from the color cycle, by default None. |
| ax | The axes object for the plot. If None, the current axes is fetched. |
| visible_spines | Specifies which spines are visible; the others are hidden. If 'none', no spines are visible. If None, the spines are drawn according to the stylesheet. Defaults to None. |
| hide_axis | Specifies which axes should be hidden. If None, both axes are displayed. Defaults to None. |
| **kwds | These arguments have no effect. They only catch further arguments that have no use here (occurs when this class is used within chart objects). |
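To make the interplay of `method`, `kw_method` and `skip_na` more concrete, here is a rough sketch of one way such an aggregation could work with pandas. The helper `aggregate_target` and the sample data are hypothetical and not part of daspi; the actual implementation may differ.

```python
import pandas as pd


def aggregate_target(source, target, feature, method,
                     kw_method=None, skip_na=None):
    """Hypothetical helper (not daspi API): one scalar per feature level."""
    kw_method = kw_method or {}
    result = {}
    for level, values in source.groupby(feature, sort=False)[target]:
        missing = values.isna()
        # Assumed reading of skip_na: drop the whole group when the rule applies.
        if skip_na == 'all' and missing.all():
            continue
        if skip_na == 'any' and missing.any():
            continue
        # Look up the Series method by name, e.g. values.sum(**kw_method).
        result[level] = getattr(values, method)(**kw_method)
    return pd.Series(result, name=target)


df = pd.DataFrame({
    'machine': ['A', 'A', 'B', 'B', 'C', 'C'],
    'defects': [3.0, 5.0, 2.0, None, None, None],
})

print(aggregate_target(df, 'defects', 'machine', 'sum'))
print(aggregate_target(df, 'defects', 'machine', 'sum', skip_na='all'))
print(aggregate_target(df, 'defects', 'machine', 'count', skip_na='any'))
```

Under these assumptions, machine C (all values missing) drops out with `skip_na='all'`, and machine B drops out as well with `skip_na='any'`.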
Examples:
Apply to an existing Axes object:
import pandas as pd
import matplotlib.pyplot as plt
from daspi import Pareto

fig, ax = plt.subplots()
df = pd.DataFrame(dict(
    x=list('abcdefghijklmno'),
    y=list(100/x for x in range(1, 16))))
pareto = Pareto(
    source=df, target='y', feature='x', ax=ax)
pareto()
You can also combine and highlight small frequencies:
import pandas as pd
import matplotlib.pyplot as plt
from daspi import Pareto

fig, ax = plt.subplots()
df = pd.DataFrame(dict(
    x=list('abcdefghijklmno'),
    y=list(100/x for x in range(1, 16))))
low_values = df.y <= 10
df2 = df[~low_values].copy()
df2.loc[len(df)-sum(low_values)] = ('rest', df[low_values].y.sum())
pareto = Pareto(
    source=df2, target='y', feature='x', highlight='rest',
    highlight_color='#ff000090', highlighted_as_last=True)
pareto()
Apply using the plot method of a DaSPi Chart object:
import daspi as dsp
import pandas as pd

df = pd.DataFrame(dict(
    x=list('abcdefghijklmno'),
    y=list(100/x for x in range(1, 16))))
low_values = df.y <= 10
df2 = df[~low_values].copy()
df2.loc[len(df)-sum(low_values)] = ('rest', df[low_values].y.sum())
chart = dsp.SingleChart(
        source=df2,
        target='y',
        feature='x',
    ).plot(
        dsp.Pareto,
        highlight='rest',
        highlight_color=dsp.COLOR.BAD,
        no_percentage_line=False
    )
| RAISES | DESCRIPTION |
|---|---|
| AssertionError | If 'categorical_feature' is True, coming from Chart objects. |
| AssertionError | If another Axes object in this Figure instance shares the feature axis. |
no_percentage_line = no_percentage_line
instance-attribute
If True, the percentage line and the percentage texts are not drawn.
highlight = highlight
instance-attribute
The feature value whose bar should be highlighted in the chart.
highlight_color = highlight_color
instance-attribute
The color to use for highlighting.
highlighted_as_last = highlighted_as_last
instance-attribute
Whether the highlighted bar should be at the end.
shared_feature_axes
property
True if any other Axes in this figure shares the feature axis.
indices
property
Get arranged index values to access the target data (from source data) in the order to be plotted.
x
property
Get the values used for the x-axis so that the target is displayed in descending order and the highlighted bar is at the end (if so specified).
y
property
Get the values used for the y-axis so that the target is displayed in descending order and the highlighted bar is at the end (if so specified).
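The ordering described for `indices`, `x` and `y` (descending target values, with the highlighted bar optionally moved to the end) could look roughly like the sketch below. `pareto_order` is a hypothetical helper written for illustration and is not how daspi necessarily implements these properties.

```python
def pareto_order(values, highlight=None, highlighted_as_last=True):
    """Hypothetical helper (not daspi API): order (label, value) pairs.

    Pairs are sorted by value in descending order. If a highlight label is
    given and highlighted_as_last is True, that pair is moved to the end.
    """
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    if highlight is not None and highlighted_as_last:
        regular = [item for item in ordered if item[0] != highlight]
        special = [item for item in ordered if item[0] == highlight]
        ordered = regular + special
    return ordered


# Illustrative data: 'rest' collects several small categories.
data = {'a': 100.0, 'b': 50.0, 'rest': 61.5, 'c': 33.3}

labels_last = [label for label, _ in pareto_order(data, highlight='rest')]
labels_sorted = [
    label for label, _ in
    pareto_order(data, highlight='rest', highlighted_as_last=False)
]
print(labels_last)
print(labels_sorted)
```

Keeping a combined 'rest' bar at the end avoids it being mistaken for a single dominant cause, even when its summed value is larger than some individual categories.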
add_percentage_texts()
Add percentage texts above the major grid lines.