In a box plot, rows of data_frame are grouped together into a
box-and-whisker mark to visualize their distribution.
Each box spans from quartile 1 (Q1) to quartile 3 (Q3). The second
quartile (Q2) is marked by a line inside the box. By default, the
whiskers correspond to the box's edges +/- 1.5 times the interquartile
range (IQR: Q3-Q1); see “points” for other options.
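The default whisker rule can be checked by hand. The sketch below uses only the standard library; the helper name `whisker_fences` and the choice of the "inclusive" quartile method are assumptions made for illustration, and other quartile conventions give slightly different values for the same data.

```python
from statistics import quantiles


def whisker_fences(values, k=1.5):
    """Return (lower, upper) fences at Q1 - k*IQR and Q3 + k*IQR."""
    # method="inclusive" is an assumption; quartile conventions differ.
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
lower, upper = whisker_fences(sample)
print(lower, upper)  # -3.0 13.0
```

Here Q1 = 3 and Q3 = 7, so the IQR is 4 and the fences sit 6 units beyond each box edge.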
Parameters
data_frame
(DataFrame or array-like or dict) – This argument needs to be passed
for column names (and not keyword names) to be used. Array-like and dict
are transformed internally to a pandas DataFrame. Optional: if missing,
a DataFrame gets constructed under the hood using the other arguments.
x
(str or int or Series or array-like) – Either a name of a column in
data_frame, or a pandas Series or array_like object. Values from this
column or array_like are used to position marks along the x axis in
cartesian coordinates. Either x or y can optionally be a list of column
references or array_likes, in which case the data will be treated as if
it were ‘wide’ rather than ‘long’.
y
(str or int or Series or array-like) – Either a name of a column in
data_frame, or a pandas Series or array_like object. Values from this
column or array_like are used to position marks along the y axis in
cartesian coordinates. Either x or y can optionally be a list of column
references or array_likes, in which case the data will be treated as if
it were ‘wide’ rather than ‘long’.
color
(str or int or Series or array-like) – Either a name of a column in
data_frame, or a pandas Series or array_like object. Values from this
column or array_like are used to assign color to marks.
facet_row
(str or int or Series or array-like) – Either a name of a column in
data_frame, or a pandas Series or array_like object. Values from this
column or array_like are used to assign marks to facetted subplots in
the vertical direction.
facet_col
(str or int or Series or array-like) – Either a name of a column in
data_frame, or a pandas Series or array_like object. Values from this
column or array_like are used to assign marks to facetted subplots in
the horizontal direction.
facet_col_wrap
(int) – Maximum number of facet columns. Wraps the column variable at
this width, so that the column facets span multiple rows. Ignored if 0,
and forced to 0 if facet_row or a marginal is set.
facet_row_spacing
(float between 0 and 1) – Spacing between facet rows, in paper units.
Default is 0.03 or 0.07 when facet_col_wrap is used.
facet_col_spacing
(float between 0 and 1) – Spacing between facet columns, in paper
units. Default is 0.02.
hover_name
(str or int or Series or array-like) – Either a name of a column in
data_frame, or a pandas Series or array_like object. Values from this
column or array_like appear in bold in the hover tooltip.
hover_data
(str, or list of str or int, or Series or array-like, or dict) – Either
a name or list of names of columns in data_frame, or pandas Series, or
array_like objects, or a dict with column names as keys and values that
are True (for default formatting), False (to remove this column from
hover information), a formatting string (for example ‘:.3f’ or ‘|%a’),
list-like data to appear in the hover tooltip, or tuples with a bool or
formatting string as first element and list-like data to appear in hover
as second element. Values from these columns appear as extra data in
the hover tooltip.
custom_data
(str, or list of str or int, or Series or array-like) – Either a name or
list of names of columns in data_frame, or pandas Series, or array_like
objects. Values from these columns are extra data, to be used in widgets
or Dash callbacks for example. This data is not user-visible but is
included in events emitted by the figure (lasso selection etc.).
animation_frame
(str or int or Series or array-like) – Either a name of a column in
data_frame, or a pandas Series or array_like object. Values from this
column or array_like are used to assign marks to animation frames.
animation_group
(str or int or Series or array-like) – Either a name of a column in
data_frame, or a pandas Series or array_like object. Values from this
column or array_like are used to provide object-constancy across
animation frames: rows with matching animation_group values will be
treated as if they describe the same object in each frame.
category_orders
(dict with str keys and list of str values (default {})) – By default,
in Python 3.6+, the order of categorical values in axes, legends and
facets depends on the order in which these values are first encountered
in data_frame (and no order is guaranteed by default in Python below
3.6). This parameter is used to force a specific ordering of values per
column. The keys of this dict should correspond to column names, and the
values should be lists of strings corresponding to the specific display
order desired.
labels
(dict with str keys and str values (default {})) – By default, column
names are used in the figure for axis titles, legend entries and hovers.
This parameter allows this to be overridden. The keys of this dict
should correspond to column names, and the values should correspond to
the desired label to be displayed.
color_discrete_sequence
(list of str) – Strings should define valid CSS-colors. When color is
set and the values in the corresponding column are not numeric, values
in that column are assigned colors by cycling through
color_discrete_sequence in the order described in category_orders,
unless the value of color is a key in color_discrete_map. Various useful
color sequences are available in the plotly.express.colors submodules,
specifically plotly.express.colors.qualitative.
color_discrete_map
(dict with str keys and str values (default {})) – String values should
define valid CSS-colors. Used to override color_discrete_sequence to
assign specific colors to marks corresponding with specific values. Keys
in color_discrete_map should be values in the column denoted by color.
Alternatively, if the values of color are valid colors, the string
'identity' may be passed to cause them to be used directly.
orientation
(str, one of 'h' for horizontal or 'v' for vertical) – Default is 'v'
if x and y are provided and both continuous or both categorical;
otherwise 'v' ('h') if x (y) is categorical and y (x) is continuous;
otherwise 'v' ('h') if only x (y) is provided.
boxmode
(str (default 'group')) – One of 'group' or 'overlay'. In 'overlay'
mode, boxes are drawn on top of one another. In 'group' mode, boxes are
placed beside each other.
log_x
(boolean (default False)) – If True, the x-axis is log-scaled in
cartesian coordinates.
log_y
(boolean (default False)) – If True, the y-axis is log-scaled in
cartesian coordinates.
range_x
(list of two numbers) – If provided, overrides auto-scaling on the
x-axis in cartesian coordinates.
range_y
(list of two numbers) – If provided, overrides auto-scaling on the
y-axis in cartesian coordinates.
points
(str or boolean (default 'outliers')) – One of 'outliers',
'suspectedoutliers', 'all', or False. If 'outliers', only the sample
points lying outside the whiskers are shown. If 'suspectedoutliers', all
outlier points are shown and those less than 4*Q1-3*Q3 or greater than
4*Q3-3*Q1 are highlighted with the marker’s 'outliercolor'. If 'all',
all sample points are shown. If False, no sample points are shown and
the whiskers extend to the full range of the sample.
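The thresholds above are equivalent to Q1 - 3*IQR and Q3 + 3*IQR. The standard-library sketch below classifies a value against both sets of limits; the function name `classify` and its labels are hypothetical, and the inner limits assume the default 1.5*IQR whisker rule.

```python
def classify(value, q1, q3):
    """Label a value relative to the whisker and highlight thresholds."""
    iqr = q3 - q1
    # Highlight thresholds from the 'suspectedoutliers' description.
    if value < 4 * q1 - 3 * q3 or value > 4 * q3 - 3 * q1:
        return "highlighted outlier"
    # Default whisker limits: box edges +/- 1.5 * IQR.
    if value < q1 - 1.5 * iqr or value > q3 + 1.5 * iqr:
        return "outlier"
    return "inside whiskers"


# With Q1 = 3 and Q3 = 7: whisker limits are -3 and 13,
# highlight thresholds are -9 and 19.
for v in (5, 15, 25):
    print(v, classify(v, q1=3, q3=7))
```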
notched
(boolean (default False)) – If True, boxes are drawn with notches.
title
(str) – The figure title.
template
(str or dict or plotly.graph_objects.layout.Template instance) – The
figure template name (must be a key in plotly.io.templates) or
definition.
width
(int (default None)) – The figure width in pixels.
height
(int (default None)) – The figure height in pixels.
Returns
Return type
plotly.graph_objects.Figure