1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215
|
v0.11.0 (September 2020)
------------------------
.. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.4019146.svg
:target: https://doi.org/10.5281/zenodo.4019146
This is a major release with several important new features, enhancements to existing functions, and changes to the library. Highlights include an overhaul and modernization of the distributions plotting functions, more flexible data specification, new colormaps, and better narrative documentation.
For an overview of the new features and a guide to updating, see `this Medium post <https://medium.com/@michaelwaskom/announcing-the-release-of-seaborn-0-11-3df0341af042?source=friends_link&sk=85146c0b2f01d2b41d214f8c3835b697>`_.
Required keyword arguments
~~~~~~~~~~~~~~~~~~~~~~~~~~
|API|
Most plotting functions now require all of their parameters to be specified using keyword arguments. To ease adaptation, code without keyword arguments will trigger a ``FutureWarning`` in v0.11. In a future release (v0.12 or v0.13, depending on release cadence), this will become an error. Once keyword arguments are fully enforced, the signature of the plotting functions will be reorganized to accept ``data`` as the first and only positional argument (:pr:`2052,2081`).
Modernization of distribution functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The distribution module has been completely overhauled, modernizing the API and introducing several new functions and features within existing functions. Some new features are explained here; the :doc:`tutorial documentation <tutorial/distributions>` has also been rewritten and serves as a good introduction to the functions.
New plotting functions
^^^^^^^^^^^^^^^^^^^^^^
|Feature| |Enhancement|
First, three new functions, :func:`displot`, :func:`histplot` and :func:`ecdfplot` have been added (:pr:`2157`, :pr:`2125`, :pr:`2141`).
The figure-level :func:`displot` function is an interface to the various distribution plots (analogous to :func:`relplot` or :func:`catplot`). It can draw univariate or bivariate histograms, density curves, ECDFs, and rug plots on a :class:`FacetGrid`.
The axes-level :func:`histplot` function draws univariate or bivariate histograms with a number of features, including:
- mapping multiple distributions with a ``hue`` semantic
- normalization to show density, probability, or frequency statistics
- flexible parameterization of bin size, including proper bins for discrete variables
- adding a KDE fit to show a smoothed distribution over all bin statistics
- experimental support for histograms over categorical and datetime variables.
The axes-level :func:`ecdfplot` function draws univariate empirical cumulative distribution functions, using a similar interface.
Changes to existing functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|API| |Feature| |Enhancement| |Defaults|
Second, the existing functions :func:`kdeplot` and :func:`rugplot` have been completely overhauled (:pr:`2060,2104`).
The overhauled functions now share a common API with the rest of seaborn, they can show conditional distributions by mapping a third variable with a ``hue`` semantic, and they have been improved in numerous other ways. The github pull request (:pr:`2104`) has a longer explanation of the changes and the motivation behind them.
This is a necessarily API-breaking change. The parameter names for the positional variables are now ``x`` and ``y``, and the old names have been deprecated. Efforts were made to handle and warn when using the deprecated API, but it is strongly suggested to check your plots carefully.
Additionally, the statsmodels-based computation of the KDE has been removed. Because there were some inconsistencies between the way different parameters (specifically, ``bw``, ``clip``, and ``cut``) were implemented by each backend, this may cause plots to look different with non-default parameters. Support for using non-Gaussian kernels, which was available only in the statsmodels backend, has been removed.
Other new features include:
- several options for representing multiple densities (using the ``multiple`` and ``common_norm`` parameters)
- weighted density estimation (using the new ``weights`` parameter)
- better control over the smoothing bandwidth (using the new ``bw_adjust`` parameter)
- more meaningful parameterization of the contours that represent a bivariate density (using the ``thresh`` and ``levels`` parameters)
- log-space density estimation (using the new ``log_scale`` parameter, or by scaling the data axis before plotting)
- "bivariate" rug plots with a single function call (by assigning both ``x`` and ``y``)
Deprecations
^^^^^^^^^^^^
|API|
Finally, the :func:`distplot` function is now formally deprecated. Its features have been subsumed by :func:`displot` and :func:`histplot`. Some effort was made to gradually transition :func:`distplot` by adding the features in :func:`displot` and handling backwards compatibility, but this proved to be too difficult. The similarity in the names will likely cause some confusion during the transition, which is regrettable.
Related enhancements and changes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|API| |Feature| |Enhancement| |Defaults|
These additions facilitated new features (and forced changes) in :func:`jointplot` and :class:`JointGrid` (:pr:`2210`) and in :func:`pairplot` and :class:`PairGrid` (:pr:`2234`).
- Added support for the ``hue`` semantic in :func:`jointplot`/:class:`JointGrid`. This support is lightweight and simply delegates the mapping to the underlying axes-level functions.
- Delegated the handling of ``hue`` in :class:`PairGrid`/:func:`pairplot` to the plotting function when it understands ``hue``, meaning that (1) the zorder of scatterplot points will be determined by row in dataframe, (2) additional options for resolving hue (e.g. the ``multiple`` parameter) can be used, and (3) numeric hue variables can be naturally mapped when using :func:`scatterplot`.
- Added ``kind="hist"`` to :func:`jointplot`, which draws a bivariate histogram on the joint axes and univariate histograms on the marginal axes, as well as both ``kind="hist"`` and ``kind="kde"`` to :func:`pairplot`, which behaves likewise.
- The various modes of :func:`jointplot` that plot marginal histograms now use :func:`histplot` rather than :func:`distplot`. This slightly changes the default appearance and affects the valid keyword arguments that can be passed to customize the plot. Likewise, the marginal histogram plots in :func:`pairplot` now use :func:`histplot`.
Standardization and enhancements of data ingest
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|Feature| |Enhancement| |Docs|
The code that processes input data has been refactored and enhanced. In v0.11, this new code takes effect for the relational and distribution modules; other modules will be refactored to use it in future releases (:pr:`2071`).
These changes should be transparent for most use-cases, although they allow a few new features:
- Named variables for long-form data can refer to the named index of a :class:`pandas.DataFrame` or to levels in the case of a multi-index. Previously, it was necessary to call :meth:`pandas.DataFrame.reset_index` before using index variables (e.g., after a groupby operation).
- :func:`relplot` now has the same flexibility as the axes-level functions to accept data in long- or wide-format and to accept data vectors (rather than named variables) in long-form mode.
- The data parameter can now be a Python ``dict`` or an object that implements that interface. This is a new feature for wide-form data. For long-form data, it was previously supported but not documented.
- A wide-form data object can have a mixture of types; the non-numeric types will be removed before plotting. Previously, this caused an error.
- There are better error messages for other instances of data mis-specification.
See the new user guide chapter on :doc:`data formats <tutorial/data_structure>` for more information about what is supported.
Other changes
~~~~~~~~~~~~~
Documentation improvements
^^^^^^^^^^^^^^^^^^^^^^^^^^
- |Docs| Added two new chapters to the user guide, one giving an overview of the :doc:`types of functions in seaborn <tutorial/function_overview>`, and one discussing the different :doc:`data formats <tutorial/data_structure>` that seaborn understands.
- |Docs| Expanded the :doc:`color palette tutorial <tutorial/color_palettes>` to give more background on color theory and better motivate the use of color in statistical graphics.
- |Docs| Added more information to the :doc:`installation guidelines <installing>` and streamlined the :doc:`introduction <introduction>` page.
- |Docs| Improved cross-linking within the seaborn docs and between the seaborn and matplotlib docs.
Theming
^^^^^^^
- |API| The :func:`set` function has been renamed to :func:`set_theme` for more clarity about what it does. For the foreseeable future, :func:`set` will remain as an alias, but it is recommended to update your code.
Relational plots
^^^^^^^^^^^^^^^^
- |Enhancement| |Defaults| Reduced some of the surprising behavior of relational plot legends when using a numeric hue or size mapping (:pr:`2229`):
- Added an "auto" mode (the new default) that chooses between "brief" and "full" legends based on the number of unique levels of each variable.
- Modified the ticking algorithm for a "brief" legend to show up to 6 values and not to show values outside the limits of the data.
- Changed the approach to the legend title: the normal matplotlib legend title is used when only one variable is assigned a semantic mapping, whereas the old approach of adding an invisible legend artist with a subtitle label is used only when multiple semantic variables are defined.
- Modified legend subtitles to be left-aligned and to be drawn in the default legend title font size.
- |Enhancement| |Defaults| Changed how functions that use different representations for numeric and categorical data handle vectors with an ``object`` data type. Previously, data was considered numeric if it could be coerced to a float representation without error. Now, object-typed vectors are considered numeric only when their contents are themselves numeric. As a consequence, numbers that are encoded as strings will now be treated as categorical data (:pr:`2084`).
- |Enhancement| |Defaults| Plots with a ``style`` semantic can now generate an infinite number of unique dashes and/or markers by default. Previously, an error would be raised if the ``style`` variable had more levels than could be mapped using the default lists. The existing defaults were slightly modified as part of this change; if you need to exactly reproduce plots from earlier versions, refer to the `old defaults <https://github.com/mwaskom/seaborn/blob/v0.10.1/seaborn/relational.py#L24>`_ (:pr:`2075`).
- |Defaults| Changed how :func:`scatterplot` sets the default linewidth for the edges of the scatter points. New behavior is to scale with the point sizes themselves (on a plot-wise, not point-wise basis). This change also slightly reduces the default width when point sizes are not varied. Set ``linewidth=0.75`` to reproduce the previous behavior. (:pr:`2708`).
- |Enhancement| Improved support for datetime variables in :func:`scatterplot` and :func:`lineplot` (:pr:`2138`).
- |Fix| Fixed a bug where :func:`lineplot` did not pass the ``linestyle`` parameter down to matplotlib (:pr:`2095`).
- |Fix| Adapted to a change in matplotlib that prevented passing vectors of literal values to ``c`` and ``s`` in :func:`scatterplot` (:pr:`2079`).
Categorical plots
^^^^^^^^^^^^^^^^^
- |Enhancement| |Defaults| |Fix| Fixed a few computational issues in :func:`boxenplot` and improved its visual appearance (:pr:`2086`):
- Changed the default method for computing the number of boxes to``k_depth="tukey"``, as the previous default (``k_depth="proportion"``) is based on a heuristic that produces too many boxes for small datasets.
- Added the option to specify the specific number of boxes (e.g. ``k_depth=6``) or to plot boxes that will cover most of the data points (``k_depth="full"``).
- Added a new parameter, ``trust_alpha``, to control the number of boxes when ``k_depth="trustworthy"``.
- Changed the visual appearance of :func:`boxenplot` to more closely resemble :func:`boxplot`. Notably, thin boxes will remain visible when the edges are white.
- |Enhancement| Allowed :func:`catplot` to use different values on the categorical axis of each facet when axis sharing is turned off (e.g. by specifying ``sharex=False``) (:pr:`2196`).
- |Enhancement| Improved the error messages produced when categorical plots process the orientation parameter.
- |Enhancement| Added an explicit warning in :func:`swarmplot` when more than 5% of the points overlap in the "gutters" of the swarm (:pr:`2045`).
Multi-plot grids
^^^^^^^^^^^^^^^^
- |Feature| |Enhancement| |Defaults| A few small changes to make life easier when using :class:`PairGrid` (:pr:`2234`):
- Added public access to the legend object through the ``legend`` attribute (also affects :class:`FacetGrid`).
- The ``color`` and ``label`` parameters are no longer passed to the plotting functions when ``hue`` is not used.
- The data is no longer converted to a numpy object before plotting on the marginal axes.
- It is possible to specify only one of ``x_vars`` or ``y_vars``, using all variables for the unspecified dimension.
- The ``layout_pad`` parameter is stored and used every time you call the :meth:`PairGrid.tight_layout` method.
- |Feature| Added a ``tight_layout`` method to :class:`FacetGrid` and :class:`PairGrid`, which runs the :func:`matplotlib.pyplot.tight_layout` algorithm without interference from the external legend (:pr:`2073`).
- |Feature| Added the ``axes_dict`` attribute to :class:`FacetGrid` for named access to the component axes (:pr:`2046`).
- |Enhancement| Made :meth:`FacetGrid.set_axis_labels` clear labels from "interior" axes (:pr:`2046`).
- |Feature| Added the ``marginal_ticks`` parameter to :class:`JointGrid` which, if set to ``True``, will show ticks on the count/density axis of the marginal plots (:pr:`2210`).
- |Enhancement| Improved :meth:`FacetGrid.set_titles` with ``margin_titles=True``, such that texts representing the original row titles are removed before adding new ones (:pr:`2083`).
- |Defaults| Changed the default value for ``dropna`` to ``False`` in :class:`FacetGrid`, :class:`PairGrid`, :class:`JointGrid`, and corresponding functions. As all or nearly all seaborn and matplotlib plotting functions handle missing data well, this option is no longer useful, but it causes problems in some edge cases. It may be deprecated in the future. (:pr:`2204`).
- |Fix| Fixed a bug in :class:`PairGrid` that appeared when setting ``corner=True`` and ``despine=False`` (:pr:`2203`).
Color palettes
~~~~~~~~~~~~~~
- |Docs| Improved and modernized the :doc:`color palettes chapter <tutorial/color_palettes>` of the seaborn tutorial.
- |Feature| Added two new perceptually-uniform colormaps: "flare" and "crest". The new colormaps are similar to "rocket" and "mako", but their luminance range is reduced. This makes them well suited to numeric mappings of line or scatter plots, which need contrast with the axes background at the extremes (:pr:`2237`).
- |Enhancement| |Defaults| Enhanced numeric colormap functionality in several ways (:pr:`2237`):
- Added string-based access within the :func:`color_palette` interface to :func:`dark_palette`, :func:`light_palette`, and :func:`blend_palette`. This means that anywhere you specify a palette in seaborn, a name like ``"dark:blue"`` will use :func:`dark_palette` with the input ``"blue"``.
- Added the ``as_cmap`` parameter to :func:`color_palette` and changed internal code that uses a continuous colormap to take this route.
- Tweaked the :func:`light_palette` and :func:`dark_palette` functions to use an endpoint that is a very desaturated version of the input color, rather than a pure gray. This produces smoother ramps. To exactly reproduce previous plots, use :func:`blend_palette` with ``".13"`` for dark or ``".95"`` for light.
- Changed :func:`diverging_palette` to have a default value of ``sep=1``, which gives better results.
- |Enhancement| Added a rich HTML representation to the object returned by :func:`color_palette` (:pr:`2225`).
- |Fix| Fixed the ``"{palette}_d"`` logic to modify reversed colormaps and to use the correct direction of the luminance ramp in both cases.
Deprecations and removals
^^^^^^^^^^^^^^^^^^^^^^^^^
- |Enhancement| Removed an optional (and undocumented) dependency on BeautifulSoup (:pr:`2190`) in :func:`get_dataset_names`.
- |API| Deprecated the ``axlabel`` function; use ``ax.set(xlabel=, ylabel=)`` instead.
- |API| Deprecated the ``iqr`` function; use :func:`scipy.stats.iqr` instead.
- |API| Final removal of the previously-deprecated ``annotate`` method on :class:`JointGrid`, along with related parameters.
- |API| Final removal of the ``lvplot`` function (the previously-deprecated name for :func:`boxenplot`).
|