plotnine

Language: Python

Data Science

plotnine was created by Hassan Kibirige and contributors to bring the Grammar of Graphics approach of R’s ggplot2 to Python. It works directly with Pandas DataFrames, so users can build complex, polished visualizations in Python with concise syntax.

plotnine is a Python data visualization library based on the Grammar of Graphics, similar to ggplot2 in R. It allows building complex plots by layering components such as data, aesthetics, and geometric objects.

Installation

pip: pip install plotnine
conda: conda install -c conda-forge plotnine

Usage

plotnine constructs a plot by mapping data variables to aesthetics and then adding components with `+`: geoms, facets, scales, and themes. It supports line plots, scatter plots, bar charts, histograms, boxplots, and more, and each component can be customized through its own arguments.

Simple scatter plot

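import pandas as pd
from plotnine import ggplot, aes, geom_point
df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [5, 7, 9, 6]})
# Map columns 'x' and 'y' to the axes and draw one point per row
plot = ggplot(df, aes('x', 'y')) + geom_point()
plot.show()  # on older plotnine releases without show(), use print(plot)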

Creates a simple scatter plot mapping 'x' and 'y' from a Pandas DataFrame.

Bar plot

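import pandas as pd
from plotnine import ggplot, aes, geom_bar
df = pd.DataFrame({'category': ['A', 'B', 'C'], 'value': [10, 20, 15]})
# stat='identity' uses the 'value' column as bar heights instead of counting rows
plot = ggplot(df, aes(x='category', y='value')) + geom_bar(stat='identity')
plot.show()  # on older plotnine releases without show(), use print(plot)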

Creates a bar chart from a DataFrame using geom_bar with `stat='identity'`, so the bar heights come from the 'value' column rather than from row counts. `geom_col()` is the equivalent shorthand.

Line plot with color grouping

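import pandas as pd
from plotnine import ggplot, aes, geom_line
df = pd.DataFrame({'x': [1, 2, 3, 1, 2, 3], 'y': [2, 3, 4, 5, 6, 7], 'group': ['A', 'A', 'A', 'B', 'B', 'B']})
# Mapping color to 'group' draws one line per group, each in its own color
plot = ggplot(df, aes('x', 'y', color='group')) + geom_line()
plot.show()  # on older plotnine releases without show(), use print(plot)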

Plots multiple lines colored by group using the color aesthetic.

Adding facets

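from plotnine import ggplot, aes, geom_point, facet_wrap
# Reuses df (columns 'x', 'y', 'group') from the line plot example
plot = ggplot(df, aes('x', 'y')) + geom_point() + facet_wrap('~group')
plot.show()  # on older plotnine releases without show(), use print(plot)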

Splits data into multiple panels based on the 'group' column using facet_wrap.

Customizing themes

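from plotnine import ggplot, aes, geom_point, theme_bw, theme
# Reuses df from the line plot example; figure_size is (width, height) in inches
plot = ggplot(df, aes('x', 'y')) + geom_point() + theme_bw() + theme(figure_size=(6, 4))
plot.show()  # on older plotnine releases without show(), use print(plot)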

Applies a black-and-white theme and sets figure size.

Histograms

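from plotnine import ggplot, aes, geom_histogram
# Reuses df from the line plot example; only the x aesthetic is needed
plot = ggplot(df, aes('x')) + geom_histogram(binwidth=1, fill='blue', color='black')
plot.show()  # on older plotnine releases without show(), use print(plot)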

Creates a histogram with specified bin width and styling.

Error Handling

PlotnineError (a mapping cannot be evaluated): A name used in `aes()` is not a column of the DataFrame. Ensure the column names match exactly, including case.
PlotnineError (missing required aesthetics): Check that every aesthetic the geom requires is mapped, for example both x and y for geom_point.
ModuleNotFoundError: No module named 'plotnine': Install plotnine using pip or conda in the active environment before importing.

Best Practices

Always use Pandas DataFrames for data input for full compatibility.

Build plots incrementally using layers (geoms, scales, facets, themes).

Use themes for consistent styling across multiple plots.

Label axes and add a title for clarity, for example with `labs()`.

Leverage facets for comparing subsets of data.