Adding a median line to a strip chart in Altair

altair
Author: ChuckPR

Published: February 18, 2025

I find it a bit tricky to add a mark representing an aggregate value to strip plots in Altair. Below is an example using mark_tick to show the median value for IMDB_Rating within each movie genre using the Vega datasets movies data.

import altair as alt
from vega_datasets import data

source = data.movies.url

The chart will have two layers, one for each mark type (points and ticks). Both layers use the same data source, so we can set up a shared “base.”

base = alt.Chart(source, height=alt.Step(25), width=500).transform_calculate(
    jitter="sqrt(-2*log(random()))*cos(2*PI*random())"
)

I define the jitter field here because it seems to interfere with the sorting if I define it as part of the points layer below.
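
For anyone curious about the jitter expression itself: it has the form of a Box-Muller transform, which turns two uniform random numbers into one roughly standard-normal value. Here is a minimal pure-Python sketch of the same formula. The function name box_muller_jitter is my own, and the 1.0 - random() guard against log(0) is an addition that the Vega expression above does not have.

```python
import math
import random


def box_muller_jitter(rng: random.Random) -> float:
    """Return one approximately standard-normal value via Box-Muller."""
    # rng.random() is in [0, 1); flip it to (0, 1] so log() never sees 0.
    u1 = 1.0 - rng.random()
    u2 = rng.random()
    # Same shape as the Vega expression: sqrt(-2*log(u1)) * cos(2*PI*u2)
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


rng = random.Random(0)  # seeded only to make the sketch repeatable
offsets = [box_muller_jitter(rng) for _ in range(5)]
print(offsets)
```

Because the values cluster around zero, most points stay close to the center of their genre's band, with a few spread further out.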

The same field drives both the sort order and the x-encoding of the points and ticks, so I’ll store it in a variable and reuse it. I’ll also define how the chart is sorted and some axis settings to apply to both the X and Y axes.

x_field = "IMDB_Rating"
type_ = "Q"
sort = alt.EncodingSortField(field=x_field, op="median", order="descending")
axis_kwargs = {"titleFontSize": 14, "titlePadding": 15, "labelFontSize": 12}

The sort variable orders the genres by an aggregate of the field: here, the median IMDB_Rating in descending order.
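
To make the sort concrete, here is a small pure-Python sketch of the ordering I expect it to produce: compute one median per genre, then list the genres from highest to lowest median. The ratings below are made up for illustration and are not taken from the movies data, and this is only an approximation of what Vega-Lite does internally.

```python
from statistics import median

# Hypothetical ratings, invented for this sketch (not from the movies data).
ratings = {
    "Drama": [7.1, 6.8, 8.0],
    "Comedy": [5.9, 6.4, 6.1],
    "Horror": [4.8, 5.5, 6.0, 5.1],
}

# One median per genre, the same kind of aggregate the ticks layer draws.
medians = {genre: median(values) for genre, values in ratings.items()}

# Highest median first, mirroring op="median", order="descending".
genre_order = sorted(medians, key=medians.get, reverse=True)
print(genre_order)
```

Applying the same sort to both layers keeps the rows aligned, so each median tick lands on the row of points it summarizes.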

Below are the chart layers. In the ticks layer we are using the aggregate option to calculate the median value within each movie genre.

points = base.mark_circle(
    size=20, stroke="steelblue", strokeOpacity=0.5, fill="steelblue", fillOpacity=0.15
).encode(
    y=alt.Y("Major_Genre:N").sort(sort).axis(title=None, **axis_kwargs),
    x=alt.X(f"{x_field}:{type_}"),
    yOffset=alt.YOffset("jitter:Q"),
)

ticks = base.mark_tick(stroke="firebrick", strokeOpacity=0.85, thickness=1.5).encode(
    y=alt.Y("Major_Genre:N").sort(sort),
    x=alt.X(f"{x_field}:{type_}", aggregate="median").axis(
        title="IMDB Rating", **axis_kwargs
    ),
)

points + ticks