A microbiologist with a data science problem

Making FiveThirtyEight-like tables in Jupyter with Jinja2 and Pandas

Mon 04 May 2020

I love the tables on FiveThirtyEight, and I often want to visualize a Pandas DataFrame in Jupyter the same way. In this post we'll take a shot at creating a FiveThirtyEight-styled table in a Jupyter notebook, something like this table of 2020 congressional race polls. Pandas has a great styling API, but it's not enough to achieve all the styling touches that FiveThirtyEight uses. Instead of using the Pandas styling API, we'll create an HTML table from scratch using a Jinja template that accepts the polls data and returns a styled HTML table.

Here's a quick demonstration.

from datetime import datetime
from IPython.display import HTML
from jinja2 import Template
import numpy as np
import pandas as pd

Let's make a little toy DataFrame and look at how it's rendered by default.

Every now and then I'll apply a CSS class from the Tachyons CSS framework, which I use to style this blog. In the table below I'm adding the Tachyons class "collapse" to the HTML output from Pandas.

df = pd.DataFrame(
    dict(
        letter=['a', 'b', 'b', 'a'],
        other_data=[3, 56, 3, 1]
    )
)
HTML(df.to_html(classes="collapse"))
Out[2]:
   letter  other_data
0  a       3
1  b       56
2  b       3
3  a       1

Not very aesthetically interesting...

Let's say we want to draw a red circle around all the "a" values. We can do this easily with HTML and CSS.

%%html
<div style="border: thin solid red; border-radius: 50%; width: 24px; text-align: center">
a
</div>
a

But we don't want to hand code HTML for an entire table. This is where Jinja comes in. We can create a Jinja template for our HTML table that loops through our data and outputs the full table. The code below builds a Jinja template that accepts rows, a list of dictionaries representing our DataFrame rows, and columns, a list of our DataFrame column names. The template takes the input data and returns an HTML table. For a review of the Jinja template syntax check out the documentation. Templating systems are incredibly powerful, so learning one is well worth your time.

I again added the collapse class to the table. It is defined in the CSS framework I'm using on this blog and adjusts the cell/table borders so they aren't doubled in appearance.

template_str = '''
<table class="example collapse">
  <thead>
    <tr>
      {% for c in columns %}
      <th>{{ c }}</th>
      {% endfor %}
    </tr>
  </thead>
  <tbody>
    {% for row in rows %}
    <tr>
      {% for k, v in row.items() %}
      {% if v == 'a' %}
      <td><div class="red-circle">{{ v }}</div></td>
      {% else %}
      <td><div>{{ v }}</div></td>
      {% endif %}
      {% endfor %}
    </tr>
    {% endfor %}
  </tbody>
</table>
'''

I think the HTML table markup is pretty intuitive: nested in <table> we have a table header <thead>, a body <tbody>, and so on. The Jinja magic is happening in the curly braced bits (e.g. {% for k, v in row.items() %}). Jinja lets us use Python-like syntax to generate HTML. CSS, on the other hand, feels pretty foreign at first, but you get the hang of it. Let's append some CSS styles to our template. The styles below are what actually turn the red-circle divs (see above) into red circles.

template_str += '''
<style>

table.example {
  border: thin solid lightgray;
}

table.example th,
table.example td {
  border: thin solid lightgray;
  min-width: 75px;
  text-align: center;
  padding: 5px;
}

table.example .red-circle {
  border: thin solid red;
  width: 24px;
  height: 24px;
  border-radius: 50%;
  text-align: center;
  margin: auto;
}
</style>
'''
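
To get a feel for the curly braced bits before rendering the full table, here is a minimal sketch on a made-up list of values (the names snippet and values are only for illustration). It also shows a detail that matters later in this post: a plain jinja2.Template inserts strings as-is, so HTML entities in the data reach the browser unescaped, whereas passing autoescape=True escapes them.

```python
from jinja2 import Template

# Hypothetical mini-template: loop over values and branch on each one.
snippet = (
    "{% for v in values %}"
    "{% if v == 'a' %}[{{ v }}]{% else %}{{ v }}{% endif %}"
    "{% endfor %}"
)

values = ['a', 'b', 'x&nbsp;y']

# Default Template: values are substituted without HTML escaping.
raw = Template(snippet).render(values=values)
print(raw)      # [a]bx&nbsp;y

# With autoescape=True the ampersand is escaped to &amp;.
escaped = Template(snippet, autoescape=True).render(values=values)
print(escaped)  # [a]bx&amp;nbsp;y
```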

All that's left is to feed the data from our toy DataFrame to the template. Again, this is pretty intuitive syntax (one of the great features of Jinja).

template = Template(template_str)

html = template.render(
    rows=df.to_dict(orient='records'),
    columns=df.columns.to_list()
)

HTML(html)
Out[6]:
letter  other_data
a       3
b       56
b       3
a       1
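
If it's unclear what the template actually receives, it can help to print the two arguments. A small sketch using the same toy DataFrame:

```python
import pandas as pd

df = pd.DataFrame(
    dict(
        letter=['a', 'b', 'b', 'a'],
        other_data=[3, 56, 3, 1]
    )
)

# One dictionary per row, keyed by column name.
rows = df.to_dict(orient='records')
# Plain Python list of column names.
columns = df.columns.to_list()

print(rows[0])   # {'letter': 'a', 'other_data': 3}
print(columns)   # ['letter', 'other_data']
```

Because each row is a dictionary, the template can either loop over row.items() or look up a single field with row.letter.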

The FiveThirtyEight example

OK, we've got the gist of it. Now let's apply the same technique to re-create a FiveThirtyEight-like table. I'm going to try to make this table of polls data for the 2020 US congressional elections. I prepared a Jinja template, fivethirtyeight.tpl, that we'll use to create our table (I've shared the template in this GitHub gist). Before we can use the template, we have to prepare our data. The polls data is linked right on the FiveThirtyEight page (super convenient!).

Below are some formatting functions I'm applying to the FiveThirtyEight polls data to prep it for our Jinja template. I'm just calculating colors and "prettifying" other values.

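def format_dates(start, end):
    # Dates arrive as month/day/year strings.
    fmt = '%m/%d/%Y'
    start = datetime.strptime(start, fmt)
    end = datetime.strptime(end, fmt)
    # &nbsp; keeps the month and day together on one line in the table.
    if start.strftime('%m/%Y') == end.strftime('%m/%Y'):
        # Same month: only repeat the day, e.g. "Sep. 26-27"
        return start.strftime('%b.&nbsp;%d-') + end.strftime('%d')
    else:
        # Different months, e.g. "Nov. 20-Dec. 03"
        return start.strftime('%b.&nbsp;%d-') + end.strftime('%b.&nbsp;%d')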
    
def add_comma(i):
    return '{:,}'.format(i)

def get_leader(d, r, fmt=lambda s: s):
    # A positive lead means the Democrat is ahead, otherwise the Republican.
    lead = d - r
    if lead > 0:
        party = 'Democrat'
    else:
        party = 'Republican'
    # Report the margin as a positive number for whichever party leads.
    return f'{fmt(party)}&nbsp;+{int(np.round(abs(lead)))}'
    
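def get_color(v, color='red'):
    N = 5
    bins = np.linspace(34, 53, N)
    alphas = np.linspace(0.1, 0.6, N)
    # np.digitize returns 0 for values below the first bin edge; clip so
    # that b - 1 never wraps around to the most opaque alpha.
    b = np.clip(np.digitize(v, bins), 1, N)
    alpha = alphas[b - 1]
    if color == 'red':
        return f'rgba(255, 0, 0, {alpha})'
    else:
        return f'rgba(0, 0, 255, {alpha})'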
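
The binning in get_color is easier to follow with a few concrete numbers. This sketch reuses the same bin edges and alpha values and prints which bin a percentage falls into; the sample percentages are made up for illustration.

```python
import numpy as np

N = 5
bins = np.linspace(34, 53, N)      # 34, 38.75, 43.5, 48.25, 53
alphas = np.linspace(0.1, 0.6, N)  # 0.1, 0.225, 0.35, 0.475, 0.6

for pct in (30, 39, 47, 53):
    # np.digitize gives the 1-based index of the bin the value falls in,
    # or 0 when the value is below the first edge.
    b = int(np.digitize(pct, bins))
    print(pct, b)
```

An index of 0 is worth watching for: alphas[0 - 1] is alphas[-1], the most opaque value, so a percentage below 34 has to be clipped if it should get the faintest color.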

Now we'll read our data into a Pandas DataFrame and send it through all the formatting steps.

formatted_df = (
    pd.read_csv('data/generic_polllist.csv')
    .assign(dates=lambda x: [format_dates(s, e) 
                             for s, e in 
                             x[['startdate', 'enddate']].itertuples(index=False)],
            sample=lambda x: x['samplesize'].map(add_comma),
            republican=lambda x: x['rep'].astype(int).astype(str) + '%',
            democrat=lambda x: x['dem'].astype(int).astype(str) + '%',
            leader=lambda x: [get_leader(d, r) for d, r in 
                              x[['dem', 'rep']].itertuples(index=False)],
            adj_leader=lambda x: [get_leader(d, r, lambda s: f'{s[:1]}.') 
                                  for d, r in 
                                  x[['dem', 'rep']].itertuples(index=False)],
            d_color=lambda x: x['dem'].map(lambda y: get_color(y, color='blue')),
            r_color=lambda x: x['rep'].map(get_color),
            weight=lambda x: x['weight'].round(2))
    .fillna('')
)

Finally we can pass the data to our Jinja template and enjoy the show! I'm just displaying fifteen rows below and I've selected one row per pollster.
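
The one-row-per-pollster step relies on groupby followed by first. A minimal sketch on made-up data (the pollster names and numbers here are invented) shows what it returns, plus one subtlety: first takes the first non-missing value in each column, which is not necessarily one intact row when the data contains NaN.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'pollster': ['X', 'X', 'Y'],
    'grade': [np.nan, 'B', 'A'],
    'dem': [47, 45, 50],
})

# first() skips missing values column by column.
picked = toy.groupby('pollster', as_index=False).first()
print(picked.to_dict(orient='records'))

# head(1) keeps the first row of each group exactly as it is.
intact = toy.groupby('pollster').head(1)
print(intact.to_dict(orient='records'))
```

In the pipeline above the missing values were already replaced with empty strings by fillna(''), so first does return whole rows there.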

N_rows = 15

with open('fivethirtyeight.tpl') as fh:
    template = Template(fh.read())

rows = (
    formatted_df
    .groupby('pollster', as_index=False)
    .first()
    .to_dict(orient='records')
)[:N_rows]
    
cols = ['dates', 'pollster', 'grade', 'sample', 'weight', 'republican',
        'democrat', 'leader', 'adjusted leader']    
    
html = template.render(cols=cols, rows=rows)
HTML(html)
Out[9]:
dates            pollster                            grade  sample    weight  republican  democrat  leader        adjusted leader
Sep. 26-27       ALG Research/GBAO                          1,013 lv  0.84    39%         47%       Democrat +8   D. +8
Jun. 03-06       Basswood Research                   B/C    1,200 lv  1.3     42%         44%       Democrat +2   D. +2
Mar. 04-07       CNN/SSRS                            A/B    1,084 rv  1.52    44%         53%       Democrat +9   D. +9
Aug. 07-10       Cygnal                              A/B    1,263 lv  1.69    39%         46%       Democrat +7   D. +7
Feb. 22-23       D-CYFOR                                    1,000 rv  0.84    39%         46%       Democrat +7   D. +7
Nov. 20-Dec. 03  Data for Progress                   B/C    980 rv    0.88    39%         47%       Democrat +8   D. +8
Jan. 20-21       Emerson College                     A-     942 rv    1.4     47%         52%       Democrat +5   D. +5
Mar. 21-28       Firehouse Strategies/Øptimus        C/D    1,032 lv  0.31    36%         47%       Democrat +10  D. +10
Aug. 15-21       GQR Research                        C+     2,629 rv  1.13    41%         49%       Democrat +8   D. +8
Mar. 31-Apr. 04  Georgetown University/Battleground  A/B    1,000 lv  1.54    37%         42%       Democrat +5   D. +5
Nov. 01-08       Global Strategy Group               B/C    800 rv    0.88    42%         49%       Democrat +7   D. +7
Dec. 31-Jan. 02  HarrisX                             C+     3,012 rv  0.04    35%         45%       Democrat +10  D. +10
Oct. 03-09       Hart Research Associates            B/C    1,000 lv  1.2     38%         47%       Democrat +9   D. +9
Oct. 03-08       Marist College                      A+     926 rv    1.7     40%         43%       Democrat +3   D. +3
Feb. 06-10       McLaughlin & Associates             C/D    1,000 lv  0.58    45%         46%       Democrat +1   D. +1

This looks great! I am especially pleased with the little weight icons. It's not trivial to produce such detailed styling but the effort really brings the data to life.

Bonus: styling the table header

Here's how I created the table header. It was tedious work but I was eventually able to wrap my head around how FiveThirtyEight achieves that rotated text effect in table column headers. The rotated column style is produced with a combination of SVG transforms and CSS positioning. We'll begin with the SVG.

You can render SVG directly in your notebook using the %%svg cell magic, but I like to use CodePen for rapid HTML/CSS prototyping.

%%html
<p class="codepen" data-height="248" data-theme-id="light" data-default-tab="html,result" data-user="chuckpr" data-slug-hash="ZEbyjom" style="height: 248px; box-sizing: border-box; display: flex; align-items: center; justify-content: center; border: 2px solid; margin: 1em 0; padding: 1em;" data-pen-title="header-rotation-1">
  <span>See the Pen <a href="https://codepen.io/chuckpr/pen/ZEbyjom">
  header-rotation-1</a> by Charles Pepe-Ranney (<a href="https://codepen.io/chuckpr">@chuckpr</a>)
  on <a href="https://codepen.io">CodePen</a>.</span>
</p>
<script async src="https://static.codepen.io/assets/embed/ei.js"></script>

See the Pen header-rotation-1 by Charles Pepe-Ranney (@chuckpr) on CodePen.
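
The pens aren't reproduced in the text here, but the header template later in this post uses an SVG text element with the same transform. As a rough sketch, that markup can be assembled as a string in Python; the helper name rotated_label_svg and its default arguments are hypothetical, and in a notebook the result could be shown with IPython.display.SVG or HTML.

```python
from html import escape

def rotated_label_svg(label, tx=25, ty=60, angle=-45, size=82):
    """Return SVG markup for a single rotated column label."""
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        f'<text transform="translate({tx},{ty})rotate({angle})" x="0" y="0">'
        f'{escape(label)}</text>'
        '</svg>'
    )

svg = rotated_label_svg('republican')
print(svg)
```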

It helps to add some orienting elements to understand how this SVG "works." Let's draw it again with a gray background, and a couple lines meant to show the X (green) and Y (blue) orientation.

%%html
<p class="codepen" data-height="265" data-theme-id="light" data-default-tab="html,result" data-user="chuckpr" data-slug-hash="BaoZPPd" style="height: 265px; box-sizing: border-box; display: flex; align-items: center; justify-content: center; border: 2px solid; margin: 1em 0; padding: 1em;" data-pen-title="header-rotation-2">
  <span>See the Pen <a href="https://codepen.io/chuckpr/pen/BaoZPPd">
  header-rotation-2</a> by Charles Pepe-Ranney (<a href="https://codepen.io/chuckpr">@chuckpr</a>)
  on <a href="https://codepen.io">CodePen</a>.</span>
</p>
<script async src="https://static.codepen.io/assets/embed/ei.js"></script>

See the Pen header-rotation-2 by Charles Pepe-Ranney (@chuckpr) on CodePen.

The text positioning happens in the transform attribute, which is set to "translate(25,60)rotate(-45)". This transform rotates the text by -45° (counter-clockwise on screen) and moves it right 25 units and down 60 units. Note that the translate offsets are measured along the axes of the SVG viewport, while anything positioned inside the rotation, such as the text's own x and y attributes, is measured along the rotated axes.
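
One way to check where the text ends up is to apply the two transforms by hand. In an SVG transform list the rightmost transform acts on the element's coordinates first, so a point is rotated and then translated. The sketch below does that arithmetic in plain Python; the function name and sample points are only for illustration.

```python
import math

def apply_transform(x, y, tx=25, ty=60, angle_deg=-45):
    """Map a point through translate(tx,ty)rotate(angle_deg)."""
    a = math.radians(angle_deg)
    # rotate(angle) about the origin, SVG convention (y axis points down)
    xr = x * math.cos(a) - y * math.sin(a)
    yr = x * math.sin(a) + y * math.cos(a)
    # then translate(tx, ty)
    return xr + tx, yr + ty

# The text anchor (x=0, y=0) lands at (25, 60) in the SVG's coordinates.
print(apply_transform(0, 0))
# A point 10 units along the baseline ends up further right and higher up.
print(apply_transform(10, 0))
```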

Our column header element th will contain a div, and this div will hold the SVG that defines our table header text. The trick is to pin the div (and the SVG inside it) to the top-left corner of the table header cell by giving the th position: relative and the div position: absolute; left: 0; top: 0. You also have to set a height on the table header cell that works with your SVG transform (in this case height: 65px). Essentially this lets our SVG position the text in the table header element. In the example below the gray background shows the bounds of our table header cell and the red border surrounds our SVG.

%%html
<p class="codepen" data-height="265" data-theme-id="light" data-default-tab="html,result" data-user="chuckpr" data-slug-hash="yLYXdWQ" style="height: 265px; box-sizing: border-box; display: flex; align-items: center; justify-content: center; border: 2px solid; margin: 1em 0; padding: 1em;" data-pen-title="header-rotation-3">
  <span>See the Pen <a href="https://codepen.io/chuckpr/pen/yLYXdWQ">
  header-rotation-3</a> by Charles Pepe-Ranney (<a href="https://codepen.io/chuckpr">@chuckpr</a>)
  on <a href="https://codepen.io">CodePen</a>.</span>
</p>
<script async src="https://static.codepen.io/assets/embed/ei.js"></script>

See the Pen header-rotation-3 by Charles Pepe-Ranney (@chuckpr) on CodePen.

The rest of the table header is pretty straightforward: left-aligned, uppercase text. I made a minimal Jinja template for the header and we can see it in action below. I really like the rotated text: it's readable and, because of the reduced width, allows for narrow heatmap cells.

%%writefile header_template.tpl

<div class="polls2">
  <table>
    <thead>
      <tr>
        {% for c in cols[:5] %}
        <th><div>{{c}}</div></th>
        {% endfor %}
        <th class="rotate">
          <div>
            <svg width="82" height="82" style="max-width: none;">
              <text transform="translate(25,60)rotate(-45)" x="0" y="0">{{cols[5]}}</text>
              <line x1="0" y1="65" x2="25" y2="40" transform="translate(41.5,0)"></line>
            </svg>
          </div>
        </th>
        <th class="rotate">
          <div>
            <svg width="82" height="82" style="max-width: none;">
              <text transform="translate(25,60)rotate(-45)" x="0" y="0">{{cols[6]}}</text>
            </svg>
          </div>
        </th>
        {% for c in cols[7:] %}
        <th><div>{{c}}</div></th>
        {% endfor %}
      </tr>
    </thead>
  </table>
</div>

<style>

div.polls2 {
    overflow: scroll;
    margin-top: 20px;
}

.polls2 table {
    font-family: 'helvetica neue', helvetica, sans-serif;
    font-size: 12px;
    font-weight: 500;
    border-collapse: collapse;
    border-spacing: 0;
}

.polls2 table thead tr {
    border-bottom: 1px solid #222;
}

.polls2 table thead tr th {
    text-transform: uppercase;
    font-weight: 500;
    vertical-align: bottom;
    text-align: left;
    min-width: 60px;
}

.polls2 table thead tr th.rotate {
    height: 65px;
    width: 41px;
    padding: 0;
    position: relative;
}

.polls2 table thead tr th.rotate>div {
    position: absolute;
    left: 0;
    top: 0;
}

.polls2 table thead tr th.rotate>div svg line {
    stroke-width: 1;
    stroke: #cdcdcd;
}

</style>
Overwriting header_template.tpl

from IPython.display import HTML, display
from jinja2 import Template

with open('header_template.tpl') as file_:
    template = Template(file_.read())
    
cols = ['dates', 'pollster', 'grade', 'sample', 'weight', 'republican', 
        'democrat', 'leader', 'adjusted leader']
    
html = template.render(cols=cols)
HTML(html)
Out[14]:
dates  pollster  grade  sample  weight  republican  democrat  leader  adjusted leader

Bonus: making the "signal strength" icon

The "signal strength" icon in the weight column is a nice touch. This icon can be created simply with div elements. Here's how I did it:

%%html
<p class="codepen" data-height="265" data-theme-id="light" data-default-tab="html,result" data-user="chuckpr" data-slug-hash="QWjgVWV" style="height: 265px; box-sizing: border-box; display: flex; align-items: center; justify-content: center; border: 2px solid; margin: 1em 0; padding: 1em;" data-pen-title="wireless-signal">
  <span>See the Pen <a href="https://codepen.io/chuckpr/pen/QWjgVWV">
  wireless-signal</a> by Charles Pepe-Ranney (<a href="https://codepen.io/chuckpr">@chuckpr</a>)
  on <a href="https://codepen.io">CodePen</a>.</span>
</p>
<script async src="https://static.codepen.io/assets/embed/ei.js"></script>

See the Pen wireless-signal by Charles Pepe-Ranney (@chuckpr) on CodePen.
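
In the full template below, each of the five bars is drawn gray or light gray by comparing the poll's weight against a fixed threshold. The same logic can be sketched in Python; the thresholds are copied from the template, while the function name is hypothetical.

```python
# Thresholds copied from the bars in fivethirtyeight.tpl, shortest bar first.
THRESHOLDS = (0.01, 0.044, 0.087, 1.3, 1.7)

def bar_colors(weight, thresholds=THRESHOLDS):
    """Return the background color of each bar for a given weight."""
    return ['lightgray' if weight < t else 'gray' for t in thresholds]

print(bar_colors(0.04))  # only the first bar is gray
print(bar_colors(1.52))  # four gray bars
print(bar_colors(1.7))   # all five gray
```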

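And here is the complete fivethirtyeight.tpl template used for the polls table above, with the signal icon built into the weight column.

%%writefile fivethirtyeight.tpl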

<div class="polls3">
  <table>
    <thead>
      <tr>
        {% for c in cols[:5] %}
        <th><div {% if c.lower() in ('grade', 'sample', 'weight') %}
                   style="text-align: center"
                 {% endif %}>
          {{c}}</div>
        </th>
        {% endfor %}
        <th class="rotate">
          <div>
            <svg width="82" height="82" style="max-width: none;">
              <text transform="translate(25,60)rotate(-45)" x="0" y="0">{{cols[5]}}</text>
              <line x1="0" y1="65" x2="25" y2="40" transform="translate(41.5,0)"></line>
            </svg>
          </div>
        </th>
        <th class="rotate">
          <div>
            <svg width="82" height="82" style="max-width: none;">
              <text transform="translate(25,60)rotate(-45)" x="0" y="0">{{cols[6]}}</text>
            </svg>
          </div>
        </th>
        <th style="width: 20px"></th>
        {% for c in cols[7:] %}
        <th><div>{{c}}</div></th>
        {% endfor %}
      </tr>
    </thead>
    <tbody>
      {% for row in rows %}
      <tr>
        <td class="dates">{{row.dates}}</td>
        <td class="just-text">{{row.pollster}}</td>
        <td class="grade">
        {% if row.grade %}
          <div class="grade-circ" style="border-color: green">
            <div style="position: relative; top: 25%">{{row.grade}}</div>
          </div>
        {% else %}
          <div class="grade-circ" style="border-color: white"></div>
        {% endif %}    
        </td>
        <td class="sample">
          <div class="sample-number">
            {{row.sample}}<span style="color: #999">&nbsp;{{row.population}}</span>
          </div>
        </td>
        <td class="weight">
          <div class="signal">
            <div class="bar" 
             style="height: 20%; background: {% if row.weight < 0.01 %}lightgray{% else %}gray{% endif %}">
            </div>
            <div class="bar" 
             style="height: 40%; background: {% if row.weight < 0.044 %}lightgray{% else %}gray{% endif %}">
            </div>
            <div class="bar" 
             style="height: 60%; background: {% if row.weight < 0.087 %}lightgray{% else %}gray{% endif %}">
            </div>
            <div class="bar" 
             style="height: 80%; background: {% if row.weight < 1.3 %}lightgray{% else %}gray{% endif %}">
            </div>
            <div class="bar" 
             style="height: 100%; background: {% if row.weight < 1.7 %}lightgray{% else %}gray{% endif %}">
            </div>
          </div>
          <div style="margin-right: 10px">{{row.weight}}</div>
        </td>
        <td class="heat">
          <div style="background-color: {{row.r_color}}">{{row.republican}}</div>
        </td>
        <td class="heat">
          <div style="background-color: {{row.d_color}}">{{row.democrat}}</div>
        </td>
        <td style="min-width: 10px;"></td>
        <td class="just-text">{{row.leader}}</td>
        <td class="adj-leader" 
            style="color: {% if row.adj_leader.startswith('D') %}#008fd5{% else %}#ff9371{% endif %}">
          {{row.adj_leader}}
        </td>
      </tr>
      {% endfor %}
    </tbody>
  </table>
</div>

<style>

div.polls3 {
    overflow: scroll;
    margin-top: 6px;
}

.polls3 table {
    font-family: 'helvetica neue', helvetica, sans-serif;
    font-size: 12px;
    font-weight: 500;
    border-collapse: collapse;
    border-spacing: 0;
}

.polls3 table thead tr {
    border-bottom: 1px solid #222;
}

.polls3 table thead tr th {
    text-transform: uppercase;
    font-weight: 500;
    vertical-align: bottom;
    text-align: left !important;
}

.polls3 table thead tr th.rotate {
    height: 65px;
    width: 41px;
    padding: 0;
    position: relative;
}

.polls3 table thead tr th.rotate>div {
    position: absolute;
    left: 0;
    top: 0;
}

.polls3 table thead tr th.rotate>div svg line {
    stroke-width: 1;
    stroke: #cdcdcd;
}

.polls3 table tbody tr td {
    vertical-align: middle;
}

.polls3 table tbody tr td.dates {
    padding-left: 5px;
    min-width: 90px;
    font-size: 11px;
    text-transform: uppercase;
    color: #999;
    text-align: left;
}

.polls3 table tbody tr td.just-text {
    padding-left: 5px;
    min-width: 80px;
    font-size: 13px;
    text-align: left;
}

.polls3 table tbody tr td.grade {
    text-align: center;
    padding-left: 10px;
    border-right: 1px solid #222;
    width: 70px;
    min-width: 70px;
    font-size: 11px;
}

.polls3 table tbody tr td.grade>div {
    border: 2px solid;
    border-radius: 50%;
    height: 30px;
    width: 30px;
    font-weight: bold;
    margin-left: auto;
    margin-right: auto;
}

.polls3 table tbody tr td.sample {
    width: 65px;
    min-width: 65px;
    font-size: 13px;
    text-align: right;
    font-family: "DecimaMonoPro", monospace;
    margin-right: 5px;
    padding-left: 5px;
    text-transform: uppercase;
}

.polls3 table tbody tr td.weight {
    font-size: 13px;
    text-align: right;
    font-family: "DecimaMonoPro", monospace;
    width: 90px;
    min-width: 90px;
    border-right: 1px solid #222;
    text-transform: uppercase;
    padding-left: 5px;
}

.signal {
    width: 35px;
    height: 18px;
    margin: 0;
    padding: 0;
    display: table;
    float: left;
}

.bar {
    margin-left: 5%;
    padding: 0;
    vertical-align: bottom;
    width: 12%;
    display: inline-block;
}

.polls3 table tbody tr td.heat {
    padding: 0;
}

.polls3 table tbody tr td.heat>div {
    width: 40px;
    min-width: 40px;
    height: 50px;
    font-family: "DecimaMonoPro", monospace;
    font-size: 13px;
    display: table-cell;
    vertical-align: middle;
    text-align: center;
}

.polls3 table tbody tr td.adj-leader {
    width: 65px;
    min-width: 65px;
    font-weight: 700;
    font-size: 13px;
    text-align: left;
    padding-left: 5px;
}

</style>
Overwriting fivethirtyeight.tpl