Build a web data dashboard — in just a few lines of Python code
I don’t know about you, but I occasionally find it a little bit intimidating to have to code something. This is doubly so when I’m building something akin to web development rather than doing some local data analysis and visualisation. I’m a competent Python coder, but I wouldn’t call myself a web developer at all, even after having more than dabbled with Django and Flask.
Still, converting your data outputs to a web app leads to a few non-trivial improvements for your project.
It is just much easier to build true, powerful interactivity into a web app. It also means that you can control exactly how the data is presented, as the web app can become the de facto report as well as the access point to your data. Lastly, and most importantly, you can vastly broaden access to your outputs, making them available anywhere, any time. There is always a web browser at a user’s fingertips.
So, I bit the bullet and started doing just this with some of my recent data projects, and it went surprisingly quickly. I converted one of my outputs from this article to a web app here in just a couple of hours.
My NBA analytics web app (link)
I thought this was rather cool, and wanted to share how this came together in just a few lines of code.
As always, I include everything you need to replicate my steps (data & code), and the article is not really about basketball. So do not worry if you are unfamiliar with it, and let’s get going.
Before we get started
Data
I include the code and data in my GitLab repo here (dash_simple_nbadirectory). So please feel free to play with it / improve upon it.
Packages
I assume you’re familiar with Python. Even if you’re relatively new, this tutorial shouldn’t be too tricky, though.
You’ll need pandas, plotly and dash. Install each (in your virtual environment) with a simple pip install [PACKAGE_NAME].
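Before going further, it can help to confirm that all three packages are importable from the environment you are working in. This is a minimal sketch using only the standard library; the helper name missing_packages is my own and is not part of pandas, plotly or dash.

```python
from importlib.util import find_spec

# The three packages this tutorial relies on.
REQUIRED = ("pandas", "plotly", "dash")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install these first: pip install " + " ".join(missing))
    else:
        print("All required packages are available.")
```

If anything is listed, install it and run the check again before moving on.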
Previously, on Python…
For this tutorial, I am simply going to skip *most* of the steps taken to create the local version of our visualisation. If you’re interested in what is going on, take a look at this article:
We will have a recap session, though, so you can see what is happening between plotting the chart locally with Plotly, and how to port that to a web app with Plotly Dash.
Load data
I have pre-processed the data, and saved it as a CSV file. It is a collection of player data for the current NBA season (as of 26/Feb/2020), which shows:
- What share of their team’s shots they are taking, and
- How efficient / effective they are at doing it.
For this portion, follow along by opening local_plot.py in my repo.
Load the data with:
import pandas as pd

all_teams_df = pd.read_csv('srcdata/shot_dist_compiled_data_2019_20.csv')
Inspect the data with all_teams_df.head(), and you should see:
The dataframe contains all NBA players, so let’s break it down to a manageable size, by filtering for a team. For instance, the New Orleans Pelicans’ players can be chosen with:
all_teams_df[all_teams_df.group == 'NOP']
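If you want to see what that filter does without the CSV at hand, here is a small sketch on a stand-in dataframe. The rows and player names are made up, the columns are only the ones used in this article's plots, and filter_team is a hypothetical helper rather than something from my repo.

```python
import pandas as pd

# Hypothetical stand-in rows; the real CSV has more columns and every NBA player.
sample_df = pd.DataFrame({
    "group": ["NOP", "NOP", "TOR", "SAS"],
    "player": ["Player A", "Player B", "Player C", "Player D"],
    "min_mid": [3, 9, 3, 15],
    "shots_freq": [0.21, 0.12, 0.18, 0.25],
    "pl_pps": [112.0, 98.5, 105.3, 101.1],
})


def filter_team(df, team_code):
    """Return only the rows whose `group` column equals team_code."""
    # Bracket access also works when a column name collides with a DataFrame attribute.
    return df[df["group"] == team_code].reset_index(drop=True)


nop_df = filter_team(sample_df, "NOP")
print(nop_df)
```

The boolean mask keeps only the matching rows, which is exactly what gets handed to the plotting call.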
Then, our data can be visualised in Plotly, as below:
import plotly.express as px

fig = px.scatter(all_teams_df[all_teams_df.group == 'NOP'], x='min_mid', y='player', size='shots_freq', color='pl_pps')
fig.show()
Visualised player data for New Orleans Pelicans
At the risk of doing this:
How to Draw a Horse — Van Oktop (Tweet)
I do add a few small details to my chart, to produce this version of the same graph.
Same chart, with a few ‘small details’ added (& different team).
This is the code that I used to do it.
Now, while it’s a lot of formatting code, I thought it useful to show you how I did it, because we are going to be re-using these functions in our Dash version of the code.
import plotly.express as px
import plotly.graph_objects as go


def clean_chart_format(fig):
    """Apply the shared styling to a figure in place."""
    fig.update_layout(
        paper_bgcolor="white",
        plot_bgcolor="white",
        annotations=[
            go.layout.Annotation(
                x=0.9,
                y=1.02,
                showarrow=False,
                text="Twitter: @_jphwang",
                xref="paper",
                yref="paper",
                textangle=0
            ),
        ],
        font=dict(
            family="Arial, Tahoma, Helvetica",
            size=10,
            color="#404040"
        ),
        margin=dict(
            t=20
        )
    )
    # Thin outline around each bubble
    fig.update_traces(marker=dict(line=dict(width=1, color='Navy')),
                      selector=dict(mode='markers'))
    fig.update_coloraxes(
        colorbar=dict(
            thicknessmode="pixels", thickness=15,
            outlinewidth=1,
            outlinecolor='#909090',
            lenmode="pixels", len=300,
            yanchor="top",
            y=1,
        ))
    fig.update_yaxes(showgrid=True, gridwidth=1, tickson='boundaries', gridcolor='LightGray', fixedrange=True)
    fig.update_xaxes(showgrid=True, gridwidth=1, gridcolor='LightGray', fixedrange=True)
    return True


def make_shot_dist_chart(input_df, color_continuous_scale=None, size_col='shots_count', col_col='pl_acc', range_color=None):
    """Build the bubble chart: minute on x, player on y, size and colour from the given columns."""
    max_bubble_size = 15
    if color_continuous_scale is None:
        color_continuous_scale = px.colors.diverging.RdYlBu_r
    if range_color is None:
        range_color = [min(input_df[col_col]), max(input_df[col_col])]

    fig = px.scatter(
        input_df, x='min_mid', y='player', size=size_col,
        color=col_col,
        color_continuous_scale=color_continuous_scale,
        range_color=range_color,
        range_x=[0, 49],
        range_y=[-1, len(input_df.player.unique())],
        hover_name='player', hover_data=['min_start', 'min_end', 'shots_count', 'shots_made', 'shots_freq', 'shots_acc', ],
        render_mode='svg'
    )
    fig.update_coloraxes(colorbar=dict(title='Points per<BR>100 shots'))
    fig.update_traces(marker=dict(sizeref=2. * 30 / (max_bubble_size ** 2)))
    fig.update_yaxes(title="Player")
    fig.update_xaxes(title='Minute', tickvals=list(range(0, 54, 6)))

    return fig


fig = make_shot_dist_chart(
    all_teams_df[all_teams_df.group == 'SAS'], col_col='pl_pps', range_color=[90, 120], size_col='shots_freq')
clean_chart_format(fig)
fig.update_layout(height=500, width=1250)
fig.show()
Now, let’s get to the main event — how to create a web app out of these plots.
Into the World Wide Web
You can read more about Plotly Dash here, but for now all you need to know is that it is an open-source software package developed to abstract away the difficulties of putting your visualisations on the web.
It works with Flask under the hood, and you can happily reuse most of the code that you used to develop plots in plotly.py.
This is the simple version that I put together:
import pandas as pd
import dash
# On Dash 2.0 and later, `from dash import dcc, html, Input, Output` replaces the next three imports
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output

all_teams_df = pd.read_csv('srcdata/shot_dist_compiled_data_2019_20.csv')

app = dash.Dash(__name__)
server = app.server

team_names = all_teams_df.group.unique()
team_names.sort()
app.layout = html.Div([
    html.Div([dcc.Dropdown(id='group-select', options=[{'label': i, 'value': i} for i in team_names],
                           value='TOR', style={'width': '140px'})]),
    dcc.Graph('shot-dist-graph', config={'displayModeBar': False})])

@app.callback(
    Output('shot-dist-graph', 'figure'),
    [Input('group-select', 'value')]
)
def update_graph(grpname):
    import plotly.express as px
    return px.scatter(all_teams_df[all_teams_df.group == grpname], x='min_mid', y='player', size='shots_freq', color='pl_pps')

if __name__ == '__main__':
    app.run_server(debug=False)
Try it out! It should open this plot on your browser.
Our first Dash app!
What’s the big deal? Well, for one — it is a live web app, in under 25 lines of code. And notice the drop-down menu on the top left? Try changing the values on it, and watch the graph change *magically*.
Go on, I’ll wait.
Okay? Done.
Let’s briefly go through the code.
At a high level, what I’m doing here is to:
- Initialise a Dash app;
- Get a list of available team names, and provide it to a dropdown menu (with DOM id group-select) with a default value of ‘TOR’;
- Instantiate a Graph object as the shot-dist-graph identifier within Dash; and
- Create a callback function so that whenever the dropdown value changes, Dash calls the update_graph function and passes the returned figure to the Output.
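The dropdown step in the list above boils down to turning a list of team codes into the list of label/value dictionaries that the Dropdown takes as its options. Here is a minimal sketch in plain Python, using made-up input in place of all_teams_df.group.unique(); build_dropdown_options is a hypothetical helper name, not a Dash function.

```python
def build_dropdown_options(names):
    """Turn team codes into a sorted list of {'label', 'value'} dicts."""
    return [{"label": name, "value": name} for name in sorted(names)]


# Stand-in team codes; the app derives these from the `group` column.
options = build_dropdown_options(["TOR", "NOP", "SAS"])
print(options)
```

Label is what the user sees and value is what gets passed to the callback; here they are the same string, as in the app.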
If you take a look at the code, so much of what is probably trivial for web devs but annoying for me is abstracted away.
dcc.Graph wraps the figure object from plotly.py into my web app, and HTML components like divs can be called and set up conveniently with html.Div objects.
Most gratifying for me personally is that Input objects and callbacks from those inputs are declaratively set up, and I can avoid having to deal with things like HTML forms or JavaScript.
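To get a feel for what "declaratively set up" means, here is a toy model of the idea in plain Python. To be clear, this is not how Dash is implemented; ToyApp and set_prop are names I made up for illustration. The point is only that you declare which input feeds which output, and something else decides when to call your function.

```python
class ToyApp:
    """A toy stand-in for the callback idea; not how Dash works internally."""

    def __init__(self):
        self.props = {}       # (component_id, prop) -> current value
        self._callbacks = {}  # input key -> list of (output key, function)

    def callback(self, output, input_):
        """Register a function that recomputes `output` when `input_` changes."""
        def decorator(func):
            self._callbacks.setdefault(input_, []).append((output, func))
            return func
        return decorator

    def set_prop(self, component_id, prop, value):
        """Simulate a user interaction, then run every dependent callback."""
        key = (component_id, prop)
        self.props[key] = value
        for output, func in self._callbacks.get(key, []):
            self.props[output] = func(value)


toy = ToyApp()


@toy.callback(output=("shot-dist-graph", "figure"), input_=("group-select", "value"))
def update_graph(grpname):
    # A real callback would return a Plotly figure; a string keeps this runnable.
    return f"figure for {grpname}"


toy.set_prop("group-select", "value", "NOP")
print(toy.props[("shot-dist-graph", "figure")])  # figure for NOP
```

Notice that update_graph never gets called directly in my own code; changing the input value is what triggers it.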
And the resulting app still works beautifully. The graph is updated the moment that the pulldown menu is used to select another value.
And we did all that in fewer than 25 lines of code.
Why Dash?
At this point, you might be asking — why Dash? We can do all this with a JS framework front end, and Flask, or any one of myriad other combinations.
To someone like me, who prefers the comfort of Python to dealing directly with HTML and CSS, using Dash abstracts away a lot of stuff that doesn’t add a lot of value to the end product.
Take, for instance, a version of this app that includes further formatting and notes for the audience:
(It is simple_dash_w_format.py in the git repo)
# Setup carried over from the simple version above (this listing originally omitted it)
import pandas as pd
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.express as px
import plotly.graph_objects as go
from dash.dependencies import Input, Output

all_teams_df = pd.read_csv('srcdata/shot_dist_compiled_data_2019_20.csv')

app = dash.Dash(__name__)
server = app.server


def clean_chart_format(fig):
    """Apply the shared styling to a figure in place."""
    fig.update_layout(
        paper_bgcolor="white",
        plot_bgcolor="white",
        annotations=[
            go.layout.Annotation(
                x=0.9,
                y=1.02,
                showarrow=False,
                text="Twitter: @_jphwang",
                xref="paper",
                yref="paper",
                textangle=0
            ),
        ],
        font=dict(
            family="Arial, Tahoma, Helvetica",
            size=10,
            color="#404040"
        ),
        margin=dict(
            t=20
        )
    )
    fig.update_traces(marker=dict(line=dict(width=1, color='Navy')),
                      selector=dict(mode='markers'))
    fig.update_coloraxes(
        colorbar=dict(
            thicknessmode="pixels", thickness=15,
            outlinewidth=1,
            outlinecolor='#909090',
            lenmode="pixels", len=300,
            yanchor="top",
            y=1,
        ))
    fig.update_yaxes(showgrid=True, gridwidth=1, tickson='boundaries', gridcolor='LightGray', fixedrange=True)
    fig.update_xaxes(showgrid=True, gridwidth=1, gridcolor='LightGray', fixedrange=True)
    return True


def make_shot_dist_chart(input_df, color_continuous_scale=None, size_col='shots_count', col_col='pl_acc', range_color=None):
    """Build the bubble chart: minute on x, player on y, size and colour from the given columns."""
    max_bubble_size = 15
    if color_continuous_scale is None:
        color_continuous_scale = px.colors.diverging.RdYlBu_r
    if range_color is None:
        range_color = [min(input_df[col_col]), max(input_df[col_col])]

    fig = px.scatter(
        input_df, x='min_mid', y='player', size=size_col,
        color=col_col,
        color_continuous_scale=color_continuous_scale,
        range_color=range_color,
        range_x=[0, 49],
        range_y=[-1, len(input_df.player.unique())],
        hover_name='player', hover_data=['min_start', 'min_end', 'shots_count', 'shots_made', 'shots_freq', 'shots_acc', ],
        render_mode='svg'
    )
    fig.update_coloraxes(colorbar=dict(title='Points per<BR>100 shots'))
    fig.update_traces(marker=dict(sizeref=2. * 30 / (max_bubble_size ** 2)))
    fig.update_yaxes(title="Player")
    fig.update_xaxes(title='Minute', tickvals=list(range(0, 54, 6)))

    return fig


app.title = 'Dash Demo - NBA'
team_names = all_teams_df.group.unique()
team_names.sort()

app.layout = html.Div([
    html.Div([
        # Body text is written in Markdown rather than as nested HTML components
        dcc.Markdown(
            """
            #### Shot Frequencies & Efficiencies (2019-20 NBA Season)

            This page compares players based on shot *frequency* and *efficiency*, divided up into minutes of regulation time for each team.

            Use the pulldown to select a team, or select 'Leaders' to see leaders from each team.

            *Notes*:

            * **Frequency**: A team's shots a player is taking, indicated by **size**.
            * **Efficiency**: Points scored per 100 shots, indicated by **colour** (red == better, blue == worse).
            * Players with <1% of team shots are shown under 'Others'
            """
        ),
        html.P([html.Small("See more data / NBA analytics content, find me on "),
                html.A(html.Small("twitter"), href="https://twitter.com/_jphwang", title="twitter"),
                html.Small("!")]),
    ]),
    html.Div([
        dcc.Dropdown(
            id='group-select',
            options=[{'label': i, 'value': i} for i in team_names],
            value='TOR',
            style={'width': '140px'}
        )
    ]),
    dcc.Graph(
        'shot-dist-graph',
        config={'displayModeBar': False}
    )
])


@app.callback(
    Output('shot-dist-graph', 'figure'),
    [Input('group-select', 'value')]
)
def update_graph(grpname):
    fig = make_shot_dist_chart(
        all_teams_df[all_teams_df.group == grpname], col_col='pl_pps', range_color=[90, 120], size_col='shots_freq')
    clean_chart_format(fig)
    # Group names longer than a three-letter team code get a taller chart
    if len(grpname) > 3:
        fig.update_layout(height=850, width=1250)
    else:
        fig.update_layout(height=500, width=1250)

    return fig


if __name__ == '__main__':
    app.run_server(debug=False)
Most of the changes are cosmetic, but I will note that here, I just write the body text in Markdown, and simply carry over my formatting functions from Plotly to format the graphs in Dash.
This saves me a tremendous amount of time in getting from data analysis and visualisation to something deployed in front of clients.
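One change that is not purely cosmetic is the height switch in update_graph. Team codes such as 'TOR' are three characters long, so the len(grpname) > 3 check only fires for a longer group name; the notes in the app mention a 'Leaders' option, which presumably lists more players and needs the extra room. That reading is an inference from the code, and chart_height below is a hypothetical helper that just isolates the branch:

```python
def chart_height(grpname):
    """Mirror the branch in update_graph: longer group names get a taller chart."""
    return 850 if len(grpname) > 3 else 500


print(chart_height("TOR"))      # 500
print(chart_height("Leaders"))  # 850
```

Pulling the rule into its own function like this would also make it easy to swap in something smarter later, such as scaling the height by the number of players.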
All in all, from starting with my initial graph, I think it probably took less than an hour to deploy it to Heroku. Which is pretty amazing.
I will get into more advanced features of Dash later, and actually do some cool things with it functionality-wise, but I was very happy with this outcome in terms of ease and speed.
Try it out yourself — I think that you’d be very impressed. Next time, I plan to write about some really cool things you can do with Dash, and building truly interactive dashboards.