TL;DR: Here’s our project that we cranked out in 53 hours from conception to deployment.
The novel coronavirus, which causes “coronavirus disease 2019” or “COVID-19”, has set off an ongoing global pandemic affecting the lives of virtually every person on Earth. In an effort to slow the rate of infection in our local communities, and as a result the global community, public officials around the world are ordering bars to close, restaurants to offer only take-out or delivery, and every non-essential member of the public to stay home (i.e. anyone not directly involved in mitigating the spread, tending to the sick, or feeding the people sheltering at home).
With all these public declarations of “stay calm and stay home” arriving while we are inundated with 24/7 coverage of the virus, not only on television but also in our social media news feeds, the following question comes up:
Our public officials are making us stay home.
How effective is it?
This was the question we wanted to tackle for the Pandemic Response Hackathon. You won’t find our project in the submissions (mainly because we didn’t finish in time), which is why I decided to detail it here, to give each of our team members credit for their efforts.
What was our solution and what does it do?
Our solution was a live-updating graph. Not a dashboard (like a lot of tech creations coming out of this crisis*), but an almost-live feedback visualization that shows state and local governments how effective their message to their citizens has been (judging by this video from Los Angeles, not very).
We had a few directions in mind when it came to what to plot. For this hackathon we started with plotting the confirmed cases over time, with the confirmed cases displayed as a semi-log series. There would then be a vertical bar overlay detailing what sort of public action the state government took on a certain day. This would be displayed for each state, selectable by a simple dropdown. Essentially, this plot conveys the correlation of implemented public policies to the rate of confirmed infections.
*Not to diminish dashboards—some of them are quite complex and beautiful, but with how abundant the data is, it’s really easy for one to fall into despair about how bad things are.
How we built it
Here is our source data of infections over time, and here is our source data of policy actions addressing the virus, by state.
Our calculations were written in Python. We used Jupyter Notebook and the data science library pandas extensively to design the data manipulation and calculations, then implemented the resulting algorithms as a function. We scraped the policy actions page using Beautiful Soup and processed the data, grouping policies by state and reducing policies into policy types.
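To give a feel for the "group by state, reduce to policy types" step, here is a minimal sketch in pandas. It is not our actual code: the sample rows, the keyword table, and the names `classify` and `policies_by_state` are all made up for illustration, and the Beautiful Soup scraping step is skipped entirely (the rows stand in for whatever the scraper returns).

```python
import pandas as pd

# Made-up rows standing in for scraped policy actions (not real data).
raw_policies = [
    {"state": "State A", "date": "2020-03-15", "policy": "Bars ordered to close"},
    {"state": "State A", "date": "2020-03-19", "policy": "Residents ordered to stay home"},
    {"state": "State A", "date": "2020-03-21", "policy": "Stay-at-home order extended"},
    {"state": "State B", "date": "2020-03-16", "policy": "Restaurants limited to take-out and delivery"},
    {"state": "State B", "date": "2020-03-20", "policy": "Schools closed"},
]

# Hypothetical keyword -> policy type table; a real one would be much longer.
POLICY_TYPES = {
    "stay home": "stay_at_home",
    "stay-at-home": "stay_at_home",
    "bars": "business_closure",
    "take-out": "restaurant_restriction",
}


def classify(text):
    """Reduce a free-text policy description to a coarse policy type."""
    lowered = text.lower()
    for keyword, policy_type in POLICY_TYPES.items():
        if keyword in lowered:
            return policy_type
    return "other"


def policies_by_state(rows):
    """Return the earliest date each policy type appears in each state."""
    df = pd.DataFrame(rows)
    df["date"] = pd.to_datetime(df["date"])
    df["policy_type"] = df["policy"].map(classify)
    return (
        df.groupby(["state", "policy_type"], as_index=False)["date"]
        .min()
        .sort_values(["state", "date"])
        .reset_index(drop=True)
    )


print(policies_by_state(raw_policies))
```

Each row of the result is one candidate vertical bar for the chart: a state, a policy type, and the first day that type of action shows up.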
Our architecture is serverless. We utilized the AWS Lambda service—our calculations are triggered as a cron job, scheduled for once a day, every day. The resulting generated data is then pushed to the frontend via Now. The frontend is built with the Gatsby Framework. From there the data is displayed using the Recharts charting library.
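For readers who have not used Lambda, the daily job boils down to a handler function that AWS calls on a schedule. The sketch below is an assumption-heavy illustration rather than our deployed code: `compute_state_series` is a hypothetical stand-in for the real calculation, the payload shape is invented, and the step that hands the data to the frontend is left out.

```python
import json
from datetime import datetime, timezone


def compute_state_series():
    """Hypothetical stand-in for the real pandas calculation step."""
    # Made-up numbers, only here so the handler has something to return.
    return {"State A": [{"date": "2020-03-15", "confirmed": 10}]}


def lambda_handler(event, context):
    """Entry point that AWS Lambda invokes; a scheduled trigger calls it daily.

    A schedule expression such as cron(0 12 * * ? *) would fire once a day
    at 12:00 UTC. The actual time of day we used is not recorded here, so
    treat that value as an example.
    """
    payload = {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "states": compute_state_series(),
    }
    # Publishing the payload to the frontend would happen here (omitted).
    return {"statusCode": 200, "body": json.dumps(payload)}


if __name__ == "__main__":
    # Local smoke test: call the handler the way the scheduler would.
    print(lambda_handler({}, None)["statusCode"])
```

Keeping the calculation in its own function means the same code can run in a notebook, in a local test, and inside the handler.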
Challenges I ran into
For some of us, this was our first hackathon. For all of us, this was our first remote hackathon. The first challenge we had to overcome was coordinating goals, roles, and responsibilities remotely. Other than the initial brainstorm meeting via Zoom, we had no idea how our teammates were doing: whether they were encountering issues with setting up the server, were stuck at a GitHub integration step because they didn’t have the right permissions, or were physically AFK and outside on a run (it’s one of the dwindling few ways of staying sane these days). The only way of knowing how everyone was doing was via our Slack channel, and it looked like everyone handled their duties and responsibilities quite well.
Accomplishments that I'm proud of
I am proud not only of contributing my skills to the global effort to figure out how to mitigate this pandemic, but even more so of putting together the awesome team that I had. By extension, I am proud of the project that came out of this effort.
Who built this
Dean Lalap - Software Engineer at Medtronic
Erik Uggeldahl - Developer Relations Manager at You.i TV
Masa Maeda - Automation Engineer at Taco Bell
Michelle Yuan - Software Engineer at Google
What we learned
Erik: “I learned Gatsby has a very roundabout way of injecting data, its documentation is lacking in some areas, and that Now is still dope”
Masa: “I found this. It basically simulates the Github Actions environment so you can test your actions out without committing. Pretty sweet. Also learned how to use github actions, it took like 20 mins to get everything working.”
Dean: “pandas kicked my ass”
Michelle (To me): “RIP. you feelin my feels. pandas suck. jupyter is pretty tho”
What’s Next?
One plot we wanted to explore was the direct correlation between public policies implemented and deaths due to COVID-19. Going by confirmed cases over time alone misrepresents the situation: while few tests are being run it looks like almost no one is infected, and as more tests become available and are performed, confirmed infections rise. Even then, the count misses individuals who actually have the sickness but are never tested.
Another plot we wanted to explore (actually, just me, Dean) was the log-log graph of new confirmed cases over the past seven days against total cumulative confirmed cases, as showcased in this MinutePhysics video “How To Tell If We’re Beating COVID-19,” with our public declaration vertical line overlay added on top. We would need to compute our numbers on a global basis rather than state by state. Yes, this would be a lot of work, including scraping coronavirus-related headlines on a global scale, which would introduce the challenge of scraping relevant but non-English content. On top of that, this would fall into the bias pit of “confirmed infections does not mean all infections,” leaving the dataset incomplete. Using the death count may be a better idea.
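The data behind that kind of plot is simple to derive from a cumulative series. Here is a rough sketch in plain Python, assuming one cumulative count per day; the function name `weekly_vs_total` and the sample numbers are invented for illustration, and nothing like this exists in our project yet.

```python
def weekly_vs_total(cumulative, window=7):
    """Pair each day's cumulative total with the new cases from the prior window.

    Days without a full window behind them are skipped, and so are days
    with no new cases, because zero cannot be drawn on a logarithmic axis.
    """
    points = []
    for i in range(window, len(cumulative)):
        total = cumulative[i]
        new_cases = total - cumulative[i - window]
        if new_cases > 0:
            points.append((total, new_cases))
    return points


# Made-up cumulative counts: doubling daily at first, then slowing down.
cumulative = [1, 2, 4, 8, 16, 32, 64, 128, 256, 300, 330]

for total, new_cases in weekly_vs_total(cumulative):
    print(total, new_cases)
```

Plotted with both axes on a log scale, the points track a straight line while growth stays exponential and fall away from it once new cases slow, which is the signal a policy overlay would be trying to line up with.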
In any case, given the current circumstances of being stuck—err, safe at home until at least the end of April and the looming threat of a spike in mortality rate, we may or may not push a few more git changes.