Covid-19 and data science: Modelling a pandemic

By Mark Lambrecht

According to media reports, analysed data for South Africa suggests that the current wave of infections (wave 3) will end around the end of August 2021, and estimates indicate that a fourth wave could start around 2 December. That wave is expected to be comparable to wave 3, to coincide with a new variant of the virus, and to last about 75 days. This kind of reporting demonstrates how data science has played a crucial role in managing the impact of the COVID-19 pandemic.

The practical application of theoretical knowledge during this crisis has ranged from improving situational awareness to conducting epidemiological modelling, performing network analytics, and even running randomised controlled trials. It remains important to keep using different approaches and to incorporate new information constantly, so that forecasts are as accurate as possible. The appearance of new virus variants is notably difficult to predict and could affect the assumptions of every model, which points to the need for an agile data science environment that can work robustly with new information.

It has become evident that a single data set often reveals only one piece of the puzzle, whereas linking multiple data sources makes a much bigger narrative visible. This data linkage is also critical to understanding, amongst other things, the social determinants of health, which inform effective policy and intervention. Simply put, data and its analysis have made a discernible difference in how society reacts to this crisis. They are helping governments and healthcare organisations around the world find more effective ways to deliver better outcomes.

If anything, the pandemic has highlighted the need for access to even more data, and to advanced technologies that can analyse that data as effectively and quickly as possible. The reason is clear: even though South Africa has recently seen the third wave of infections decline, there is continued concern about shortages in hospital resources, including hospital beds, ventilators and personal protective equipment.

Data science, and the associated modelling, can help government health organisations and hospitals predict patient hospitalisation rates and plan resources accordingly. Throughout this, partnerships between public and private sector entities are crucial. The former can provide an enabling environment to effect change, while the latter can deliver the technology and data analytics needed to inform decision-making.

Governments do hold much of the critical data needed to understand current conditions during an outbreak. However, it is analytics that gives them the ability to combine this data with non-health data (such as social indicators) and non-governmental data, and to get the most insight from the unified result. Additionally, analytics can deliver insights about the spread of a disease and the effectiveness of public health action, which can improve the response.

Of course, analysing data is a complex process, made harder by the rate at which the data is growing, especially given how quickly the pandemic spread globally. Given the life-and-death scenarios that have played out numerous times over the past 18 months or so, data analysis must be held to a higher standard if it is to yield insights that stem the spread of the virus. Overcoming the crisis therefore comes down largely to collecting data, using it to understand more about how COVID-19 spreads, and harnessing the insights to prevent or limit the effects of future pandemics.

Epidemiologists track disease outbreaks using statistics. Cumulative frequency graphs and exponential growth curves have been shared widely to help visualise the growth of the disease and understand when that growth might be peaking. Mapping dashboards, meanwhile, show where cluster outbreaks are occurring. This empowers government and health officials to develop area-specific models and dashboards that help allocate critical healthcare resources. It can also inform region-specific lockdown regulations and highlight hot zones where infections are high.
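
To make the idea of cumulative and exponential growth curves concrete, here is a minimal Python sketch. It is only an illustration, not a method described in this article: the function names and the daily case counts are invented, and it assumes new cases grow roughly exponentially over the period being examined.

```python
import math
from itertools import accumulate


def cumulative_cases(daily_cases):
    """Running total of daily case counts (the cumulative curve)."""
    return list(accumulate(daily_cases))


def growth_rate(daily_cases):
    """Least-squares slope of log(cases) against the day index.

    Assumes cases follow roughly c0 * exp(r * t) and returns r per day.
    Days with zero cases are skipped because log(0) is undefined.
    """
    points = [(t, math.log(c)) for t, c in enumerate(daily_cases) if c > 0]
    if len(points) < 2:
        raise ValueError("need at least two days with non-zero cases")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    return sxy / sxx


def doubling_time(rate):
    """Days needed for cases to double at a constant exponential rate."""
    if rate <= 0:
        raise ValueError("cases are not growing")
    return math.log(2) / rate


# Hypothetical daily counts, invented purely for illustration.
daily = [10, 20, 40, 80, 160]
totals = cumulative_cases(daily)
rate = growth_rate(daily)
print(totals)
print(round(doubling_time(rate), 2))
```

A doubling time that lengthens from one week to the next is one simple signal that growth may be approaching a peak. Real surveillance data is far noisier than this example, so in practice such estimates are smoothed and reported with uncertainty.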

The pandemic has shown how real-world data science entails being part of interdisciplinary teams. It is not about the beauty of an algorithm or the choice of programming language. It is about having a concerted will, a unified mission, and a common goal to analyse data across environments to help improve responses and provide guidance on where best to allocate resources.

(Mark Lambrecht, Director of Global Health and Life Sciences Practice at SAS).

