This week, we talk to Dr. Chansoo Kim, research scientist at the Computational Science Center in the Korea Institute of Science and Technology (KIST). He provides insights on how data and simulation models can be used to track and predict the COVID-19 spread, what effect possible mitigation strategies can have on transmission patterns, as well as how his research informs government policies in Korea.
In the early stage of the COVID-19 epidemic, many groups, especially those working in the simulation research field, performed their research rather independently. As the situation got worse, the Korean Ministry of Science and ICT (MSICT) tried to organise a group to support access to data from central and local governments and cellular service providers. This joined-up approach is not compulsory but autonomous and voluntary, so each group’s independence is highly respected and each is able to do its own analyses.
Our group has been focusing on macroscale simulations of disease transmission, using complex-system concepts and statistical mechanics, which include individual-based/space-based as well as data-driven approaches. Using data-driven approaches, including statistical analyses based on the concept of ‘distributions’, inference and deep learning focusing on the influence of outliers, we obtained the parameters needed to perform simulations, e.g. the (initial) disease transmission rate, basic reproduction number, interaction strength, (initial) death rate and social characteristics. We lean on complex systems to build our simulation scheme: it is similar to many-body systems in statistical physics, which describe what situation emerges among many billiard balls using information about the interaction between two (arbitrary) balls.
Our simulation tool is called KIST (KIST’s Individual-based Simulation Toolkit for Transfer phenomena). It allows us to see how individuals’ movement can influence the spread of the disease. The tool includes almost fifty million simulated individuals with geographical information, which matches the real size of the Korean population.
We have been supported by my home institute, KIST (Korea Institute of Science and Technology), and MSICT for this subject. We believe the relatively long-term support was effective and important. Without it, it would have been difficult to develop, maintain and improve a simulation platform of this sort in our group, and it would not have been easy to produce various simulation results for COVID-19.
Our research has been focusing on how the disease spreads through society, what the optimal strategies to mitigate it could be and what changes would occur under those mitigation policies. Our results have therefore been used by governmental institutions such as MSICT, the Central Disaster and Safety Countermeasures Headquarters, and Cabinet meetings to decide which policies to follow based on robust scientific evidence.
Even though our research includes making a good guess about when the pandemic will end, we focus instead on obtaining results on the efficacy and effect that different policies can have. We then use those results to decide which policy is more applicable to our society and provide scientific reasons to support our claims to the government. Many people, even research communities, are only interested in correctly guessing when the pandemic will end, but we believe that a series of simulations is more meaningful when it shows how differently the spread of the disease would develop in society if a specific policy were chosen.
Due to the characteristics of simulations and analysis, it is also important to decide which data to use, including disease information such as the infection rate, which cannot be easily obtained in the early stages of the virus spread. For example, even R0, the basic reproduction number, is very socially dependent and includes not only disease information but also social meaning, so it requires heavy social analysis. In other words, R0 by itself is clearly not enough to capture all the characteristics of COVID-19, so we need more factors as well as more data.
As an example, for the simulation status of each individual person (not a group of people), we may choose S-I-R (Susceptible-Infected-Recovered), S-L-I-R (adding Latent) or S-L-I-Q-R (adding Quarantined), and this decision raises various questions about the data, such as what is available, how detailed it is and which proxies can be used.
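The choice between these per-individual status schemes can be sketched as follows. This is a minimal illustration only, not the toolkit's actual code: the names `State`, `SCHEMES` and `step`, and the use of one fixed advance probability per state, are assumptions made for the example. In particular, a real individual-based model would derive the S-to-next transition from contacts with infected individuals rather than from a constant.

```python
import random
from enum import Enum


class State(Enum):
    S = "susceptible"
    L = "latent"
    I = "infected"
    Q = "quarantined"
    R = "recovered"


# Each scheme is an ordered chain of states, following its acronym.
SCHEMES = {
    "SIR": [State.S, State.I, State.R],
    "SLIR": [State.S, State.L, State.I, State.R],
    "SLIQR": [State.S, State.L, State.I, State.Q, State.R],
}


def step(state, scheme, p_advance, rng):
    """Move one individual to the next state of the chain with
    probability p_advance[state] (hypothetical parameter); otherwise
    the individual keeps the current state."""
    chain = SCHEMES[scheme]
    idx = chain.index(state)
    if idx == len(chain) - 1:
        return state  # the last state (recovered) is absorbing
    if rng.random() < p_advance.get(state, 0.0):
        return chain[idx + 1]
    return state
```

Each extra state (L, Q) adds a transition probability that has to be estimated, which is one way to see why the choice of scheme depends on the data available.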
Ironically, this severe COVID-19 pandemic proved that we need long-term support for simulation research. It also demonstrates the reliability of simulations a bit more clearly and shows that we cannot predict the future with full accuracy but can only estimate the effects that given policies can have.
We need to better explain how ‘micro-motivation’ can change ‘macro-behavior’ (this is a homage to Prof. Thomas Schelling, a Nobel laureate in economics): how ‘one’ person who tries to avoid meeting people in a crowded place can mitigate the spread. As experts, we try not only to provide simulation results to the public as a form of ‘signaling’, but also to publicise simple but meaningful diagrams.
Countries such as the UK are utilizing simulation tools to make firm policy decisions and warn people based on the science. Our results have also been used by policymakers as well as introduced in major newspapers, which worked as a public signaling tool.
Diagrams and animations are shared on social networks such as YouTube in collaboration with MSICT.
Simple diagrams/animations definitely help communicate our findings to the public:
Case 1A. Conceptual diagram: Normal social contacts. (CK)
Case 1B. Conceptual diagram: Social distancing – just one person is out of the line. (CK)
We also show a simple simulated society with a graph of the number of infected people (time on the abscissa and the number of infected people on the ordinate).
Case 2A. Conceptual diagram: Normal social contacts. (CK)
Case 2B. Conceptual diagram: Social distancing (CK)
Our simulation results have been used to establish, assess and estimate government policies such as social distancing, overseas inflow control and the planning of school opening schedules. Since our simulation tool is based on individuals, we are able to impose an individual-based policy. In the simulation world, we can tweak individuals’ actions.
For social distancing, we ask individuals (in the simulation) to reduce the number of people they meet in a day from the normal average of 7-10 to an average of 2-4 daily encounters. In the simulation world, we vary their overall compliance rate, which naturally generates various situations.
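As a rough sketch of how a compliance rate could translate into daily encounters, the snippet below draws each individual's contact count from the 2-4 range if they comply and from the 7-10 range otherwise. The function name `daily_contacts`, the uniform integer draw within each range and the population size are assumptions for illustration; the interview does not describe how the toolkit samples contacts.

```python
import random

NORMAL_RANGE = (7, 10)    # normal daily encounters, as quoted in the interview
DISTANCED_RANGE = (2, 4)  # daily encounters under social distancing


def daily_contacts(population, compliance_rate, rng):
    """Return one day's contact count for each individual.

    Each individual independently complies with probability
    compliance_rate (hypothetical parameter); the uniform draw
    within each range is an assumption of this sketch."""
    contacts = []
    for _ in range(population):
        complies = rng.random() < compliance_rate
        low, high = DISTANCED_RANGE if complies else NORMAL_RANGE
        contacts.append(rng.randint(low, high))
    return contacts


if __name__ == "__main__":
    rng = random.Random(42)
    for rate in (0.0, 0.5, 1.0):
        sample = daily_contacts(10_000, rate, rng)
        print(f"compliance {rate:.1f}: mean contacts {sum(sample) / len(sample):.2f}")
```

Sweeping the compliance rate between 0 and 1 moves the population's mean contact count between the two ranges, which is the kind of 'various situations' a simulation can then explore.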
One of our simulations compares the effects of wearing masks and washing hands. Since each individual has different action patterns, the spread of COVID-19 plays out differently in each case.
We also vary the degree of control on overseas inflow as proposed policy schemes: how strictly we ask infected people coming from overseas to quarantine themselves. In more detail, the policy is implemented by applying an IT-based tracking system to them.
For Korea, it was important to decide when to reopen schools. We have a kindergarten-elementary-middle-high school system, in which students attend schools by age. Since our individual-based simulation has 'age' as an attribute of each individual, we can apply a different school opening date to each age group. We ask (simulated) high, elementary and middle schoolers to start attending at intervals of two to three weeks, which was the real policy in Korea. For example, in the simulation world, high-schoolers start attending school at some point. Elementary schools and kindergartens open their sites two weeks later. Middle school students begin attending after two more weeks. We simulated the suggested policy scheme before a decision was made with MSICT.
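The staggered schedule in this example can be written down as an age-based rule. The sketch below is illustrative only: the age boundaries for each school level are approximate assumptions, and `OPENING_DAY`, `school_level` and `attends_school` are hypothetical names. Only the ordering (high school first, elementary and kindergarten two weeks later, middle school two weeks after that) comes from the interview.

```python
from typing import Optional

# Days after the first reopening on which each level opens,
# following the example schedule described in the text.
OPENING_DAY = {"high": 0, "elementary": 14, "kindergarten": 14, "middle": 28}


def school_level(age: int) -> Optional[str]:
    """Map an individual's age to a school level.

    The age boundaries are approximate and assumed for this sketch."""
    if 3 <= age <= 5:
        return "kindergarten"
    if 6 <= age <= 11:
        return "elementary"
    if 12 <= age <= 14:
        return "middle"
    if 15 <= age <= 17:
        return "high"
    return None  # not of school age


def attends_school(age: int, day: int) -> bool:
    """True if an individual of this age attends school on the given day."""
    level = school_level(age)
    return level is not None and day >= OPENING_DAY[level]
```

For instance, `attends_school(16, 0)` is true while `attends_school(13, 20)` is false, because middle schools open on day 28 in this example schedule.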
"Our simulation results have been utilized to establish, assess and estimate the government policies such as social distancing, overseas inflow control and planning school opening schedules."
We believe data is very important, especially for simulation studies. Of course, shared information and open data are useful in (1) modifying and expanding the simulation model, (2) putting more data into the simulation and (3) fitting and matching the simulation to the real data. However, more data does not guarantee better results.
Also, private information such as the movement patterns of infected people is not so helpful for forecasting and estimating, because it is ‘past’ data. There is no guarantee that people will move in the same way as before. To overcome this issue, our toolkit introduces randomness, albeit based on real big data.
For simulations, we need data about human traffic. Private data is not strongly required, but open government data at the level of a distribution or histogram is very important.
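To illustrate why histogram-level data can be enough, the sketch below draws random movements for simulated individuals from an aggregate distribution instead of replaying any one person's recorded path. The histogram values, the distance bands and the function name `sample_trips` are made up for this example and are not taken from any real dataset or from the toolkit.

```python
import random

# Hypothetical histogram: share of daily trips by distance band.
TRIP_HISTOGRAM = {"0-1 km": 0.50, "1-5 km": 0.30, "5-20 km": 0.15, "20+ km": 0.05}


def sample_trips(histogram, n, rng):
    """Draw n trip-distance bands at random, weighted by the histogram."""
    bands = list(histogram)
    weights = list(histogram.values())
    return rng.choices(bands, weights=weights, k=n)


if __name__ == "__main__":
    rng = random.Random(7)
    print(sample_trips(TRIP_HISTOGRAM, 5, rng))
```

Because only the aggregate shape is used, no individual's private trajectory is needed, and every simulation run produces a different but statistically consistent set of movements.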
Our group belongs to various research societies for COVID-19. We share data, especially the number of infected, dead and recovered people with spatio-temporal stamps such as county/city and day. This helps us in (1) fitting our simulation results, which have spatio-temporal information, to the real data and (2) obtaining interaction parameters among people by analyzing the data.
All our interviews reflect the views and opinions of the interviewees.
Chansoo Kim is a research scientist as well as an economist at the Computational Science Centre of the Korea Institute of Science and Technology (KIST) in the Republic of Korea. He leads the group ‘Information Machine Learning Financial Econ Lab’, which works on the science of information and complex (adaptive) systems.
His research focuses on heavy-tailed distributions such as leptokurtic distributions, which are among the non-Gaussian behaviours, and their applications to learning, reasoning, finance and inequality. One of his research and academic ancestors is C. F. Gauß. He has been working hard to be a good scientist who stands in the intellectual tradition of Neruda, Szymborska, Lacan, Levinas, Hirofumi, Keynes, Mandelbrot, Boulding, Prigogine, Shannon and Boltzmann.
For the COVID-19 pandemic, his group has performed a series of macroscale individual-based simulations on parallel computers for policy development and assessment with the Korean government. He obtained his B.Sc. in Computer Science from Seoul National University (2003), his M.A. focusing on Statistical Physics from the Massachusetts Institute of Technology (2008) and his PhD in Economics (Mathematical Finance and Machine Learning) from Seoul National University (2020).