The Value of Big Data in a Pandemic

Kairong Xiao
May 05, 2021



This article summarizes a study of the economic and public health effects of the Health Code app in China. By exploiting the staggered implementation of this technology across 322 Chinese cities, this study finds that the Health Code app significantly reduced virus transmission and facilitated economic recovery during the COVID-19 pandemic. A macroeconomic susceptible-infectious-recovered (SIR) model calibrated to the micro-level estimates shows that the technology reduced the economic loss by 0.5% of GDP and saved more than 200,000 lives by alleviating informational frictions during the COVID-19 outbreak.

Informational frictions are a key reason that the COVID-19 pandemic has developed into a global economic crisis. Because COVID-19 patients are often asymptomatic for an extended period during which the virus can be unknowingly transmitted, governments have been forced to shut down the whole economy to stop hidden carriers of the virus. Informational frictions also create fear among the public, causing a collapse of in-person economic activity.

Big data offers a promising solution to informational frictions. By leveraging the enormous amount of data produced in our digital age, big data technologies can provide a fast and cost-effective way to detect infections. Therefore, governments and private institutions around the world have experimented with numerous big data technologies, resulting in many technological advancements and rich data resources. These experiments can provide important lessons on the value of big data technologies in addressing socioeconomic problems caused by informational frictions.

Xiao (2021) examines the economic and public health impacts of the Health Code app during the COVID-19 outbreak in China. This app uses mobile location and digital transaction records to conduct contact tracing and provide health certifications. If the app holder has not been in contact with any COVID-19 patients in the past 14 days, the app generates a green QR code, as shown in Figure 1. However, if a potential contact is detected, the code turns yellow or red, and the holder must self-quarantine for 7 to 14 days until the code turns green again. These colored QR codes also serve as digital health certificates in public spaces such as airports, railway stations, and restaurants, where checkpoints are often set up to regulate population flows.

Figure 1: Health Code
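The color rule described above can be sketched in a few lines of Python. This is only an illustration of the 14-day lookback logic stated in the text, not the app's actual implementation. The names `CodeColor`, `LOOKBACK_DAYS`, and `health_code` are hypothetical, the article does not say what separates yellow from red (so the sketch lumps them together), and treating a contact exactly 14 days ago as still inside the window is an assumption.

```python
from datetime import date, timedelta
from enum import Enum


class CodeColor(Enum):
    GREEN = "green"
    # The article does not specify how yellow and red differ,
    # so this sketch uses one combined "flagged" state.
    YELLOW_OR_RED = "yellow_or_red"


LOOKBACK_DAYS = 14  # lookback window stated in the article


def health_code(contact_dates, today):
    """Return the code color given dates of detected potential contacts.

    contact_dates: iterable of datetime.date, one per potential contact.
    today: datetime.date on which the code is displayed.
    """
    window_start = today - timedelta(days=LOOKBACK_DAYS)
    # Assumption: a contact exactly 14 days ago still counts as recent.
    recent_contact = any(window_start <= d <= today for d in contact_dates)
    return CodeColor.YELLOW_OR_RED if recent_contact else CodeColor.GREEN


if __name__ == "__main__":
    today = date(2020, 2, 20)
    print(health_code([], today))                   # no contacts on record
    print(health_code([date(2020, 2, 10)], today))  # contact 10 days ago
    print(health_code([date(2020, 2, 1)], today))   # contact 19 days ago
```

A checkpoint at an airport or restaurant would then admit only holders whose code is green, which is what turns the private signal into a public certificate.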

The study exploits the app’s staggered introduction across 322 Chinese cities. To illustrate the empirical setting, consider the example of Hangzhou and Nanjing. On February 9, 2020, 17 days after Wuhan’s lockdown, Alipay, a FinTech company headquartered in Hangzhou, helped the Hangzhou municipal government develop the Hangzhou Health Code to replace manual contact tracing and physical health records. With the help of this app, millions of workers with green codes could return to work. Hangzhou’s economic activity, measured by daily greenhouse gas emissions, rose significantly after the launch of the Health Code app. Consumer confidence was also quickly restored as restaurants and shopping malls used the app to monitor health conditions at their business sites. Rapid economic recovery did not bring a resurgence of infections. Instead, new COVID-19 cases dropped dramatically in Hangzhou. In comparison, the neighboring city of Nanjing experienced little economic recovery but more infections during the same period.

This study shows that Hangzhou’s experience can be generalized to other places. Specifically, this study investigates high-frequency changes in cities’ economy and public health around the staggered implementation dates of the Health Code app across 322 Chinese cities using the event study techniques in Borusyak and Jaravel (2017). As shown in Figure 2, cities displayed no pre-trends in economic activity or infections before the Health Code launch. However, four weeks after the launch, the treated cities experienced a 24% increase in greenhouse gas emissions relative to the control cities. The rapid recovery in economic activity is visible in the satellite images of the nitrogen dioxide level over China produced by Google Earth, as shown in Figure 3.

Figure 2: Dynamic Effects of the Health Code Introduction on NO2 Emission

Figure 3: Satellite Images of NO2 Levels over China
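For readers unfamiliar with event-study designs, a generic specification of the kind used with staggered rollouts is written out here. It is a textbook form given for orientation only; the exact specification in the paper, which follows Borusyak and Jaravel (2017), may differ, and all symbols are introduced here for illustration.

```latex
y_{c,t} = \alpha_c + \lambda_t
        + \sum_{k \neq -1} \beta_k \, \mathbf{1}\{\, t - E_c = k \,\}
        + \varepsilon_{c,t}
```

Here $y_{c,t}$ is the outcome (for example, log emissions or infection growth) in city $c$ at time $t$, $E_c$ is the date on which city $c$ launched the Health Code app, $\alpha_c$ and $\lambda_t$ are city and time fixed effects, and $\beta_k$ measures the outcome $k$ periods from launch relative to the omitted period just before launch. In this notation, the absence of pre-trends corresponds to $\beta_k \approx 0$ for $k < 0$, and the reported four-week effect corresponds to the coefficient at the fourth week after launch.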

One might worry that local governments launched the Health Code app when they decided to reopen the economy. If so, the resumption in economic activity could be mechanically driven by the relaxation of the lockdown policy rather than the reduction of informational frictions. To address this concern, the study controls for the strictness of lockdown policies using the daily intensity of each city’s population movement, constructed from mobile location data. It also uses local governments’ public health emergency levels to control for other disease control measures, such as social distancing.

More importantly, this alternative explanation cannot explain the decline in local infections after the introduction of the Health Code app. As shown in Figure 4, the resumption of economic activity did not bring a resurgence of infections. Instead, local infection growth dropped significantly. Population inflows from outbreak epicenters also led to fewer local cases after the app was introduced.

Figure 4: Dynamic Effects of the Health Code App Introduction on COVID-19 Infection Rates

One might worry that city officials decided to launch the Health Code app when they observed signals that local transmission was decreasing. Individual behaviors such as social distancing, hand washing, and face masking may also change differently across cities over time. To address this concern, this study employs an instrumental variable approach by exploiting the fact that the Health Code app was developed by FinTech firms and distributed through popular payment apps such as Alipay. Therefore, cities with higher FinTech penetration were likely to launch the Health Code app earlier than those with low FinTech penetration. This alternative empirical strategy yields similar results: cities that launched the Health Code app early achieved greater economic recovery and lower infection growth.

The study further constructs a macroeconomic susceptible-infectious-recovered (SIR) model following Alvarez, Argente, and Lippi (2021) to evaluate the welfare implications of the Health Code app. The model includes big data technology that generates signals on agents’ health status, a lockdown policy that reduces virus transmission by depressing agents’ activity, and a residual parameter that captures other behavioral changes such as hand washing and face masking. Calibrated with parameters identified from micro-level data, the model shows that big data technology helped prevent a new wave of infections after lockdowns were lifted, saving around 200,000 lives and 0.5% of GDP in 2020. This result is striking because the Health Code app can only detect 53% of infections based on the estimation results, while typical viral and antibody tests have an accuracy rate above 95%. However, because these signals can be produced in real time and for millions of users at virtually no cost, this technology can significantly affect the aggregate economy even if the signal is only modestly accurate.
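The intuition that a modestly accurate signal can still matter in aggregate can be illustrated with a bare-bones epidemiological SIR simulation. This is a minimal sketch, not the paper's macroeconomic model: it has no economic side, no lockdown policy, and no residual behavioral parameter. The function `simulate_sir` and the values of `beta`, `gamma`, `days`, and `i0` are illustrative assumptions; only the 53% detection rate comes from the text. The sketch also assumes that every detected infectious agent is fully removed from circulation.

```python
def simulate_sir(days, beta, gamma, detect_rate, i0=1e-4):
    """Discrete-time SIR in population shares.

    beta:        transmission rate per day (illustrative assumption)
    gamma:       recovery rate per day (illustrative assumption)
    detect_rate: share of infectious agents flagged by the signal and,
                 by assumption here, fully isolated
    Returns the cumulative share of the population ever infected.
    """
    s, i = 1.0 - i0, i0
    for _ in range(days):
        # Only undetected infectious agents keep circulating.
        circulating = i * (1.0 - detect_rate)
        new_infections = min(beta * s * circulating, s)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
    # Everyone who left the susceptible pool, minus nothing else,
    # plus the initial seed, has been infected at some point.
    return 1.0 - s


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the mechanism.
    without_signal = simulate_sir(days=300, beta=0.3, gamma=0.1, detect_rate=0.0)
    with_signal = simulate_sir(days=300, beta=0.3, gamma=0.1, detect_rate=0.53)
    print(f"ever infected, no signal:  {without_signal:.3f}")
    print(f"ever infected, 53% signal: {with_signal:.3f}")
```

Because isolation scales the effective transmission rate by one minus the detection rate, even a signal that misses almost half of infections lowers the cumulative infected share in this toy setting. The magnitudes it prints are not comparable to the paper's calibrated results.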

Some observers attribute China’s successful containment of the COVID-19 pandemic to its strict lockdown policies. To evaluate the Health Code app’s contribution relative to the lockdown policies, this study simulates a counterfactual economy in which lockdown policies were not implemented at the end of January 2020. In that case, infections would have kept rising in the first two months of 2020 until the Health Code app and other disease control measures were rolled out. Overall, the strict lockdown policies saved 74,000 lives but caused a loss of 1.1% of GDP in 2020. Comparing the effects of the lockdown policies with those of the Health Code app, this study finds that the app had a similar, if not greater, effect in containing the outbreak without inflicting steep economic costs.

Although big data technologies could be effective tools for alleviating informational frictions, they raise concerns about privacy infringement. Due to privacy concerns, many COVID-19 apps, such as those developed by Apple and Google, avoid linking the contact history to the user’s identity. Instead, they use a “private notification” model in which an anonymous message is sent to the holder when the user is exposed to the virus. Under this private notification model, public health authorities or private businesses cannot use the notification, or the absence thereof, to monitor human flows in public spaces. App users have full discretion over whether to self-quarantine after receiving the notification. The counterfactual simulation shows that making the signal privately observable can drastically change the outcomes, even when the signals’ accuracy remains constant. Under the assumption that 40% of the agents who receive bad signals decide to self-quarantine, the death toll would have been 16,000 higher than the baseline simulation. The economic value created by the Health Code app would decrease by 65%. This counterfactual exercise suggests that there is a trade-off between protecting privacy and resolving informational frictions. Keeping information private to users may better protect privacy, but it may perpetuate the exact informational frictions that lead to suboptimal social outcomes.
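One way to see why private notification weakens the technology is to compute the share of infectious agents who end up isolated under each regime. The sketch below is a back-of-the-envelope illustration, not the paper's counterfactual simulation. The 53% detection rate and the 40% self-quarantine rate come from the text; the function name `isolated_share` and the assumption that a publicly checked code implies full compliance are hypothetical.

```python
def isolated_share(detect_rate, comply_rate):
    """Share of infectious agents who are both flagged and isolated."""
    return detect_rate * comply_rate


DETECT_RATE = 0.53     # detection rate reported in the article
PRIVATE_COMPLY = 0.40  # self-quarantine rate assumed in the counterfactual
PUBLIC_COMPLY = 1.0    # assumption: checkpoints enforce quarantine fully

public_share = isolated_share(DETECT_RATE, PUBLIC_COMPLY)
private_share = isolated_share(DETECT_RATE, PRIVATE_COMPLY)

if __name__ == "__main__":
    print(f"isolated under public certificate:   {public_share:.3f}")
    print(f"isolated under private notification: {private_share:.3f}")
```

Under these assumptions the signal's accuracy is unchanged, but the share of infectious agents actually removed from circulation falls by more than half, which is the channel behind the trade-off described above.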

(Kairong Xiao is from Columbia University.)


Alvarez, Fernando, David Argente, and Francesco Lippi. “A Simple Planning Problem for COVID-19 Lockdown, Testing, and Tracing.” American Economic Review: Insights (forthcoming).

Borusyak, Kirill, and Xavier Jaravel. 2017. “Revisiting Event Study Designs.”

Braithwaite, Isobel, Thomas Callender, Miriam Bullock, and Robert W. Aldridge. 2020. “Automated and Partly Automated Contact Tracing: A Systematic Review to Inform the Control of COVID-19.” Lancet Digital Health 2 (11): E607–21.

Xiao, Kairong. 2021. “The Value of Big Data in a Pandemic.”


