COVID-19 relief in Togo
Since the onset of the COVID-19 pandemic, more than 100 million individuals are estimated to have transitioned into extreme poverty. In response, social assistance programs have provided more than US$800 billion in cash transfer payments to over 1.5 billion people across the globe (roughly one fifth of the world’s population). Eligibility for these programs is often determined using a proxy for socioeconomic status, such as recent household income data. However, income data is often not formally recorded or available in low-income and lower-middle-income countries, where economic activity is often informal and based on home-produced agriculture.
Aiken et al. recently published an article in Nature (open access) demonstrating that machine learning applied to phone metadata can accurately estimate the income of individual phone subscribers and, consequently, identify the poorest subscribers as eligible for humanitarian aid. Their results are based on population targeting with Novissi, a flagship emergency social assistance program carried out in Togo in 2020. “Novissi was built and designed in order to help those people who are the most vulnerable population and the most impacted by the anti-COVID measures,” the Togolese minister C. Lawson explained.
Novissi was initially designed to benefit informal workers in Greater Lomé (the large metropolitan area surrounding the capital city), where lockdown orders were first focused. The rationale for targeting informal workers was that they were more likely to be vulnerable and more likely to be affected by the lockdown orders. Initial eligibility for Novissi was determined based on a national voter registry that had been updated in late 2019. Benefits at first targeted individuals who met three criteria: (1) ‘self-targeted’ by dialing in to the Novissi platform and entering basic information from their mobile phone; (2) registered to vote in the Greater Lomé region; and (3) self-declared to work in an informal occupation.
Machine learning
Aiken et al.’s research focused on helping the government expand the Novissi program to target the poorest individuals in rural regions of the country. Eligible beneficiaries received bi-weekly payments of roughly US$10. Under the critical time pressures of a socioeconomic crisis, these machine-learning targeting strategies offer a new way to disburse grants to those most in need of assistance. The Novissi program had sufficient funds to target 29% of eligible registrants. For context, this 29th percentile corresponds to a consumption threshold of US$1.18 per day in the 2020 phone survey dataset, which falls below the extreme poverty line of US$1.43 per day. The extreme poverty line is defined as three-quarters of the international poverty line of US$1.90 per day.
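The arithmetic behind these thresholds can be illustrated with a minimal sketch. It assumes each registrant already has a predicted daily consumption value; the function name `select_poorest` and the example figures are hypothetical and are not taken from the paper.

```python
import math

INTERNATIONAL_POVERTY_LINE = 1.90  # US$ per day, as stated in the text
# Three-quarters of the international line: 1.425, reported as US$1.43.
EXTREME_POVERTY_LINE = 0.75 * INTERNATIONAL_POVERTY_LINE


def select_poorest(consumption_by_id, budget_share):
    """Return the ids of the poorest `budget_share` fraction of registrants.

    `consumption_by_id` maps a registrant id to predicted consumption
    (US$ per day). The count is rounded down so the budget is not exceeded.
    """
    n_selected = math.floor(len(consumption_by_id) * budget_share)
    ranked = sorted(consumption_by_id, key=consumption_by_id.get)
    return ranked[:n_selected]


# Hypothetical predicted consumption for ten registrants.
example = {
    "r1": 0.80, "r2": 2.10, "r3": 1.10, "r4": 3.50, "r5": 0.95,
    "r6": 1.60, "r7": 4.20, "r8": 1.30, "r9": 2.75, "r10": 1.05,
}
selected = select_poorest(example, 0.29)  # budget covers 29% of registrants
```

With ten registrants and a 29% budget, only two people can be funded, so the sketch returns the two lowest predicted consumption values.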
Methods and data sources
The core analysis relies heavily on two surveys. The first, which is nationally representative and included 6,171 people, was conducted in the field in 2018 and 2019. The second included 8,915 people and was conducted over the phone in September 2020. From this survey data, researchers constructed four poverty outcomes: consumption expenditure, an asset-based wealth index, a poverty probability index (PPI), and a proxy means test (PMT). Consumption expenditure was based on disaggregated expenditures for more than 200 food and non-food items. The asset index was dominated by variation in ownership of three main assets (toilet, radio, and motorcycle) but also included a host of other assets. The PPI was scored from ten household questions, including region of residence, education of adults and children, asset ownership, and consumption of sugar. PMT scores were constructed from the 12 asset and demographic variables that are jointly most predictive of per capita household consumption.
Mobile phone metadata contained the following information. Calls: caller phone number, recipient phone number, date, time, and duration of the call, and ID of the cell tower through which the call is placed. SMS messages: sender phone number, recipient phone number, date and time of the message, and ID of the antenna through which the message is sent. Mobile data usage: phone number, date and time of the transaction, and amount of data consumed. Mobile money transactions: sender phone number, recipient phone number (if peer-to-peer), date, time, and amount of the transaction, and broad category of the transaction type (cash in, cash out, peer-to-peer or bill pay).
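One way to picture these four record types is as simple typed records. The sketch below is illustrative only: the field names follow the description above, but the class names, Python types, units, and category strings are assumptions and do not reflect the operators' actual data format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class CallRecord:
    caller: str
    recipient: str
    timestamp: datetime       # date and time of the call
    duration_seconds: int     # unit is an assumption
    tower_id: str             # cell tower through which the call is placed


@dataclass(frozen=True)
class SmsRecord:
    sender: str
    recipient: str
    timestamp: datetime
    antenna_id: str           # antenna through which the message is sent


@dataclass(frozen=True)
class DataUsageRecord:
    subscriber: str
    timestamp: datetime
    bytes_consumed: int       # unit is an assumption


@dataclass(frozen=True)
class MobileMoneyRecord:
    sender: str
    recipient: Optional[str]  # present only for peer-to-peer transfers
    timestamp: datetime
    amount: float
    category: str             # e.g. "cash_in", "cash_out", "peer_to_peer", "bill_pay"


# A made-up call record for illustration.
example_call = CallRecord(
    caller="+22890000001",
    recipient="+22890000002",
    timestamp=datetime(2020, 9, 1, 14, 30),
    duration_seconds=75,
    tower_id="T1",
)
```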
From these data, the researchers derived features ranging from general statistics (for example, number of calls or SMS messages, or balance of incoming versus outgoing transactions), to social network characteristics (for example, number and diversity of contacts), to measures of mobility based on cell tower locations (for example, number of unique towers and radius of gyration). The number and duration of outgoing international transactions were calculated using country codes associated with phone numbers, for both calls and SMS messages.
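A few of these measures can be sketched for a single subscriber as follows. This is a simplified illustration, not the authors' feature pipeline: it assumes outgoing calls are given as `(recipient, duration_seconds, tower_id)` tuples, that numbers are written with a leading country code (Togo's is +228), and that tower positions are available as planar coordinates in kilometres. The radius of gyration is computed here as the root-mean-square distance of visited towers from their centroid.

```python
import math

TOGO_COUNTRY_CODE = "+228"


def radius_of_gyration(points):
    """Root-mean-square distance of (x, y) points from their centroid."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)


def subscriber_features(calls, tower_xy):
    """Summarize one subscriber's outgoing calls.

    `calls` is a list of (recipient, duration_seconds, tower_id) tuples;
    `tower_xy` maps a tower id to hypothetical planar coordinates in km.
    """
    visited = [tower_xy[tower] for _, _, tower in calls]
    international = [
        duration for recipient, duration, _ in calls
        if not recipient.startswith(TOGO_COUNTRY_CODE)
    ]
    return {
        "n_calls": len(calls),
        "n_contacts": len({recipient for recipient, _, _ in calls}),
        "n_unique_towers": len({tower for _, _, tower in calls}),
        "radius_of_gyration_km": radius_of_gyration(visited),
        "n_international_calls": len(international),
        "international_seconds": sum(international),
    }


# Made-up towers and calls for illustration.
tower_xy = {"T1": (0.0, 0.0), "T2": (6.0, 8.0)}
calls = [
    ("+22890000001", 60, "T1"),
    ("+22890000002", 30, "T2"),
    ("+33100000000", 120, "T1"),
    ("+22890000001", 45, "T2"),
]
features = subscriber_features(calls, tower_xy)
```

In this example the subscriber alternates between two towers 10 km apart, so the centroid lies midway and the radius of gyration is 5 km.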
Training the algorithm
In the first step, researchers obtained public micro-estimates of the relative wealth of every 2.4 km by 2.4 km region in Togo, which were constructed by applying machine-learning algorithms to high-resolution satellite imagery. These estimates indicate the relative wealth of all households in each small grid cell. They then estimated the average daily consumption of each mobile phone subscriber by applying machine-learning algorithms to mobile phone metadata provided by Togo’s two mobile phone operators. The two large, representative surveys asked participants about their consumption expenditure and asset ownership. These responses were used to measure the relative wealth and/or consumption of each surveyed subscriber and were matched to detailed metadata on that subscriber’s history of phone use. The matched sample was then used to train supervised machine-learning algorithms that predict wealth and consumption from phone use.
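The supervised step, learning from surveyed subscribers and then predicting for everyone else, can be sketched with a deliberately simple learner. The text does not specify which algorithm was used, so the k-nearest-neighbours regressor below is only a stand-in, and the two features and all numbers are made up for illustration.

```python
import math


def fit_knn(features, labels, k=3):
    """Return a predictor that averages the labels of the k nearest training rows.

    `features` is a list of equal-length numeric tuples (one per surveyed
    subscriber) and `labels` is the surveyed consumption for each row.
    Features should be on comparable scales for distances to be meaningful.
    """
    def predict(x):
        nearest = sorted(
            zip(features, labels),
            key=lambda row: math.dist(row[0], x),
        )[:k]
        return sum(label for _, label in nearest) / len(nearest)

    return predict


# Hypothetical training sample: (calls per day, radius of gyration in km)
# paired with surveyed consumption in US$ per day.
train_x = [(1.0, 2.0), (2.0, 3.0), (8.0, 20.0), (9.0, 25.0)]
train_y = [0.9, 1.1, 3.0, 3.4]

predict = fit_knn(train_x, train_y, k=2)
low_use = predict((1.5, 2.5))    # resembles the two low-consumption rows
high_use = predict((8.5, 22.0))  # resembles the two high-consumption rows
```

Any supervised regressor could take the place of `fit_knn`; the point is only that labels exist for the surveyed sample while phone features exist for every subscriber.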
This phone-based approach relies heavily on machine learning to construct a poverty score for each mobile subscriber, where eligibility is a complex function of how each subscriber uses their phone. Compared with an alternative approach that does not use machine learning, but rather targets mobile phone subscribers with the lowest mobile phone expenditures, the machine-learning-based model performs substantially better. For the machine-learning model to be successful, the recency of collected data is an important factor, because an individual’s poverty status can change over time, and the best phone-based predictors of wealth may also change.
Assessing the targeting approach
Researchers evaluated the performance of this new targeting approach that combines machine learning and mobile phone data by comparing it to three counterfactual approaches: a geographic targeting approach that the government piloted in the summer of 2020, in which all individuals are eligible within the poorest prefectures; occupation-based targeting (such as Novissi’s original approach to targeting informal workers); and a method based on phone data without machine learning (using total expenditures on calling and texting as a proxy for wealth). They found that phone-based targeting (area under the curve, AUC = 0.70) outperforms the other feasible methods of targeting rural Novissi aid. As a result, errors of exclusion are lower for the phone-based approach (53%) than for feasible alternatives (59%–78%).
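Two common ways to score a targeting rule are its exclusion error and a rank-based area under the curve (AUC). The sketch below shows one plausible way to compute both; the exact definitions used in the paper may differ. It assumes a fixed budget share, treats the truly poor as the same share ranked by surveyed consumption, and uses made-up numbers.

```python
def bottom_share(values, share):
    """Indices of the lowest `share` fraction of `values`."""
    n = int(len(values) * share)
    order = sorted(range(len(values)), key=lambda i: values[i])
    return set(order[:n])


def exclusion_error(true_consumption, predicted, share):
    """Fraction of the truly poor who are not selected by the targeting rule."""
    truly_poor = bottom_share(true_consumption, share)
    targeted = bottom_share(predicted, share)
    return len(truly_poor - targeted) / len(truly_poor)


def auc(is_poor, predicted):
    """Probability that a random poor person has a lower predicted value
    than a random non-poor person, with ties counted as one half."""
    poor = [p for flag, p in zip(is_poor, predicted) if flag]
    non_poor = [p for flag, p in zip(is_poor, predicted) if not flag]
    wins = sum((p < q) + 0.5 * (p == q) for p in poor for q in non_poor)
    return wins / (len(poor) * len(non_poor))


# Hypothetical surveyed and predicted consumption (US$ per day) for six people.
true_consumption = [0.8, 1.0, 1.5, 2.0, 3.0, 4.0]
predicted = [1.0, 2.2, 1.2, 1.4, 3.1, 3.9]

err = exclusion_error(true_consumption, predicted, 0.5)
is_poor = [i in bottom_share(true_consumption, 0.5) for i in range(6)]
score = auc(is_poor, predicted)
```

Here one of the three truly poor people is missed, giving an exclusion error of one third, and the poor are ranked below the non-poor in eight of nine pairs.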
When researchers simulated a hypothetical national anti-poverty program, their phone-based targeting approach also outperformed most other methods at targeting the poorest participants. The one exception in this hypothetical scaled-up program is occupation-based targeting, where focusing on the poorest occupational category (agricultural workers) slightly outperforms phone-based targeting. This indicates that the phone-based targeting approach was more effective in the actual rural Novissi program than it would be in a hypothetical nationwide program.
This new method of targeting the poorest individuals in need of financial aid may provide a necessary and welcome addition to the toolkit of humanitarian agencies, particularly during times of crisis. For the areas of the globe and populations that are most in need of assistance during crises such as the COVID-19 pandemic, public data sources are often incomplete or out of date. Aiken et al. demonstrated how mobile phone data combined with machine learning can give a clearer, real-time picture of where humanitarian assistance can be most impactful.
References
Aiken, E., Bellue, S., Karlan, D., Udry, C. and Blumenstock, J. (2022). Machine learning and phone data can improve targeting of humanitarian aid. Nature, 603(7903), pp. 864–870.