Mapping urban poverty from space
Cities act as escalators out of poverty, as Edward Glaeser demonstrates. This very function draws poorer immigrants into urban regions in search of jobs or access to better public services. Yet most social assistance programs in developing countries have traditionally concentrated on the rural poor, with the bulk of past efforts going into proper identification and targeting. The pandemic has shone a spotlight on the plight of the urban poor. When it hit, harsh lockdowns exposed the urban poor to sudden shocks to their income and consumption. Governments in several countries were caught unprepared, with many lacking data on important questions: Who are the urban poor? Where do they live? How best to find them and help them?
Finding the poor and those vulnerable to poverty is never an easy exercise. However, in response to the COVID-19 crisis, the World Bank began using several innovative techniques to collect data on the urban poor, private establishments, household monitoring, and vaccine deployment. Data collection is especially challenging in densely populated urban regions, where informal employment is often widespread and access to public services is an important element of consumption. Additionally, data requirements can be burdensome when swift and targeted action is vital. Even when recent data exist, their use can be hindered by insufficient geographical disaggregation (for example, sampling only at the regional or district level) or by constraints on access (for example, the confidentiality of census data).
Joining forces with machines
By showing a machine images of poor settlements seen from space, and then iteratively testing its predictions to reduce the scope for error, we can teach it to recognize poverty within cities. In data science terms, this means designing a supervised machine learning algorithm that relies on training data. Such data consist of samples collected by local experts through direct assessment on the ground, through analysis of very high-resolution satellite imagery, or both. The algorithm then builds a model by analyzing the “signature” of these samples. That model then “predicts” the attributes of the remaining data based on the classes used in the sample data.
Figure 1 describes this process visually. It begins with the collection of data (step 1) consisting of polygons delimiting certain types of urban settlements: residential (poor/informal, middle, or high-income), commercial, or industrial. These are analyzed by the algorithm (step 2) to build a model that predicts (step 3) the urban settlement typology for the rest of the city. Of course, different elements of geospatial data can be fed into the algorithm, including, for instance, polygons of building footprints, characteristics of transport networks, tree cover, public spaces, waste dumpsites, and the like.
Figure 1. Supervised machine learning workflow
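For readers who want to see the three steps in miniature, here is a rough Python sketch. It is not the algorithm used in the work described here. It swaps in a deliberately simple nearest-centroid classifier, and everything in it is an assumption made for illustration: the three features (building density, mean footprint area, tree cover), the feature values, the class names, and the function names `fit` and `predict`.

```python
from math import dist

# Step 1: hypothetical training samples, one per expert-labelled polygon.
# All numbers are made up for illustration. Assumed features:
# (building density 0-1, mean footprint area in thousands of m^2, tree cover 0-1)
TRAINING = [
    ((0.90, 0.03, 0.02), "informal"),
    ((0.85, 0.04, 0.05), "informal"),
    ((0.40, 0.25, 0.30), "high_income"),
    ((0.35, 0.30, 0.35), "high_income"),
    ((0.60, 0.90, 0.05), "industrial"),
    ((0.55, 1.00, 0.03), "industrial"),
]


def fit(samples):
    """Step 2: build a model holding one mean feature vector per class."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Step 3: assign the class whose mean vector ("signature") is closest."""
    return min(model, key=lambda label: dist(model[label], features))


model = fit(TRAINING)

# Polygons from the rest of the city that no expert has labelled (also made up).
unlabelled = [(0.88, 0.05, 0.03), (0.38, 0.28, 0.33), (0.58, 0.95, 0.04)]
predictions = [predict(model, features) for features in unlabelled]
print(predictions)
```

A real application would use many more samples and features, a more capable model, and a held-out set of labelled polygons to measure error before trusting any prediction.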