Leveraging Unsupervised Machine Learning to Examine Women's Vulnerability to Climate Change
Abstract
A leading source of income risk in Malawi is the occurrence of droughts. Liquidity-constrained households engage in early-age marital contracts as a form of household consumption smoothing. This can force young women to transition toward low-return domestic activities rather than high-return schooling activities and can raise fertility rates, affecting their ability to engage in paid employment. We estimate the effect of drought on early-age marital decisions, schooling, and fertility rates in Malawi to inform expectations of future labor force losses attributable to climate change. The impact of drought exposure is based on difference-in-differences estimates that rely on big data and machine learning techniques. We use administrative data from the 2008 and 2018 Malawi censuses, and treatment status is determined by a k-means machine learning algorithm. The algorithm classifies drought based on previous exposure to rainfall anomalies derived from the Climate Hazards Group InfraRed Precipitation with Station (CHIRPS) data. We compare climate impact estimates based on the drought indicators established objectively from the k-means algorithm with those based on more traditional measures implemented in the literature. Robustness checks and falsification exercises are used to support the interpretation of the drought impacts as causal. We find that young women exposed to a drought five years prior to their interview were 5 percentage points more likely to be married by 18 than those living in non-drought areas. Increases in early-age marriage coincide with a rise in fertility (3 percentage points) and a decline in the completion of primary (2 percentage points) and secondary school (1 percentage point), which can affect labor force participation rates. We use these figures to project that the increased odds of early-age marriage by 2100 under the extreme RCP scenario will induce 3.3 million women to exit the labor force.
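The two-step logic described above (cluster areas by rainfall anomaly to assign treatment, then compare changes in outcomes across census waves) can be illustrated with a minimal sketch. It is not the specification used in the paper: the district labels, anomaly values, person records, the choice of two clusters, the rule that the driest cluster is the drought group, and the helper names kmeans_1d and did are all hypothetical, and the toy numbers are not drawn from the census or CHIRPS data.

```python
from statistics import mean


def kmeans_1d(values, k=2, max_iter=100):
    """Lloyd's k-means on scalars (k >= 2). Returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def did(records):
    """2x2 difference-in-differences from (treated, post, outcome) records."""
    def cell(t, p):
        return mean(y for tr, po, y in records if tr == t and po == p)
    return (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))


# Hypothetical standardized rainfall anomalies by district (toy values).
anomalies = {"A": -1.8, "B": -1.5, "C": 0.2, "D": 0.4, "E": -0.1}
names = list(anomalies)
centroids, labels = kmeans_1d([anomalies[n] for n in names], k=2)

# Assumption: the cluster with the lowest mean anomaly is the drought group.
dry = centroids.index(min(centroids))
drought = {n for n, lab in zip(names, labels) if lab == dry}

# Toy person records: (district, post-period flag, married-by-18 indicator).
people = [
    ("A", 0, 0), ("A", 0, 0), ("B", 0, 1), ("B", 0, 0),
    ("A", 1, 1), ("B", 1, 0), ("A", 1, 1), ("B", 1, 0),
    ("C", 0, 0), ("D", 0, 1), ("E", 0, 0), ("C", 0, 0),
    ("C", 1, 0), ("D", 1, 1), ("E", 1, 0), ("D", 1, 0),
]
records = [(int(d in drought), post, y) for d, post, y in people]
effect = did(records)
print(sorted(drought), effect)
```

The sketch uses raw cell means to keep the arithmetic visible. An applied analysis would more plausibly estimate the same contrast by regression with controls and clustered standard errors, which this example does not attempt.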