Skip to content

Latest commit

 

History

History
332 lines (253 loc) · 11 KB

example_nutrition.org

File metadata and controls

332 lines (253 loc) · 11 KB

Nutrient Demands

Introduction

In our last project we used data to estimate systems of food demand using different datasets. An output from that project was as set of cfe.Regression objects; these bundle together both data and the results from the demand system estimation, and can be used for prediction as well.

Here we’ll explore some of the uses of the cfe.Regression class, using an instance created previously (as in Project 3).

After having estimated a demand system using data from our favorite country, we can imagine different counterfactual scenarios. What if prices were different? What if we give a cash transfer to a household? What if school fees reduce the budget for food? What are the consequences of any of these for diet & nutrition?

If you don’t already have the latest version of the CFEDemands package installed, grab it, along with some dependencies:

#!pip install -r requirements.txt
import pandas as pd
import cfe.regression as rgsn

Data

We’ll get data from two places. First, basic data, including a food conversion table and recommended daily intakes table can be found in a google spreadsheet.

Here are addresses of google sheets for different dataframes for the case of Uganda:

InputFiles = {'Expenditures':('1yVLriVpo7KGUXvR3hq_n53XpXlD5NmLaH1oOMZyV0gQ','Expenditures (2019-20)'),
              'Prices':('1yVLriVpo7KGUXvR3hq_n53XpXlD5NmLaH1oOMZyV0gQ','Prices'),
              'HH Characteristics':('1yVLriVpo7KGUXvR3hq_n53XpXlD5NmLaH1oOMZyV0gQ','HH Characteristics'),
              'FCT':('1yVLriVpo7KGUXvR3hq_n53XpXlD5NmLaH1oOMZyV0gQ','FCT'),
              'RDI':('1yVLriVpo7KGUXvR3hq_n53XpXlD5NmLaH1oOMZyV0gQ','RDI'),}

Prices, FCT, RDI

from eep153_tools.sheets import read_sheets
import numpy as np
import pandas as pd

def get_clean_sheet(key,sheet=None):

    df = read_sheets(key,sheet=sheet)
    df.columns = [c.strip() for c in df.columns.tolist()]

    df = df.loc[:,~df.columns.duplicated(keep='first')]

    df = df.drop([col for col in df.columns if col.startswith('Unnamed')], axis=1)

    df = df.loc[~df.index.duplicated(), :]

    return df

# Get prices
p = get_clean_sheet(InputFiles['Prices'][0],
                    sheet=InputFiles['Prices'][1])

if 'm' not in p.columns:  # Supply "market" indicator if missing
    p['m'] = 1

p = p.set_index(['t','m'])
p.columns.name = 'j'

p = p.apply(lambda x: pd.to_numeric(x,errors='coerce'))
p = p.replace(0,np.nan)

fct = get_clean_sheet(InputFiles['FCT'][0],
                    sheet=InputFiles['FCT'][1])

fct = fct.set_index('j')
fct.columns.name = 'n'

fct = fct.apply(lambda x: pd.to_numeric(x,errors='coerce'))

################## RDI, if available (consider using US) #####################
rdi = get_clean_sheet(InputFiles['RDI'][0],
                    sheet=InputFiles['RDI'][1])
rdi = rdi.set_index('n')
rdi.columns.name = 'k'

Pre-estimated Demand Systems

An instance r of cfe.Regression can be made persistent with r.to_pickle('my_result.pickle'), which saves the instance “on disk”, and can be loaded using cfe.regression.read_pickle. We use this method below to load data and demand system previously estimated for Uganda:

r = rgsn.read_pickle('uganda_2019-20.pickle')  # Assumes you've already set this up e.g., in Project 3

Reference Prices

Choose reference prices. Here we’ll choose a particular year, and average prices across markets. If you wanted to focus on particular market you’d do this differently.

# Reference prices chosen from a particular time; average across place.
# These are prices per kilogram:
pbar = p.xs('2019-20',level='t').mean()
pbar = pbar[r.beta.index] # Only use prices for goods we can estimate

Budgets

Get food budget for all households, then find median budget:

import numpy as np

xhat = r.predicted_expenditures()

# Total food expenditures per household
xbar = xhat.groupby(['i','t','m']).sum()

# Reference budget
xref = xbar.quantile(0.5)  # Household at 0.5 quantile is median

Food Quantities

Get quantities of food by dividing expenditures by prices:

qhat = (xhat.unstack('j')/pbar).dropna(how='all')

# Drop missing columns
qhat = qhat.loc[:,qhat.count()>0]

qhat

Finally, define a function to change a single price in the vector $p$:

def my_prices(p0,p=pbar,j='Millet'):
    """
    Change price of jth good to p0, holding other prices fixed.
    """
    p = p.copy()
    p.loc[j] = p0
    return p

Demands

Demand functions

import matplotlib.pyplot as plt
%matplotlib notebook

use = 'Millet'  # Good we want demand curve for

# Vary prices from 50% to 200% of reference.
scale = np.linspace(.5,2,20)

# Demand for Millet for household at median budget
plt.plot([r.demands(xref,my_prices(pbar[use]*s,pbar))[use] for s in scale],scale)

# Demand for Millet for household at 25% percentile
plt.plot([r.demands(xbar.quantile(0.25),my_prices(pbar[use]*s,pbar))[use] for s in scale],scale)

# Demand for Millet for household at 75% percentile
plt.plot([r.demands(xbar.quantile(0.75),my_prices(pbar[use]*s,pbar))[use] for s in scale],scale)

plt.ylabel(f"Price (relative to base of {pbar[use]:.2f})")
plt.xlabel(f"Quantities of {use} Demanded")

Engel Curves

fig,ax = plt.subplots()

scale = np.geomspace(.01,10,50)

ax.plot(np.log(scale*xref),[r.expenditures(s*xref,pbar)/(s*xref) for s in scale])
ax.set_xlabel(f'log budget (relative to base of {xref:.0f})')
ax.set_ylabel(f'Expenditure share')
ax.set_title('Engel Curves')

Mapping to Nutrients

We’ve seen how to map prices and budgets into vectors of consumption quantities using cfe.Regression.demands. Next we want to think about how to map these into bundles of nutrients. The information needed for the mapping comes from a “Food Conversion Table” (or database, such as the USDA Food Data Central). We’ve already grabbed an FCT, let’s take a look:

fct

We need the index of the Food Conversion Table (FCT) to match up with the index of the vector of quantities demanded. To manage this we make use of the align method for pd.DataFrames:

# Create a new FCT and vector of consumption that only share rows in common:
fct0,c0 = fct.align(qhat.T,axis=0,join='inner')
print(fct0.index)

Now, since rows of fct0 and c0 match, we can obtain nutritional outcomes from the inner (or dot, or matrix) product of the transposed fct0 and c0:

# The @ operator means matrix multiply
N = fct0.T@c0

N  #NB: Uganda quantities are for previous 7 days

Of course, since we can compute the nutritional content of a vector of consumption goods c0, we can also use our demand functions to compute nutrition as a function of prices and budget.

def nutrient_demand(x,p):
    c = r.demands(x,p)
    fct0,c0 = fct.align(c,axis=0,join='inner')
    N = fct0.T@c0

    N = N.loc[~N.index.duplicated()]
    
    return N

With this nutrient_demand function in hand, we can see how nutrient outcomes vary with budget, given prices:

import numpy as np
import matplotlib.pyplot as plt

X = np.linspace(xref/5,xref*5,50)

UseNutrients = ['Protein','Energy','Iron','Calcium','Vitamin C']

df = pd.concat({myx:np.log(nutrient_demand(myx,pbar))[UseNutrients] for myx in X},axis=1).T
ax = df.plot()

ax.set_xlabel('log budget')
ax.set_ylabel('log nutrient')

Now how does nutrition vary with prices?

USE_GOOD = 'Oranges'

scale = np.geomspace(.01,10,50)

ndf = pd.DataFrame({s:np.log(nutrient_demand(xref/2,my_prices(pbar[USE_GOOD]*s,j=USE_GOOD)))[UseNutrients] for s in scale}).T

ax = ndf.plot()

ax.set_xlabel('log price')
ax.set_ylabel('log nutrient')

Nutritional Needs of Households

Our data on demand and nutrients is at the household level; we can’t directly compare household level nutrition with individual level requirements. What we can do is add up minimum individual requirements, and see whether household total exceed these. This isn’t a guarantee that all individuals have adequate nutrition (since the way food is allocated in the household might be quite unequal, or unrelated to individual requirements), but it is necessary if all individuals are to have adequate nutrition.

For the average household in our data, the number of different kinds of people can be computed by averaging over households:

# In first round, averaged over households and villages
dbar = r.d.mean().iloc[:-2]

Now, the inner/dot/matrix product between dbar and the rdi DataFrame of requirements will give us minimum requirements for the average household:

# This matrix product gives minimum nutrient requirements for
# the average household
hh_rdi = rdi.replace('',0)@dbar

hh_rdi

Nutritional Adequacy of Food Demands

Since we can trace out demands for nutrients as a function of $(x,p)$, and we’ve computed minimum nutritional requirements for the average household, we can normalize nutritional intake to check the adequacy of diet for a household with counts of different kinds of people given by z.

def nutrient_adequacy_ratio(x,p,d,rdi=rdi,days=7):
    hh_rdi = rdi.replace('',0)@d*days

    return nutrient_demand(x,p)/hh_rdi

In terms of normalized nutrients, any household with more than one unit of any given nutrient (or zero in logs) will be consuming a minimally adequate level of the nutrient; below this level there’s clearly nutritional inadequacy. For this reason the ratio of actual nutrients to required nutrients is termed the “nutrient adequacy ratio,” or NAR.

X = np.geomspace(.01*xref,2*xref,100)

pd.DataFrame({x:np.log(nutrient_adequacy_ratio(x,pbar,dbar))[UseNutrients] for x in X}).T.plot()
plt.legend(UseNutrients)
plt.xlabel('budget')
plt.ylabel('log nutrient adequacy ratio')
plt.axhline(0)
plt.axvline(xref)

As before, we can also vary relative prices. Here we trace out nutritional adequacy varying the price of a single good:

scale = np.geomspace(.01,2,50)

ndf = pd.DataFrame({s*pbar[USE_GOOD]:np.log(nutrient_adequacy_ratio(xref/4,my_prices(pbar[USE_GOOD]*s,j=USE_GOOD),dbar))[UseNutrients] for s in scale}).T

fig,ax = plt.subplots()
ax.plot(ndf['Vitamin C'],ndf.index)
ax.axhline(pbar[USE_GOOD])
ax.axvline(0)

ax.set_ylabel('Price')
ax.set_xlabel('log nutrient adequacy ratio')