Looking at ideal low-carb fat/protein ratio using Jupyter Notebook

In this blog post I want to do two things that I haven't done for a long time. I am going to blog about nutrition, and I'm going to write this post using Jupyter Notebook and Python.

This post is meant both to be informational about macros from a nutritional perspective and to walk you through exploring the theory with Python and Jupyter Notebook, in a way that makes it a little bit of a tutorial.

In order to use Jupyter for the things we want to do today, the first thing we need to do is import a few Python libraries.

import matplotlib
import numpy as np
import pandas as pd
import math
%matplotlib inline

The numpy library is an amazing Python library for working with arrays.
Pandas builds on numpy by providing what is basically an in-code spreadsheet.
Think of pandas, especially when running inside Jupyter, as Excel on steroids.

Matplotlib is the library we use to plot the data and visualize our findings.

Now that we have our base requirements, it is time to look at some nutrition equations and put them into Python functions.

We are going to start off by looking at a few well-known formulas from nutrition. The first one is the Katch-McArdle equation for daily energy expenditure.

It takes two variables: the lean body mass of the person, which is the total body mass minus the weight of the adipose tissue, and a so-called activity factor that indicates how active a person is.

how active	activity factor
sedentary	1.2
light activity	1.375
moderate activity	1.55
high activity	1.725
extreme activity	1.9
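For use in code, the table above can be kept in a small lookup. This is only a sketch: the name ACTIVITY_FACTORS and the exact key spellings are my own choice for illustration, not something from a library.

```python
# Activity factors copied from the table above.
# ACTIVITY_FACTORS is a hypothetical name chosen for this illustration.
ACTIVITY_FACTORS = {
    "sedentary": 1.2,
    "light activity": 1.375,
    "moderate activity": 1.55,
    "high activity": 1.725,
    "extreme activity": 1.9,
}

# Look up the factor for one activity level.
print(ACTIVITY_FACTORS["moderate activity"])
```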

The second equation uses a simplified version of one of two often-used rules of thumb.

1 gram of protein per lb of lean body mass
2.5 grams of protein per kg of lean body mass

Other equations are sometimes used that are based on total body mass, with a lower number for protein intake per kg, ranging from as low as 0.8 gram per kg (though that number usually comes from people with an animal rights or vegan background) up to 2 grams per kg.

For this blog, knowing that the energy content of 1 gram of protein equals 4 kcal, we are going with the second statement for our second equation.

The ideal total protein-sourced energy intake is 10 kcal per kg of lean body mass.
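The arithmetic behind that number is short enough to check by hand, but here it is as a small sketch. The constant names are made up for this example. The conversion of 2.20462 lb per kg is the standard one, and it shows that the two rules of thumb are close but not identical.

```python
# Hypothetical constant names, chosen for this illustration only.
KCAL_PER_GRAM_PROTEIN = 4.0       # energy content of protein used in this post
PROTEIN_GRAMS_PER_KG_LBM = 2.5    # the "per kg" rule of thumb
LB_PER_KG = 2.20462               # pounds in one kilogram

# Protein-sourced energy per kg of lean body mass.
protein_kcal_per_kg = PROTEIN_GRAMS_PER_KG_LBM * KCAL_PER_GRAM_PROTEIN
print(protein_kcal_per_kg)

# The "1 gram per lb" rule expressed per kg, for comparison with 2.5.
grams_per_kg_from_lb_rule = 1.0 * LB_PER_KG
print(round(grams_per_kg_from_lb_rule, 2))
```

So the per-lb rule works out at roughly 2.2 grams per kg, a bit below the 2.5 grams per kg that the rest of this post uses.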

Now we need to pick sides. Do we want to go low fat, low carb, or somewhere in between? Looking at developments in nutrition in recent decades, and ignoring the animal-rights-tainted vegans for a bit, we are going for low carb for this blog. There are different definitions of what constitutes low carb, and we want to both take the concept seriously and steer away from the absolute-zero idea, which might be a bit extreme. We pick the often-cited number of 50 grams, or 200 kcal, from carbohydrates.

Then finally, fat. If we wanted to be complete we would add alcohol to the list because it's another source of energy for many of us, but to keep this blog simple, we assume that all calories that come from neither protein nor carbohydrates come from fat.

So let's express this in Python.

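def katch_mcardle(lbm, activity):
    # Daily energy expenditure in kcal from lean body mass (kg) and activity factor.
    # Note: Katch-McArdle is usually published with a coefficient of 21.6;
    # this post uses 23.6, and every output below was computed with 23.6.
    return (370 + 23.6 * lbm) * activity

def protein_kcal(lbm, activity):
    # 2.5 g of protein per kg of lbm at 4 kcal per gram. The activity argument
    # is unused but kept so all macro functions share one signature.
    return 10*lbm

def carb_kcal(lbm, activity):
    # Fixed low-carb budget: 50 g of carbohydrates at 4 kcal per gram.
    return 200

def fat_kcal(lbm, activity):
    # Everything that is not protein or carbohydrates comes from fat.
    return (katch_mcardle(lbm, activity) -
            protein_kcal(lbm, activity) -
            carb_kcal(lbm, activity))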
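To get a feel for these functions, here is a small sketch that prints the calorie split for one example person. The helper name macro_kcal and the choice of a 70 kg sedentary person are my own picks for illustration; the four functions are repeated so the snippet runs on its own.

```python
def katch_mcardle(lbm, activity):
    return (370 + 23.6 * lbm) * activity

def protein_kcal(lbm, activity):
    return 10 * lbm

def carb_kcal(lbm, activity):
    return 200

def fat_kcal(lbm, activity):
    return (katch_mcardle(lbm, activity) -
            protein_kcal(lbm, activity) -
            carb_kcal(lbm, activity))

def macro_kcal(lbm, activity):
    # Hypothetical helper: collect the kcal per macronutrient in one dict.
    return {
        "total": katch_mcardle(lbm, activity),
        "protein": protein_kcal(lbm, activity),
        "carbs": carb_kcal(lbm, activity),
        "fat": fat_kcal(lbm, activity),
    }

# Example person (an assumption for this sketch): 70 kg lbm, sedentary.
example = macro_kcal(70, 1.2)
for name, kcal in example.items():
    print(f"{name:8s} {kcal:7.1f} kcal")
```

For this person the total comes out at about 2426 kcal, of which 700 kcal comes from protein, 200 kcal from carbohydrates and the remaining 1526 kcal or so from fat.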

Now let's look a little bit into numpy, pandas and the possibilities for plotting data.
We start off with a simple function that calculates the ideal percentage of calories that should come from protein.

def protein_pct_kcal(lbm, activity):
    return 100 * protein_kcal(lbm, activity) / katch_mcardle(lbm, activity)

Let's demonstrate. We take a lean body mass of 70 kg for a sedentary person (activity 1.2) and look at what we get.

protein_pct_kcal(70,1.2)

28.84932410154962

The protein_pct_kcal function takes two arguments, the lean body mass and the activity factor.
Let's say we need a function that takes just the lean body mass as an argument. We can make such a function using Python's lambda expressions.

sedentary_pct_kcal = lambda x : protein_pct_kcal(x, 1.2)
sedentary_pct_kcal(70)

28.84932410154962
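As an aside, the standard library offers another way to pin one argument: functools.partial. The sketch below is an alternative to the lambda, not what the notebook itself uses, and the function definitions are repeated so it runs on its own.

```python
from functools import partial

def katch_mcardle(lbm, activity):
    return (370 + 23.6 * lbm) * activity

def protein_kcal(lbm, activity):
    return 10 * lbm

def protein_pct_kcal(lbm, activity):
    return 100 * protein_kcal(lbm, activity) / katch_mcardle(lbm, activity)

# Fix the activity factor at 1.2; the result takes only lbm.
sedentary_pct_kcal = partial(protein_pct_kcal, activity=1.2)
print(sedentary_pct_kcal(70))
```

Both forms behave the same here. The partial version has the small advantage that the fixed argument stays visible as sedentary_pct_kcal.keywords.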

It is good to be able to get the result of this function for just one value of lbm. But when we want to plot a graph, we want to do so for a collection of values.
For this, numpy offers two things: linspace, with which we can define a range of equally spaced input values, and vectorize, which turns a function that takes a scalar argument into one that takes an array as input and calculates all the results at once.

We start off by defining a linspace ranging from 20 kg to 90 kg for our lean body mass.

lbm = np.linspace(20,90,71)
lbm

array([20., 21., 22., 23., 24., 25., 26., 27., 28., 29., 30., 31., 32.,
       33., 34., 35., 36., 37., 38., 39., 40., 41., 42., 43., 44., 45.,
       46., 47., 48., 49., 50., 51., 52., 53., 54., 55., 56., 57., 58.,
       59., 60., 61., 62., 63., 64., 65., 66., 67., 68., 69., 70., 71.,
       72., 73., 74., 75., 76., 77., 78., 79., 80., 81., 82., 83., 84.,
       85., 86., 87., 88., 89., 90.])
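Why 71 points? Because linspace includes both endpoints, going from 20 to 90 in steps of 1 kg takes 70 steps and therefore 71 values. A quick sketch to confirm this, using the retstep option of linspace and a comparison with np.arange (the comparison is my own addition):

```python
import numpy as np

# retstep=True also returns the spacing between consecutive samples.
lbm, step = np.linspace(20, 90, 71, retstep=True)
print(len(lbm), step)

# The same grid built with arange, whose stop value is exclusive.
same_grid = np.allclose(lbm, np.arange(20, 91, dtype=float))
print(same_grid)
```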

Finally we vectorize the function and invoke it with our linspace lbm array.

vectorized_sedentary_pct_kcal = np.vectorize(lambda x : protein_pct_kcal(x, 1.2))
vectorized_sedentary_pct_kcal(lbm)

array([19.79414093, 20.21719039, 20.61778378, 20.99766287, 21.35839385,
       21.70138889, 22.02792463, 22.33915806, 22.63614021, 22.91982802,
       23.19109462, 23.45073832, 23.69949046, 23.93802228, 24.16695098,
       24.38684504, 24.59822893, 24.8015873 , 24.9973687 , 25.18598884,
       25.36783359, 25.54326156, 25.71260652, 25.87617947, 26.03427057,
       26.18715084, 26.33507374, 26.47827655, 26.61698163, 26.75139762,
       26.88172043, 27.00813421, 27.13081225, 27.24991774, 27.36560448,
       27.47801759, 27.58729408, 27.69356343, 27.79694809, 27.89756393,
       27.99552072, 28.09092249, 28.18386792, 28.27445067, 28.3627597 ,
       28.44887955, 28.53289064, 28.61486948, 28.69488893, 28.77301841,
       28.8493241 , 28.92386912, 28.99671371, 29.06791539, 29.13752914,
       29.20560748, 29.27220065, 29.33735674, 29.40112177, 29.4635398 ,
       29.52465309, 29.5845021 , 29.64312569, 29.70056109, 29.75684407,
       29.81200898, 29.86608879, 29.91911522, 29.97111874, 30.02212867,
       30.07217322])
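A side note on np.vectorize: according to the numpy documentation it is mainly a convenience and is essentially a for loop under the hood. Because our functions consist of plain arithmetic, they also accept a numpy array directly. The sketch below, with the functions repeated so it runs on its own, checks that both routes give the same numbers.

```python
import numpy as np

def katch_mcardle(lbm, activity):
    return (370 + 23.6 * lbm) * activity

def protein_kcal(lbm, activity):
    return 10 * lbm

def protein_pct_kcal(lbm, activity):
    return 100 * protein_kcal(lbm, activity) / katch_mcardle(lbm, activity)

lbm = np.linspace(20, 90, 71)

# Route 1: wrap the scalar function with np.vectorize, as in the text above.
via_vectorize = np.vectorize(lambda x: protein_pct_kcal(x, 1.2))(lbm)

# Route 2: pass the array straight in; numpy applies the arithmetic elementwise.
direct = protein_pct_kcal(lbm, 1.2)

print(np.allclose(via_vectorize, direct))
```

np.vectorize remains handy for functions that contain Python-level branching, such as an if statement on the input value, which does not work on a whole array.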

So far we have been using raw numpy arrays. For our purpose, though, it is useful to switch to pandas dataframes.
Think of a dataframe as a spreadsheet. Let's put our lbm linspace into one column of this spreadsheet and then put the result of the vectorized version of our function into a second column.

df1 = pd.DataFrame()
df1["lbm"] = np.linspace(20,90,71)
df1["sedentary"] = np.vectorize(lambda x : protein_pct_kcal(x, 1.2))(df1["lbm"])
df1.head(10)

	lbm	sedentary
0	20.0	19.794141
1	21.0	20.217190
2	22.0	20.617784
3	23.0	20.997663
4	24.0	21.358394
5	25.0	21.701389
6	26.0	22.027925
7	27.0	22.339158
8	28.0	22.636140
9	29.0	22.919828

df1["light"] = np.vectorize(lambda x : protein_pct_kcal(x, 1.375))(df1["lbm"])
df1["moderate"] = np.vectorize(lambda x : protein_pct_kcal(x, 1.55))(df1["lbm"])
df1["very"] = np.vectorize(lambda x : protein_pct_kcal(x, 1.725))(df1["lbm"])
df1["extremely"] = np.vectorize(lambda x : protein_pct_kcal(x, 1.9))(df1["lbm"])
df1.head(10)

	lbm	sedentary	light	moderate	very	extremely
0	20.0	19.794141	17.274887	15.324496	13.769837	12.501563
1	21.0	20.217190	17.644093	15.652018	14.064132	12.768752
2	22.0	20.617784	17.993702	15.962155	14.342806	13.021758
3	23.0	20.997663	18.325233	16.256255	14.607070	13.261682
4	24.0	21.358394	18.640053	16.535531	14.858013	13.489512
5	25.0	21.701389	18.939394	16.801075	15.096618	13.706140
6	26.0	22.027925	19.224371	17.053877	15.323774	13.912373
7	27.0	22.339158	19.495992	17.294832	15.540284	14.108942
8	28.0	22.636140	19.755177	17.524754	15.746880	14.296510
9	29.0	22.919828	20.002759	17.744383	15.944228	14.475681
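The five near-identical lines that fill the columns can also be written as a loop over a dictionary of activity factors. This is a sketch of an alternative, not the notebook's own code: the name activity_factors is made up for the example, and the function is applied to the column directly because it is plain arithmetic.

```python
import numpy as np
import pandas as pd

def katch_mcardle(lbm, activity):
    return (370 + 23.6 * lbm) * activity

def protein_kcal(lbm, activity):
    return 10 * lbm

def protein_pct_kcal(lbm, activity):
    return 100 * protein_kcal(lbm, activity) / katch_mcardle(lbm, activity)

# Hypothetical lookup: column name -> activity factor.
activity_factors = {
    "sedentary": 1.2,
    "light": 1.375,
    "moderate": 1.55,
    "very": 1.725,
    "extremely": 1.9,
}

df = pd.DataFrame({"lbm": np.linspace(20, 90, 71)})
for name, factor in activity_factors.items():
    # The function is pure arithmetic, so it works on the whole column at once.
    df[name] = protein_pct_kcal(df["lbm"], factor)

print(df.head(3))
```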

Looks pretty good already, but let's make lbm our row index.

df1.set_index("lbm", inplace=True)
df1.head(10)

	sedentary	light	moderate	very	extremely
lbm
20.0	19.794141	17.274887	15.324496	13.769837	12.501563
21.0	20.217190	17.644093	15.652018	14.064132	12.768752
22.0	20.617784	17.993702	15.962155	14.342806	13.021758
23.0	20.997663	18.325233	16.256255	14.607070	13.261682
24.0	21.358394	18.640053	16.535531	14.858013	13.489512
25.0	21.701389	18.939394	16.801075	15.096618	13.706140
26.0	22.027925	19.224371	17.053877	15.323774	13.912373
27.0	22.339158	19.495992	17.294832	15.540284	14.108942
28.0	22.636140	19.755177	17.524754	15.746880	14.296510
29.0	22.919828	20.002759	17.744383	15.944228	14.475681

Now it's time to call in matplotlib.

df1.plot(grid=True)

<AxesSubplot:xlabel='lbm'>

We see that extremely active people with a really low lean body mass should get by on less than 15% of calories from protein, while sedentary people with an extremely high lean body mass should consume double that percentage. It is clear that there is no one-size-fits-all here.
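The plot call above relies on defaults, so the y-axis carries no label. The plot method of a dataframe returns a matplotlib Axes object, and that object can be used to add labels. The sketch below shows one way to do that. The label texts are my own wording, and the Agg backend line is only there so the snippet also runs as a plain script outside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; leave this line out in Jupyter
import numpy as np
import pandas as pd

def katch_mcardle(lbm, activity):
    return (370 + 23.6 * lbm) * activity

def protein_kcal(lbm, activity):
    return 10 * lbm

def protein_pct_kcal(lbm, activity):
    return 100 * protein_kcal(lbm, activity) / katch_mcardle(lbm, activity)

factors = {"sedentary": 1.2, "light": 1.375, "moderate": 1.55,
           "very": 1.725, "extremely": 1.9}
lbm = np.linspace(20, 90, 71)
df = pd.DataFrame({name: protein_pct_kcal(lbm, f) for name, f in factors.items()},
                  index=pd.Index(lbm, name="lbm"))

# DataFrame.plot returns the matplotlib Axes it drew on.
ax = df.plot(grid=True)
ax.set_xlabel("lean body mass (kg)")
ax.set_ylabel("protein (% of daily kcal)")
ax.set_title("Protein share of energy by activity level")
```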

We all know counting calories is a drag, and many claim it's a bad idea, and it's hard not to agree. Let's take one step towards not counting calories by, for now, just counting grams instead.

Both protein and carbohydrates have roughly 4 kcal per gram, and fat has about 9 kcal per gram.

def protein_grams(lbm, activity):
    return protein_kcal(lbm, activity) / 4.0

def carb_grams(lbm, activity):
    return carb_kcal(lbm, activity) / 4.0

def fat_grams(lbm, activity):
    return fat_kcal(lbm, activity) / 9.0
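One thing these functions make visible is that, in this model, the protein and carbohydrate grams do not depend on activity at all, so every extra calorie burned is covered by fat. Here is a sketch that prints the grams for one example person at each activity level; the 70 kg lean body mass and the factors dictionary are assumptions for the illustration, and the functions are repeated so it runs on its own.

```python
def katch_mcardle(lbm, activity):
    return (370 + 23.6 * lbm) * activity

def protein_kcal(lbm, activity):
    return 10 * lbm

def carb_kcal(lbm, activity):
    return 200

def fat_kcal(lbm, activity):
    return (katch_mcardle(lbm, activity) -
            protein_kcal(lbm, activity) -
            carb_kcal(lbm, activity))

def protein_grams(lbm, activity):
    return protein_kcal(lbm, activity) / 4.0

def carb_grams(lbm, activity):
    return carb_kcal(lbm, activity) / 4.0

def fat_grams(lbm, activity):
    return fat_kcal(lbm, activity) / 9.0

# Hypothetical lookup of activity factors, same values as the table earlier on.
factors = {"sedentary": 1.2, "light": 1.375, "moderate": 1.55,
           "very": 1.725, "extremely": 1.9}

for name, factor in factors.items():
    print(f"{name:10s} protein {protein_grams(70, factor):6.1f} g  "
          f"carbs {carb_grams(70, factor):5.1f} g  "
          f"fat {fat_grams(70, factor):6.1f} g")
```

For every activity level the protein stays at 175 grams and the carbohydrates at 50 grams; only the fat column grows.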

We do the same as we did before, but now with grams instead of kcal. Let's start off with our function.

def protein_pct_gram(lbm, activity):
    return (100 * protein_grams(lbm, activity) /
            (protein_grams(lbm, activity) +
             carb_grams(lbm, activity) +
             fat_grams(lbm, activity)))

The rest is pretty much the same as before.

df2 = pd.DataFrame()
df2["lbm"] = np.linspace(20,90,71)
df2["sedentary"] = np.vectorize(lambda x : protein_pct_gram(x, 1.2))(df2["lbm"])
df2["light"] = np.vectorize(lambda x : protein_pct_gram(x, 1.375))(df2["lbm"])
df2["moderate"] = np.vectorize(lambda x : protein_pct_gram(x, 1.55))(df2["lbm"])
df2["very"] = np.vectorize(lambda x : protein_pct_gram(x, 1.725))(df2["lbm"])
df2["extremely"] = np.vectorize(lambda x : protein_pct_gram(x, 1.9))(df2["lbm"])
df2.set_index("lbm", inplace=True)
df2.plot(grid=True)

<AxesSubplot:xlabel='lbm'>

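Notice that all the numbers are clearly higher than in the kcal-based graph. This stems from the higher energy density of fat: a gram of fat carries more than twice the energy of a gram of protein, so fat takes up a smaller share by weight than it does by calories, and the protein share grows accordingly.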
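To see where the difference between the two graphs comes from, here is a small worked example for a single person. It is a sketch with the kcal numbers filled in by hand for a 70 kg sedentary person under the formulas used in this post; the dictionary names are made up for the illustration.

```python
# kcal per macronutrient for a 70 kg sedentary person, per the formulas in this post.
kcal = {"protein": 700.0, "carbs": 200.0, "fat": 1526.4}

# Energy density in kcal per gram, as used in this post.
kcal_per_gram = {"protein": 4.0, "carbs": 4.0, "fat": 9.0}

# Convert every macronutrient from kcal to grams.
grams = {name: kcal[name] / kcal_per_gram[name] for name in kcal}

# Protein share by energy versus protein share by weight.
pct_by_kcal = 100 * kcal["protein"] / sum(kcal.values())
pct_by_gram = 100 * grams["protein"] / sum(grams.values())

print(round(pct_by_kcal, 1), round(pct_by_gram, 1))
```

Fat makes up most of the calories but, at 9 kcal per gram, far fewer of the grams, so the protein share rises from roughly 29% by energy to roughly 44% by weight.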

Now let's take our road away from counting calories one step further. Let's look at our fat to protein ratio and see how that comes out.

def fat_to_protein_ratio(lbm, activity):
    return fat_grams(lbm, activity) / protein_grams(lbm, activity)

df3 = pd.DataFrame()
df3["lbm"] = np.linspace(20,90,71)
df3["sedentary"] = np.vectorize(lambda x : fat_to_protein_ratio(x, 1.2))(df3["lbm"])
df3["light"] = np.vectorize(lambda x : fat_to_protein_ratio(x, 1.375))(df3["lbm"])
df3["moderate"] = np.vectorize(lambda x : fat_to_protein_ratio(x, 1.55))(df3["lbm"])
df3["very"] = np.vectorize(lambda x : fat_to_protein_ratio(x, 1.725))(df3["lbm"])
df3["extremely"] = np.vectorize(lambda x : fat_to_protein_ratio(x, 1.9))(df3["lbm"])
df3.set_index("lbm", inplace=True)
df3.plot(grid=True)

<AxesSubplot:xlabel='lbm'>

Things get a little interesting now. We see that while lean body mass definitely has an impact on the ideal ratio, the activity factor is the dominant one. For sedentary people, a fat to protein ratio of roughly one seems to be OK, adding in some extra fat for people with a really low lean body mass.

For extremely active individuals though, something closer to a ratio of two grams of fat for every gram of protein makes a lot of sense.
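To put some numbers on those two observations, the sketch below evaluates the ratio at both ends of the lean body mass range for the sedentary and the extreme activity factor. The functions are written out again in a compact form so the snippet runs on its own; the loop and the names in it are my own additions.

```python
def katch_mcardle(lbm, activity):
    return (370 + 23.6 * lbm) * activity

def fat_grams(lbm, activity):
    # Fat covers what is left after 10 kcal/kg of protein and 200 kcal of carbs.
    return (katch_mcardle(lbm, activity) - 10 * lbm - 200) / 9.0

def protein_grams(lbm, activity):
    return 10 * lbm / 4.0

def fat_to_protein_ratio(lbm, activity):
    return fat_grams(lbm, activity) / protein_grams(lbm, activity)

# Ratio at the low and high end of the lbm range, for two activity levels.
ratios = {}
for label, factor in (("sedentary", 1.2), ("extremely", 1.9)):
    for lbm in (20, 90):
        ratios[(label, lbm)] = fat_to_protein_ratio(lbm, factor)
        print(f"{label:10s} lbm {lbm:2d} kg  ratio {ratios[(label, lbm)]:.2f}")
```

Under these formulas the sedentary ratio runs from about 1.36 at 20 kg down to about 0.93 at 90 kg, while the extremely active ratio runs from about 2.67 down to about 1.80.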

I hope this little walkthrough on using numpy, pandas and matplotlib in Jupyter Notebook to visualize some basic theory and assumptions has been useful for you as a reader. For those of you less interested in that part of the subject, I hope the resulting graphs have given you some insight into this aspect of nutrition.