# The diet problem

This tutorial was generated using Literate.jl. Download the source as a .jl file.

The purpose of this tutorial is to demonstrate how to incorporate DataFrames into a JuMP model. As an example, we use classic Stigler diet problem.

## Required packages

This tutorial requires the following packages:

using JuMP
import CSV
import DataFrames
import HiGHS
import Test

## Formulation

We wish to cook a nutritionally balanced meal by choosing the quantity of each food $f$ to eat from a set of foods $F$ in our kitchen.

Each food $f$ has a cost, $c_f$, as well as a macro-nutrient profile $a_{m,f}$ for each macro-nutrient $m \in M$.

Because we care about a nutritionally balanced meal, we set some minimum and maximum limits for each nutrient, which we denote $l_m$ and $u_m$ respectively.

Furthermore, because we are optimizers, we seek the minimum cost solution.

With a little effort, we can formulate our dinner problem as the following linear program:

\begin{aligned} \min & \sum\limits_{f \in F} c_f x_f \\ \text{s.t.}\ \ & l_m \le \sum\limits_{f \in F} a_{m,f} x_f \le u_m, && \forall m \in M \\ & x_f \ge 0, && \forall f \in F. \end{aligned}

In the rest of this tutorial, we will create and solve this problem in JuMP, and learn what we should cook for dinner.

## Data

First, we need some data for the problem. For this tutorial, we'll write CSV files to a temporary directory from Julia. If you have existing files, you could change the filenames to point to them instead.

dir = mktempdir()
"/tmp/jl_KIbH76"

The first file is a list of foods with their macro-nutrient profile:

food_csv_filename = joinpath(dir, "diet_foods.csv")
open(food_csv_filename, "w") do io
write(
io,
"""
name,cost,calories,protein,fat,sodium
hamburger,2.49,410,24,26,730
chicken,2.89,420,32,10,1190
hot dog,1.50,560,20,32,1800
fries,1.89,380,4,19,270
macaroni,2.09,320,12,10,930
pizza,1.99,320,15,12,820
salad,2.49,320,31,12,1230
milk,0.89,100,8,2.5,125
ice cream,1.59,330,8,10,180
""",
)
return
end
foods = CSV.read(food_csv_filename, DataFrames.DataFrame)
9×6 DataFrame
Rownamecostcaloriesproteinfatsodium
String15Float64Int64Int64Float64Int64
1hamburger2.494102426.0730
2chicken2.894203210.01190
3hot dog1.55602032.01800
4fries1.89380419.0270
5macaroni2.093201210.0930
6pizza1.993201512.0820
7salad2.493203112.01230
8milk0.8910082.5125
9ice cream1.59330810.0180

Here, $F$ is foods.name and $c_f$ is foods.cost. (We're also playing a bit loose the term "macro-nutrient" by including calories and sodium.)

We also need our minimum and maximum limits:

nutrient_csv_filename = joinpath(dir, "diet_nutrient.csv")
open(nutrient_csv_filename, "w") do io
write(
io,
"""
nutrient,min,max
calories,1800,2200
protein,91,
fat,0,65
sodium,0,1779
""",
)
return
end
limits = CSV.read(nutrient_csv_filename, DataFrames.DataFrame)
4×3 DataFrame
Rownutrientminmax
String15Int64Int64?
1calories18002200
2protein91missing
3fat065
4sodium01779

Protein is missing data for the maximum. Let's fix that using coalesce:

limits.max = coalesce.(limits.max, Inf)
limits
4×3 DataFrame
Rownutrientminmax
String15Int64Real
1calories18002200
2protein91Inf
3fat065
4sodium01779

## JuMP formulation

Now we're ready to convert our mathematical formulation into a JuMP model.

First, create a new JuMP model. Since we have a linear program, we'll use HiGHS as our optimizer:

model = Model(HiGHS.Optimizer)
set_silent(model)

Next, we create a set of decision variables x, with one element for each row in the DataFrame, and each x has a lower bound of 0:

@variable(model, x[foods.name] >= 0)
1-dimensional DenseAxisArray{VariableRef,1,...} with index sets:
Dimension 1, InlineStrings.String15["hamburger", "chicken", "hot dog", "fries", "macaroni", "pizza", "salad", "milk", "ice cream"]
And data, a 9-element Vector{VariableRef}:
x[hamburger]
x[chicken]
x[hot dog]
x[fries]
x[macaroni]
x[pizza]
x[salad]
x[milk]
x[ice cream]

To simplify things later on, we store the vector as a new column x in the DataFrame foods. Since x is a DenseAxisArray, we first need to convert it to an Array:

foods.x = Array(x)
9-element Vector{VariableRef}:
x[hamburger]
x[chicken]
x[hot dog]
x[fries]
x[macaroni]
x[pizza]
x[salad]
x[milk]
x[ice cream]

Our objective is to minimize the total cost of purchasing food:

@objective(model, Min, sum(foods.cost .* foods.x));

For the next component, we need to add a constraint that our total intake of each component is within the limits contained in the limits DataFrame:

@constraint(
model,
[row in eachrow(limits)],
row.min <= sum(foods[!, row.nutrient] .* foods.x) <= row.max,
);

What does our model look like?

print(model)
Min 2.49 x[hamburger] + 2.89 x[chicken] + 1.5 x[hot dog] + 1.89 x[fries] + 2.09 x[macaroni] + 1.99 x[pizza] + 2.49 x[salad] + 0.89 x[milk] + 1.59 x[ice cream]
Subject to
410 x[hamburger] + 420 x[chicken] + 560 x[hot dog] + 380 x[fries] + 320 x[macaroni] + 320 x[pizza] + 320 x[salad] + 100 x[milk] + 330 x[ice cream] ∈ [1800, 2200]
24 x[hamburger] + 32 x[chicken] + 20 x[hot dog] + 4 x[fries] + 12 x[macaroni] + 15 x[pizza] + 31 x[salad] + 8 x[milk] + 8 x[ice cream] ∈ [91, Inf]
26 x[hamburger] + 10 x[chicken] + 32 x[hot dog] + 19 x[fries] + 10 x[macaroni] + 12 x[pizza] + 12 x[salad] + 2.5 x[milk] + 10 x[ice cream] ∈ [0, 65]
730 x[hamburger] + 1190 x[chicken] + 1800 x[hot dog] + 270 x[fries] + 930 x[macaroni] + 820 x[pizza] + 1230 x[salad] + 125 x[milk] + 180 x[ice cream] ∈ [0, 1779]
x[hamburger] ≥ 0
x[chicken] ≥ 0
x[hot dog] ≥ 0
x[fries] ≥ 0
x[macaroni] ≥ 0
x[pizza] ≥ 0
x[salad] ≥ 0
x[milk] ≥ 0
x[ice cream] ≥ 0

## Solution

Let's optimize and take a look at the solution:

optimize!(model)
@assert is_solved_and_feasible(model)
solution_summary(model)
* Solver : HiGHS

* Status
Result count       : 1
Termination status : OPTIMAL
Message from the solver:
"kHighsModelStatusOptimal"

* Candidate solution (result #1)
Primal status      : FEASIBLE_POINT
Dual status        : FEASIBLE_POINT
Objective value    : 1.18289e+01
Objective bound    : 0.00000e+00
Relative gap       : Inf
Dual objective value : 1.18289e+01

* Work counters
Solve time (sec)   : 2.12193e-04
Simplex iterations : 6
Barrier iterations : 0
Node count         : -1


We found an optimal solution. Let's see what the optimal solution is:

for row in eachrow(foods)
println(row.name, " = ", value(row.x))
end
hamburger = 0.6045138888888871
chicken = 0.0
hot dog = 0.0
fries = 0.0
macaroni = 0.0
pizza = 0.0
salad = 0.0
milk = 6.9701388888888935
ice cream = 2.5913194444444447

That's a lot of milk and ice cream, and sadly, we only get 0.6 of a hamburger.

We can also use the function Containers.rowtable to easily convert the result into a DataFrame:

table = Containers.rowtable(value, x; header = [:food, :quantity])
solution = DataFrames.DataFrame(table)
9×2 DataFrame
Rowfoodquantity
String15Float64
1hamburger0.604514
2chicken0.0
3hot dog0.0
4fries0.0
5macaroni0.0
6pizza0.0
7salad0.0
8milk6.97014
9ice cream2.59132

This makes it easy to perform analyses our solution:

filter!(row -> row.quantity > 0.0, solution)
3×2 DataFrame
Rowfoodquantity
String15Float64
1hamburger0.604514
2milk6.97014
3ice cream2.59132

## Problem modification

JuMP makes it easy to take an existing model and modify it by adding extra constraints. Let's see what happens if we add a constraint that we can buy at most 6 units of milk or ice cream combined.

dairy_foods = ["milk", "ice cream"]
is_dairy = map(name -> name in dairy_foods, foods.name)
dairy_constraint = @constraint(model, sum(foods[is_dairy, :x]) <= 6)
optimize!(model)
Test.@test !is_solved_and_feasible(model)
Test.@test termination_status(model) == INFEASIBLE
Test.@test primal_status(model) == NO_SOLUTION
solution_summary(model)
* Solver : HiGHS

* Status
Result count       : 1
Termination status : INFEASIBLE
Message from the solver:
"kHighsModelStatusInfeasible"

* Candidate solution (result #1)
Primal status      : NO_SOLUTION
Dual status        : INFEASIBILITY_CERTIFICATE
Objective value    : 1.18289e+01
Objective bound    : 0.00000e+00
Relative gap       : Inf
Dual objective value : 3.56146e+00

* Work counters
Solve time (sec)   : 1.51634e-04
Simplex iterations : 0
Barrier iterations : 0
Node count         : -1


There exists no feasible solution to our problem. Looks like we're stuck eating ice cream for dinner.

## Next steps

• You can delete a constraint using delete(model, dairy_constraint). Can you add a different constraint to provide a diet with less dairy?
• Some food items (like hamburgers) are discrete. You can use set_integer to force a variable to take integer values. What happens to the solution if you do?