The Dietary Reference Intakes (DRI) is the general term for a set of nutritional recommendations from the National Academy of Medicine used to plan and assess nutrient intakes of healthy people. These values, which vary by age and sex, include:
For a given nutrient, the Recommended Dietary Allowance (RDA) is the average daily level of intake sufficient to meet the nutrient requirements of nearly all (97 % to 98 %) healthy people.
For a given nutrient, the Tolerable Upper Intake Level (UL) is the maximum daily intake unlikely to cause adverse health effects.
The glycemic load is a number that estimates how much a food will raise your blood glucose level after you eat it.
One unit of glycemic load approximates the effect of eating one gram of glucose.
Glycemic load accounts for how much carbohydrate is in the food and how much each gram of carbohydrate in the food raises blood glucose levels.
People with metabolic disorders such as type 2 diabetes are sometimes advised to monitor the glycemic load of the food they eat to help manage blood sugar spikes.
Glycemic load is based on the glycemic index and is calculated by multiplying the grams of available carbohydrate in the food by the food’s glycemic index, and then dividing by 100.
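As a minimal sketch of the calculation described above (the function name and the example values are illustrative assumptions, not figures taken from the International Tables):

```python
def glycemic_load(glycemic_index: float, available_carbs_g: float) -> float:
    """Return the glycemic load: GI x grams of available carbohydrate / 100."""
    if glycemic_index < 0 or available_carbs_g < 0:
        raise ValueError("glycemic index and carbohydrate amount must be non-negative")
    return glycemic_index * available_carbs_g / 100.0


# Hypothetical serving: a food with a glycemic index of 50 that provides
# 30 g of available carbohydrate.
example_load = glycemic_load(50, 30)
```

With these illustrative inputs the glycemic load is 15, which approximates the effect of eating 15 g of glucose.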
The glycemic index of common foods is retrieved from the International Tables of Glycemic Index and Glycemic Load Values (1).
Then we use our in-house machine learning algorithm to estimate the missing values.
Our algorithm is based on an improved version of the Food and Agriculture Organization (FAO) model (2).
To further improve the accuracy of this model, we use a statistical method called stacking, which combines the predictions of the improved FAO model with those of a LASSO model based on food nutrient content.
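The idea behind stacking can be sketched as follows. This is a simplified illustration rather than the production model: it only shows how two sets of predictions can be blended with least-squares weights. The function name, the two prediction lists and the reference values are hypothetical, and in practice the weights would be fitted on predictions for foods that the base models did not see during training.

```python
def fit_stacking_weights(pred_a, pred_b, targets):
    """Least-squares weights (w_a, w_b) so that w_a*pred_a + w_b*pred_b ~ targets.

    Solves the 2x2 normal equations directly (no intercept term).
    """
    saa = sum(a * a for a in pred_a)
    sbb = sum(b * b for b in pred_b)
    sab = sum(a * b for a, b in zip(pred_a, pred_b))
    say = sum(a * y for a, y in zip(pred_a, targets))
    sby = sum(b * y for b, y in zip(pred_b, targets))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("base predictions are collinear; weights are not unique")
    w_a = (say * sbb - sby * sab) / det
    w_b = (saa * sby - sab * say) / det
    return w_a, w_b


def stacked_prediction(weights, a, b):
    """Blend one prediction from each base model with the fitted weights."""
    w_a, w_b = weights
    return w_a * a + w_b * b


# Hypothetical glycemic index predictions from two base models, and
# hypothetical measured values for the same four foods.
model_a = [50.0, 60.0, 70.0, 40.0]   # stands in for the FAO-style model
model_b = [54.0, 52.0, 80.0, 30.0]   # stands in for the LASSO model
measured = [53.0, 54.0, 77.5, 32.5]

weights = fit_stacking_weights(model_a, model_b, measured)
```

The fitted weights express how much each base model is trusted; a new food is then scored with `stacked_prediction(weights, a, b)`.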
FODMAP stands for fermentable oligosaccharides, disaccharides, monosaccharides, and polyols.
They are carbohydrates that are poorly absorbed by the small intestine. They include fructans, galactooligosaccharides, lactose, fructose and sugar alcohols.
In most people, FODMAPs help prevent digestive discomfort because they have a positive impact on the gut flora (3).
But in some people with irritable bowel syndrome (IBS), FODMAPs may cause digestive discomfort although they do not seem to cause intestinal inflammation. In this case, a low-FODMAP diet might help to improve digestive symptoms (4).
It’s important to note that reducing FODMAPs in the long term may have a detrimental impact on the gut microbiota (5).
First, FODMAPs data is retrieved from the Monash University database and research papers focusing on fructans and galactooligosaccharides (6).
Then we estimate the missing data using a LASSO algorithm based on our complete food nutrient composition database.
The data we use is based on the USDA food composition database from the United States, which is one of the most complete scientific databases. We also use the Canadian (CNF) and European databases to cross-reference values and ensure a high level of completeness.
After this first step, around 15 % of the data is still missing and 5 % contains inaccuracies.
To fix that, we use our state-of-the-art machine learning algorithm, described below.
This finally gives us a complete database with 91 nutrients for about 4000 foods.
A database with 4000 foods may seem small compared to other apps that advertise “millions of foods”. Those apps reach such numbers because they include branded and industrially processed foods.
This creates two major problems:
First, most people use a diary app to track and improve their diet. Given that one of the major problems with our modern diet is the overconsumption of industrially processed foods and especially ultra-processed foods, we made the choice to not include them.
Additionally, brands are only required by law to display 7 nutritional values (fat, saturated fat, carbohydrate, sugars, protein, salt, and calories). The problem is that, statistically, 7 nutrients are not enough to assess food quality: you need at least the 30 essential micronutrients. So, when an app lets you log foods that only show a handful of nutrients, you risk missing important data, as explained below.
To explain why data quality is so important, let’s take an example:
During your day, you eat two foods that are high in selenium: 6 Brazil nuts and 5 sardines. For an average adult, this amounts to 800 % of the Recommended Dietary Allowance and 10 % above the Tolerable Upper Intake Level for selenium.
Knowing that you are 10 % above the Tolerable Upper Intake Level, you would reduce your consumption of Brazil nuts or sardines. However, if the app is missing the selenium content of Brazil nuts, you will only see that you reached 58 % of the Recommended Dietary Allowance instead of 800 %. That would mislead you into eating more sardines or other selenium-rich foods to reach at least 100 %, and therefore put you at even higher risk of selenium overconsumption.
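The arithmetic of this example can be sketched as follows. The sketch assumes the US adult reference values for selenium (RDA 55 µg and UL 400 µg per day). The per-food amounts are not measured values: they are back-calculated from the percentages quoted in the example, and the function and key names are hypothetical.

```python
SELENIUM_RDA_UG = 55.0   # assumed adult Recommended Dietary Allowance, µg/day
SELENIUM_UL_UG = 400.0   # assumed adult Tolerable Upper Intake Level, µg/day


def daily_summary(intakes_ug):
    """Summarise a day of selenium intake.

    `intakes_ug` maps each food to its selenium amount in µg, or to None
    when the database has no value for that food. Missing values are
    silently skipped, which is what a naive diary app would do.
    """
    known = [value for value in intakes_ug.values() if value is not None]
    total = sum(known)
    return {
        "percent_rda": round(100 * total / SELENIUM_RDA_UG),
        "above_ul": total > SELENIUM_UL_UG,
        "has_missing": len(known) < len(intakes_ug),
    }


# Amounts back-calculated from the example: 742 % + 58 % of a 55 µg RDA.
complete_day = {"brazil_nuts": 408.1, "sardines": 31.9}
incomplete_day = {"brazil_nuts": None, "sardines": 31.9}

with_all_data = daily_summary(complete_day)
with_missing_data = daily_summary(incomplete_day)
```

With complete data the summary reports 800 % of the RDA and an intake above the UL. With the Brazil nut value missing, it reports only 58 % and no warning, unless the `has_missing` flag is shown to the user.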
To avoid this issue, diary apps have two possibilities:
The first possibility is to show an empty value for the total amount of a nutrient consumed during the day. Knowing that the USDA database has around 15 % missing values and that we consume, on average, more than 12 different foods per day, most daily totals would be empty.
The second possibility is to estimate the missing value. The easiest and most widespread method is to replace each missing value with 0 or with the average. However, this introduces significant error and creates the same problem as in the example above. In our app, we use our state-of-the-art machine learning algorithm to avoid these pitfalls and bring you the best data quality.
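A rough calculation shows why empty totals would dominate. The sketch below assumes that values are missing independently from one food to the next, which is a simplification; the function name is hypothetical.

```python
def probability_total_incomplete(missing_rate: float, foods_per_day: int) -> float:
    """Probability that at least one of the day's foods lacks a given nutrient value."""
    if not 0.0 <= missing_rate <= 1.0:
        raise ValueError("missing_rate must be between 0 and 1")
    # Complement of "every food has a value for this nutrient".
    return 1.0 - (1.0 - missing_rate) ** foods_per_day


# Figures from the text: about 15 % missing values, 12 foods per day.
p_incomplete = probability_total_incomplete(0.15, 12)
```

Under this assumption, roughly 86 % of the daily totals for a given nutrient would involve at least one missing value.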
Missing data is a common problem in nutrition. It can have several causes: measurement error, removal of aberrant values or simply a lack of analysis.
Food composition databases with missing data have limited usage because dietary assessment can be performed only on a complete dataset.
A common solution is to use mean or median imputation, but it introduces significant error. Even worse, most diary apps use zero imputation.
We tested the following imputation methods:
We selected SGC-LASSO, which outperformed all the other methods.
In food composition databases, data can be grouped into food groups. For example, all varieties of rice share similar nutritional properties. We therefore incorporate this information into our algorithm by using Sparse Group Lasso.
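For readers unfamiliar with Sparse Group Lasso, its penalty combines a lasso term over individual coefficients with a norm over each group of coefficients. The sketch below uses one common formulation, in which each group norm is weighted by the square root of the group size. The parameter names, the example coefficients and the way groups are mapped to coefficients are assumptions made for illustration; the exact variant used in our algorithm may differ.

```python
import math


def sparse_group_lasso_penalty(coefficients, groups, lambda_l1, lambda_group):
    """Penalty = lambda_l1 * sum|b_j| + lambda_group * sum_g sqrt(p_g) * ||b_g||_2.

    `groups` is a list of index lists, one per group; p_g is the group size.
    """
    l1_term = sum(abs(c) for c in coefficients)
    group_term = 0.0
    for indices in groups:
        norm = math.sqrt(sum(coefficients[i] ** 2 for i in indices))
        group_term += math.sqrt(len(indices)) * norm
    return lambda_l1 * l1_term + lambda_group * group_term


# Hypothetical coefficients: the first two belong to one group, the third
# forms a group of its own and is exactly zero.
penalty = sparse_group_lasso_penalty(
    coefficients=[3.0, 4.0, 0.0],
    groups=[[0, 1], [2]],
    lambda_l1=1.0,
    lambda_group=1.0,
)
```

The lasso term encourages individual coefficients to be zero, while the group term encourages whole groups to drop out together.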
Additionally, nutrients are not independent of one another. For example, the sum of all amino acids is less than or equal to the total protein, and the sum of simple sugars is less than or equal to the total sugar amount, which in turn is less than or equal to the total carbohydrate amount.
To incorporate this information, we add multiple inequality constraints to the Sparse Group Lasso.
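The kind of inequality involved can be illustrated with a simple check on a single food record. This sketch only verifies the constraints after the fact; it does not show how they are enforced inside the optimisation. The field names and the example record are hypothetical.

```python
def violated_constraints(food, tolerance=1e-9):
    """Return the names of the nutrient inequalities that a food record breaks.

    `food` is a dict of amounts in grams per 100 g of food.
    """
    checks = {
        "amino acids <= protein":
            food["amino_acids_g"] <= food["protein_g"] + tolerance,
        "simple sugars <= total sugars":
            food["simple_sugars_g"] <= food["total_sugars_g"] + tolerance,
        "total sugars <= carbohydrates":
            food["total_sugars_g"] <= food["carbohydrates_g"] + tolerance,
    }
    return [name for name, satisfied in checks.items() if not satisfied]


# Hypothetical record in which an imputed sugar value exceeds the carbohydrates.
inconsistent_food = {
    "amino_acids_g": 6.5,
    "protein_g": 7.0,
    "simple_sugars_g": 2.0,
    "total_sugars_g": 30.0,
    "carbohydrates_g": 25.0,
}

problems = violated_constraints(inconsistent_food)
```

An unconstrained imputation method can produce records like this one, which is why the constraints are built into the model instead of being checked afterwards.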
We solve SGC-LASSO using constrained convex optimisation and use cross-validation to select the penalty factors.
What and how we eat have a major impact not only on our health but also on the environment. A third of global human-caused greenhouse gas emissions come from our food system.
Livestock plays an important role and contributes 14.5 % of total emissions, with almost two thirds of that coming from beef and milk production (7).
Another major sector is food processing, distribution (transport, packaging, retail) and end-of-life disposal, accounting for a tenth of total emissions (8).
That is around five times the global emissions from aviation (9).
Greenhouse gas emissions are only one part of the problem: our food system contributes to many other types of pollution, including the release of plastic and toxic chemicals into the environment.
By eating less processed food, reducing meat and dairy consumption and sourcing locally, you can have a positive impact on the planet.