Ultra-processed foods: how functional is the NOVA system?


Abstract

Background

In the NOVA classification system, foods are assigned to one of four groups using descriptive, processing-related criteria. Although NOVA is widely used, its robustness and functionality remain largely unexplored. We determined whether this system leads to consistent food assignments by users.

Methods

French food and nutrition specialists completed an online survey in which they assigned foods to NOVA groups. The survey comprised two lists: one with 120 marketed food products with ingredient information and one with 111 generic food items without ingredient information. We quantified assignment consistency among evaluators using Fleiss’ κ (range: 0–1, where 1 = 100% agreement). Hierarchical clustering on principal components identified clusters of foods with similar distributions of NOVA assignments.

Results

Fleiss’ κ was 0.32 and 0.34 for the marketed foods (n = 159 evaluators) and generic foods (n = 177 evaluators), respectively. There were three clusters within the marketed foods: one contained 90 foods largely assigned to NOVA4 (91% of assignments), while the two others displayed greater assignment heterogeneity. There were four clusters within the generic foods: three clusters contained foods mostly assigned to a single NOVA group (69–79% of assignments), and the fourth cluster comprised 28 foods whose assignments were more evenly distributed across the four NOVA groups.

Conclusions

Although assignments were more consistent for some foods than others, overall consistency among evaluators was low, even when ingredient information was available. These results suggest current NOVA criteria do not allow for robust and functional food assignments.

Introduction

There is increasing evidence that foods are not the simple sum of their nutrients [1]. When characterizing foods, it is essential to consider factors such as processing and formulation, which have grown increasingly complex over the years. Whether carried out within households, artisanal settings, or factories, food processing aims to ensure product safety, digestibility, and palatability. It also seeks to improve shelf life and simplify meal preparation [2]. Human diets are progressively incorporating larger quantities of industrially processed foods [3]. At present, several systems are used to classify foods according to processing-related criteria [4,5,6,7,8,9,10], each employing different criteria and metrics.

NOVA is, by far, the most common of such systems [9]. Its stated purpose is to classify “all foods according to the nature, extent, and purposes of the industrial processes they undergo” [10]. In the NOVA system, foods are assigned to one of four groups: (i) NOVA1 contains “unprocessed or minimally processed foods,” namely the edible parts of plants or animals that have been taken straight from nature or that have been minimally modified/preserved; (ii) NOVA2 contains “culinary ingredients,” such as salt, oil, sugar, or starch, which are produced from NOVA1 foods; (iii) NOVA3 contains “processed foods,” such as freshly baked breads, canned vegetables, or cured meats, which are obtained by combining NOVA1 and NOVA2 foods; and (iv) NOVA4 contains “ultra-processed foods,” namely ready-to-eat industrially formulated products that are “made mostly or entirely from substances derived from foods and additives, with little if any intact Group 1 food” [9].

Nutritional epidemiologists are increasingly using NOVA to explore relationships between the consumption of highly processed foods and diet quality or health outcomes. Indeed, NOVA was used in 95% of the studies on this topic that were published between 2015 and 2019 and included in a recent systematic review [11]. Furthermore, policymakers are moving to use NOVA assignments to guide public health decisions. For example, several Latin American countries have constructed dietary guidelines based on NOVA [12, 13], and the French government is drawing upon NOVA in its objective to reduce ultra-processed food consumption by 20% [14].

It thus seems likely that NOVA will be employed in an ever-broader range of contexts. Nevertheless, aside from some sparse past work [15, 16], the system’s robustness, functionality, and consistency remain poorly characterized. Because its classification approach is purely descriptive in nature, it opens the door to ambiguity and differences in interpretation [17]. Indeed, even experts face difficulties and disagree when employing it [18,19,20].

Here, we explored the robustness and functionality of the NOVA classification system by determining whether a large number of food and nutrition specialists arrived at consistent food assignments when applying the system’s criteria. We also examined differences in assignments among evaluators and the relationships between NOVA assignments and food nutritional quality based on known nutrient profiling systems.

Discussion

In this study, we explored the robustness and functionality of the NOVA classification system by asking food and nutrition specialists to implement the system as intended by its creators [9]. We had them assess a list of marketed foods and a list of generic foods commonly consumed in France. The most striking result was that evaluators were inconsistent in their assignments, regardless of professional background; the mean values of Fleiss’ κ never exceeded 0.34. Many foods were not consistently assigned to the same NOVA group. In particular, the HCPC analysis indicated that assignments were highly heterogeneous for 30 marketed foods (25% of the total) and 28 generic foods (25% of the total). Finally, we found that an appreciable percentage of the foods commonly considered to be ultra-processed (NOVA4maj) were of acceptable nutritional quality.
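
As a reminder of how the agreement statistic behind these values is computed, the following is a minimal Python sketch of Fleiss’ κ from a table of per-food assignment counts. The function name `fleiss_kappa` and the toy counts are illustrative assumptions; they are not the study’s code or data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of assignment counts.

    counts[i][j] = number of evaluators who assigned food i to category j.
    Every food must have been rated by the same number of evaluators.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each food needs the same number (>= 2) of ratings")

    # Observed agreement: share of evaluator pairs that agree, averaged over foods.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement, from the overall share of assignments in each category.
    total = n_items * n_raters
    n_categories = len(counts[0])
    shares = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    p_exp = sum(s * s for s in shares)

    if p_exp == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical example: 4 foods, 5 evaluators each, columns = NOVA1..NOVA4.
toy_counts = [
    [0, 0, 0, 5],  # unanimous NOVA4
    [5, 0, 0, 0],  # unanimous NOVA1
    [1, 0, 2, 2],  # split assignments
    [2, 0, 1, 2],  # split assignments
]
print(round(fleiss_kappa(toy_counts), 2))
```

In this toy table, two foods are assigned unanimously and two are split, so κ falls well below 1 even though half of the foods show perfect agreement.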

To date, only one previous study has addressed similar questions. It found that reliability between two evaluators was lower with the NOVA system than with two other similar classification systems [16]; it highlighted that the risk of misclassification was higher when using NOVA, probably because its four groups are not clearly defined. We found support for this idea using a much larger number of evaluators.

Surprisingly, providing detailed ingredient information did not improve evaluator consistency, nor did it affect evaluator confidence levels. The latter were high or very high for most of the assignments, whether or not ingredient information was present. This result suggests that evaluators relied on their own knowledge or subjective feelings about the foods when making their assignments.

Some foods had a wider range of assignments. For instance, plain unsweetened dairy products were assigned to all four NOVA groups (cluster V; Table 3). This result may be tied to ambiguity in NOVA criteria [9]. On the one hand, “yogurt with no added sugar or artificial sweeteners” is specifically cited as an example of a NOVA1 food. On the other hand, it is clearly stated that non-alcoholic fermentation, the process by which yogurt is made (i.e., lactic fermentation), is characteristic of NOVA3 foods. It is further mentioned that “substances […], such as casein, lactose, whey” (ingredients often present in yogurt) are “only found in ultra-processed products,” meaning NOVA4 foods [9]. It was equally hard to determine whether other foods belonged in NOVA3 or NOVA4 (e.g., see cluster U; Table 3). One source of uncertainty is the indication that ultra-processed foods (NOVA4) “are industrial formulations, typically with five or more and usually many ingredients,” which may have led some evaluators to assign foods with long ingredient lists to NOVA4, even if they did not contain ingredients typical of NOVA4, such as “substances not commonly used in culinary preparations.” This reference to culinary versus industrial processes for preparing foods may also have led to misunderstanding: for example, corn starch is not necessarily a common ingredient in French households, which may have resulted in corn starch-containing foods being assigned to NOVA4. However, corn starch is also cited as an example of a NOVA2 item [9], and its presence may thus have led to NOVA3 assignments as well. Finally, uncertainty can stem from the processing procedure. For example, popcorn cakes may be treated as a food that has undergone extrusion cooking, leading some evaluators to arrive at a NOVA4 assignment. However, other evaluators may have opted for NOVA3 instead, given the food’s simple ingredient list. Overall, NOVA appears to be overly reliant on non-hierarchical criteria that cannot be applied rigorously and systematically in the absence of an unambiguous decision tree.
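
To make this kind of heterogeneity concrete, the Python sketch below summarizes one food’s assignments as its most frequent NOVA group and the share of evaluators who chose it. It is only an illustration: the helper `majority_assignment` and the counts are hypothetical, and this simple summary is not the HCPC procedure used in the study.

```python
NOVA_GROUPS = ("NOVA1", "NOVA2", "NOVA3", "NOVA4")


def majority_assignment(counts):
    """Return (most frequent group, its share of all assignments) for one food.

    counts[j] = number of evaluators who chose NOVA_GROUPS[j].
    Ties are resolved in favor of the lower-numbered group.
    """
    total = sum(counts)
    if len(counts) != len(NOVA_GROUPS) or total == 0:
        raise ValueError("expected four counts with at least one assignment")
    # max() keeps the first index among equal counts, hence the tie rule above.
    top = max(range(len(counts)), key=lambda j: counts[j])
    return NOVA_GROUPS[top], counts[top] / total


# Hypothetical counts for two foods, each rated by 20 evaluators.
print(majority_assignment([1, 0, 1, 18]))  # assignments concentrated in one group
print(majority_assignment([6, 1, 7, 6]))   # assignments spread across groups
```

A food with a low majority share, like the second one, is one for which a single NOVA label would hide substantial disagreement among evaluators.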

In the case of the generic foods, evaluators had no information about ingredients. For the 28 generic mixed dishes, two-thirds of the assignments were NOVA4, likely because evaluators assumed the foods had been industrially produced. Other evaluators arrived at assignments of NOVA3, likely assuming the foods were homemade. Interestingly, we observed the same ambiguity surrounding the generic yogurts and fromage frais as for the marketed dairy foods, underscoring the contradictory criteria of NOVA regarding these products. We also discovered that foods perceived to be industrial in nature were more likely to be assigned to NOVA4. For example, evaluators largely placed commercial orange juice in NOVA4, even though “fresh, squeezed, chilled, frozen, or dried fruits” are mentioned among the NOVA1 examples [9]. In contrast, 70% of evaluators classified coffee as NOVA1. Coffee roasting (torrefaction) is often industrial in nature, and food technology specialists view it as a high-impact process whose elevated temperatures lead to high acrylamide levels [34]. However, the NOVA system surprisingly allows roasting in association with NOVA1 foods because it is a traditional processing method. Similarly, the evaluators seem to have based their assignments of coffee on cultural rather than scientific knowledge, likely because this beverage is familiar and frequently consumed in France.

The definition of levels of food processing, as proposed by the NOVA classification, is complex and multidimensional [17]. It does not really reflect the intensity of the processes used; rather, it mixes technological considerations that are based more on socio-cultural aspects than on the physicochemical changes occurring during food processing. Furthermore, NOVA criteria associate these so-called technological dimensions with formulation considerations, such as the use of specific ingredients or the total number of ingredients in the recipe. Separating the level of thermo-mechanical energy undergone by the raw material from the formulation of the food (and in particular from the addition of additives) could be a way to build a robust indicator of the level of food processing. Such an indicator could help clarify whether the links observed between ultra-processed food consumption and health are mainly due to food structure or to food composition (specific ingredients and additives). This would require constructing an analysis grid organized by the major categories of unit operations used in food processing. Understanding the links between the consumption of highly processed foods and health requires integrating interdisciplinary skills, including food process engineering, food sciences, nutrition, and nutritional epidemiology.

Several studies have found that people who consume more ultra-processed foods (i.e., as defined by NOVA) have higher sugar and lower fiber intake; their sodium, total fat, and saturated fat intake differs little from that of people who consume fewer ultra-processed foods, although vitamin and mineral intake may vary [35,36,37,38,39,40,41]. Here, we found that foods most commonly assigned to NOVA4 (i.e., NOVA4maj foods) could vary substantially in their nutrient profiles, possibly in relation to the large heterogeneity of their composition. Thus, diet quality is more likely to be determined by specific consumer choices from among NOVA4 foods than by a food’s assignment to NOVA4 in and of itself. Confusing messages may arise from front-of-pack labeling, such as when products with a NOVA4 label, signifying their ultra-processed nature, also bear a label conveying their good nutritional quality (e.g., Nutri-Score “A”).

This study has limitations. First, regarding the choice of the food lists, there were only three food categories for the marketed foods and different results might have been obtained if other categories had been used. Our goal was to balance survey duration with reasonable food representation, allowing us to explore potentially conflicting NOVA assignments. Representativeness was greater for the generic foods, which were taken from the results of a population-level dietary survey. Second, while all the evaluators were French, they were specialists in human nutrition and/or food technology; their expertise should thus have served to overcome potential cultural bias and ensure the validity of our data set.

Third, the survey’s structure did not allow evaluators to modify earlier assignments, meaning they could not apply any understanding they developed as they advanced. Hypothetically, the ability to modify assignments could have slightly improved evaluator consistency, as could allowing real-life discussions among evaluators. Finally, the representativeness of evaluators is also questionable.

However, our study also had several strengths. First, we obtained data from more than 150 evaluators for a total of 231 foods, yielding a much larger data set and more powerful statistical analysis than in previous studies [16, 42]. Second, extreme caution was taken to avoid influencing the evaluators in any way. Our online survey interface facilitated participation and created controlled study conditions. All evaluators were given the exact same information about NOVA, taken from the original publications. The survey set-up provided easy access to the description of the NOVA system, accessible in full detail or as a summary. The food lists and blocks appeared in a randomized order to avoid habituation bias.

Third, the vast majority of evaluators appeared to take the assignment process seriously. Just 7 evaluators (less than 2% of the total) failed to do so. The sensitivity analyses confirmed the robustness of our findings. Consequently, we feel confident that we obtained data from individuals who assessed the foods as best they could, given current NOVA criteria. All the evaluators were specialists in human nutrition and/or food technology and thus represent the body of individuals who may have to use NOVA in their professional lives.

Conclusions

Overall, our results suggest improvements should be made to the NOVA classification system to enhance assignment consistency. Indeed, we observed that a large percentage of the food assignments were discordant, regardless of whether ingredient information was provided. This finding raises questions about how functional NOVA is in its current form. It should also spur reflection on the reliability of conclusions from epidemiological studies that use NOVA as well as on NOVA’s ability to guide public health policy or provide useful information to consumers. While the concept of ultra-processed foods has certainly entered the consumer consciousness, our results indicate that NOVA criteria do not currently allow foods to be unequivocally defined as ultra-processed.