Special report: AI Institute for next generation food systems (AIFS)

Artificial Intelligence (AI) has the potential to transform US food systems by targeting their biggest challenges: improving food yield, quality, and nutrition, decreasing resource consumption, increasing safety and traceability, and eliminating food waste. Despite big leaps in AI capacity, food systems present several challenges for the application and adoption of AI: (1) food systems are highly diverse and biologically complex, (2) ground-truth data is sparse, costly, and privately held, and (3) human decisions and preferences are intricately linked to every stage of food system supply chains. To address these challenges and transform US food systems, the AI Institute for Next Generation Food Systems (AIFS) aims to develop AI technologies and nurture the next generation of talent to produce and distribute more high-quality nutritious food with fewer resources. AIFS has six research clusters, including two Foundational Research Areas (Use-Inspired and Foundational AI, and Socioeconomics and Ethics) and four Application Research Areas spanning the entire food supply chain: Molecular Breeding, Agricultural Production, Food Processing and Distribution, and Nutrition. AIFS is developing generalizable, data-efficient, and trustworthy AI solutions based on a knowledge-driven and human-in-the-loop learning paradigm designed to handle food system diversity and biological complexity, efficiently capture and utilize food system data, and garner user trust via explainability, safety, privacy, and fairness.


Introduction
Artificial Intelligence (AI) has the potential to transform US food systems by targeting their biggest challenges: improving food yield, quality, and nutrition, decreasing resource consumption, increasing safety and traceability, and eliminating food waste. In the last decade, scientists and engineers have made significant headway in developing and deploying tools and devices that deliver a massive, yet too often raw, data stream to food system stakeholders at unprecedented spatiotemporal resolution. At the same time, AI algorithms repeatedly break benchmarks in computer vision, natural language processing, and automation, while AI-optimized hardware is enabling major advances from robotics to consumer electronics.
A primary mission of the AI Institute for Next Generation Food Systems (AIFS) is to develop AI technologies for a sustainable food system and to nurture next-generation talent to produce and distribute nutritious food with fewer resources. In the coming decades, AIFS aims to help transform US food systems by innovating AI technology that will generate actionable information for diverse stakeholders in food system supply chains, grounded in a robust ethical and socioeconomic framework. Toward this goal, and to address the above challenges, AIFS will develop generalizable, data-efficient, and trustworthy AI solutions to enable (1) molecular breeders to discover and/or design the next generation of high-yielding, high-quality, consumer-focused foods, (2) agricultural producers to maximize food quantity and quality while minimizing resource consumption and waste, (3) food processors and distributors to deliver highly traceable and safe food while minimizing resource consumption and waste, and (4) consumers to rapidly and precisely assess the nutrition of a meal, quantify the food's molecular composition, and predict the impact on their health. AIFS will build these solutions using knowledge-driven and human-in-the-loop learning paradigms designed to handle food system diversity and biological complexity, efficiently capture and utilize food system data, and garner user trust via explainability, safety, privacy, and fairness. Today, when AI is employed by food system researchers, engineers, and industry leaders, it is nearly exclusively as a technological byproduct of other industries. By creating food system-specific AI solutions, AIFS will accelerate AI's capacity to positively transform US food systems and impact stakeholders across the supply chain.
AIFS has six research clusters, with two foundational research areas (Use-Inspired and Foundational AI, and Socioeconomics and Ethics) and four application research areas (ARA) (Molecular Breeding, Agricultural Production, Food Processing and Distribution, and Nutrition), in addition to programs in Education, Outreach, and Workforce Development (EOWD), Broadening Participation, and Collaborations and Knowledge Transfer (Fig. 1). Application research areas span the entire food system. The Use-Inspired and Foundational AI cluster connects the six research clusters and develops AI tools through close communication and feedback cycles. Social, economic, and ethical considerations will be integrated into the application of AI in all four application research areas. AIFS also actively engages academic, stakeholder, and public audiences through education, outreach, and broadening participation activities. The overall vision of AIFS is to address challenges in both foundational and use-inspired AI research, train the future AI workforce, and address some of society's grand challenges across the food system. AIFS has brought together researchers from six institutions with a proven record of excellence in AI and food system science, engineering, outreach, and education. AIFS currently engages 50+ faculty members and researchers, 40+ graduate students and postdocs, and 18 undergraduate fellows. It has established its scientific, education, and outreach advisory board, industrial board, and stakeholder board. AIFS serves as a national nexus point for collaborative efforts spanning higher education institutions, federal agencies, industry, and nonprofits/foundations.

Foundational AI research
AI and data-driven computational methods are the underlying fabric that connects the application research areas of AIFS. The objectives of this research area, in a logical progression of effort, are as follows: (a) identify key common challenges that underlie the entire pipeline of the food system; (b) establish theoretical frameworks within which these challenges can be systematically addressed; (c) develop use-inspired methods and algorithms that can be refined and extended to take into account the specifications and domain knowledge of each of the four application areas; (d) establish foundational principles and understandings that are salient in an AI-enabled agricultural science and generalizable to other scientific fields. AIFS seeks to balance foundational research and agricultural application-specific solutions through a principled and systematic investigation that tackles critical challenges inherent in the food system.
Challenges: Inherent challenges in an AI-enabled agricultural science are rooted in three salient features of the food system: (1) high variability and diversity in terms of crop traits, environmental conditions, multi-faceted quality measures, and consumer preferences; (2) the high cost, in terms of both labor and time (e.g., the innate growth cycle of crops), associated with data collection, and the low quality of observational data (e.g., self-reported dietary intake data); (3) the complex human factor dictated by the primary ties between humans and food: successful adoption of AI in the food system hinges on human trust in, and response to, AI applications.
The first challenge gives rise to a highly complex learning space that all AI solutions need to navigate: high-dimensional inputs and outputs; reward and loss signals, needed as feedback for adaptation and learning, that are difficult to define (e.g., the taste of a strawberry variety); and highly nonlinear and nonconvex objective function landscapes. Compounding this difficult learning task, the second challenge constrains AI models to learn from few, noisy, and incomplete data points. The third challenge further complicates the problem by imposing complex design constraints in terms of safety, fairness, and privacy guarantees, and by requiring an understanding of the socioeconomic consequences.
Approach and Theoretical Frameworks: To address these challenges, AI solutions need to be generalizable, trustworthy, and data efficient. A generalizable model must be effective in addressing the high variability in the learning space while offsetting the high cost associated with data collection. Trustworthiness implies providing safety, fairness, and privacy guarantees and being mindful of socioeconomic consequences. Data privacy enables data sharing across supply chains to address the challenge in data availability. Data efficiency pertains to the effectiveness of utilizing limited and noisy data sources.
To build generalizable, trustworthy, and data-efficient AI solutions, our overarching approach rests on a knowledge-driven and human-in-the-loop learning paradigm that allows active and real-time interactions between human and machine. This paradigm allows for building trust, obtaining subjective labels (e.g., sensory and flavor attributes), and constructing reward/loss functions.
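One way to picture a human-in-the-loop labeling cycle is uncertainty sampling, a standard active-learning heuristic in which the machine asks a human to label the sample it is least sure about. The sketch below is only an illustration under that assumption, not a description of the AIFS method; the names `most_uncertain`, `query_human`, `predict`, and `ask_expert` are hypothetical, and a real loop would retrain the model after each new label.

```python
from typing import Callable, Dict, Hashable, List, Sequence


def most_uncertain(probabilities: Sequence[float]) -> int:
    """Return the index of the probability closest to 0.5 (least confident)."""
    return min(range(len(probabilities)),
               key=lambda i: abs(probabilities[i] - 0.5))


def query_human(pool: Sequence[Hashable],
                predict: Callable[[Hashable], float],
                ask_expert: Callable[[Hashable], bool],
                budget: int) -> Dict[Hashable, bool]:
    """Spend a labeling budget on the samples the model is least sure about.

    predict    -- hypothetical model: sample -> probability of the positive class
    ask_expert -- hypothetical human oracle: sample -> subjective label
    Note: this sketch does not retrain the model between queries.
    """
    remaining: List[Hashable] = list(pool)
    labels: Dict[Hashable, bool] = {}
    for _ in range(min(budget, len(remaining))):
        probs = [predict(sample) for sample in remaining]
        sample = remaining.pop(most_uncertain(probs))
        labels[sample] = ask_expert(sample)
    return labels
```

With a limited budget, expert time (for example, a sensory panel rating flavor) is directed to the samples where a label is most informative, which is one route to data efficiency.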

Ethics research
To achieve its goal of developing AI tools to transform US food systems by targeting their biggest challenges, AIFS will require a clear ethical framework to guide the research and the researchers. This framework aims to assure socially responsible and trustworthy AI for agricultural applications. We aim to create a meaningful, transformative ethics framework that goes beyond what has been described by some AI scholars as "ethics-washing" to instead anticipate ethical standards and protocols that may be needed to keep pace with AI technologies.
A clear ethical framework underlies a successful AI tool by demonstrating transparently what researchers and developers ask stakeholders to trust them with, how they will use it, and why their work warrants the trust of others. Moreover, social, ethical, and economic barriers may hinder successful deployment of AI tools. Understanding these barriers and how to overcome them in a way that improves societal welfare is crucial, not only for AIFS but for AI more broadly.
Food is fundamental to the human experience. It plays a critical role in social interactions and personal wellbeing. Food preferences are embedded deeply in identity, emotions, and culture. For these reasons, concerns regarding AI in the food system are often personal, which makes trust fragile. Earning and maintaining trust is central to ethical AI, and its fragility in this setting makes it even more important to prioritize. For researchers and AI developers, data is the key resource in AI development and deployment. Researchers negotiate with industry participants over the terms to obtain and transfer data. They then leverage those data to develop and deploy AI tools. These negotiations and the resulting AI tools raise potential ethical challenges, including the risk of loss or injury from tools that fail to achieve their objective, the incentive to rush to publish or deploy tools prematurely, insecure or unfair use of data, and inequitable effects on third parties such as small farmers, laborers, or low-income consumers. These challenges can be met by adherence to the principle that AIFS researchers seek to develop tools for which expected benefits outweigh the risks and for which the benefits and risks are shared equitably. Other important characteristics of ethical and trustworthy AI are transparency, vigilance, and clear communication.
We begin with three projects. One project will generate a set of best practices that AIFS and its researchers can adopt to help assure the trustworthiness of their research. The second will create an ethics curriculum for AIFS researchers, graduate students, and post-doctoral fellows, with several foundations including Deep Questions (Yarborough and Hunter 2013). The third will study and survey stakeholders in industry, labor, and the policy community to ascertain the social, economic, and ethical challenges in deploying AI tools successfully in the food system. We will work collaboratively with AIFS, its researchers, and its partners.

Molecular breeding
The molecular breeding cluster focuses on developing AI tools for breeding the next generation of high-yielding, high-quality, consumer-focused varieties of vegetables, fruit, and nut crops. We aim to address the following three challenges unique to horticultural crop improvement: 1) The diversity of horticultural crops requires highly specialized breeding approaches. Specialized tools for breeding developed in one species do not necessarily perform well in another. 2) Yield data is collected by hand, incurring high labor costs. 3) Fruit and vegetable quality is multi-faceted and is subject to context-dependent consumer preferences, whereas existing tools for AI-enabled breeding are best suited for a single trait target (e.g., yield). Furthermore, there are often tradeoffs between quality and yield, which necessitate breeding for both traits simultaneously.
Approach: Building on the developments in the AI cluster, we aim to develop AI methods that are explainable to breeders, that are context-aware to adapt to consumer preferences, and that leverage data integration to take full advantage of the wave of automated high-throughput phenotyping technologies currently being applied to diverse breeding programs.
1) To address the challenge of the diversity of horticultural breeding programs, we will develop multi-model algorithms that flexibly integrate genomic and phenotypic data (common to all crops) with domain-specific knowledge from breeders. By developing explainable AI algorithms, we will generate predictions that can be interpreted and vetted by breeders.
2) To address the challenge of quantifying yield in horticultural crops we will develop multi-modal AI algorithms that can integrate the heterogeneous and high-dimensional data from high-throughput phenotyping technologies such as mobile or stationary hyperspectral cameras, video-imaging and 3D modeling, to predict yield throughout the season. Accurate predictions of yield from automated sensors will greatly reduce the labor costs of breeding programs, enabling more varieties to be tested simultaneously in more locations to better model the interaction between genotype, environment, and management.
3) To address the challenge of improving crop quality and consumer preferences, we will develop AI architectures that can leverage multimodal data to identify predictive features for consumer preference and use these features to select improved varieties. Because deep learning algorithms allow end-to-end prediction (e.g., from genetic and molecular data to the overall consumer appeal), they allow us to optimize breeding more effectively with multiple objectives, including subjective qualities such as consumer preference, which can depend on flavor, smell, appearance, texture, and/or nutrition, among other factors.
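To make the idea of breeding for several objectives at once concrete, the sketch below selects the candidates that are not outperformed on every trait by some other candidate (a Pareto front). This is a generic multi-objective selection technique used here purely for illustration; it is not claimed to be the approach AIFS uses, and the trait scores in the usage note are invented, with higher assumed to be better for every trait.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, ...]  # one predicted score per trait, higher is better


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every trait and better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(candidates: Dict[str, Scores]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    front = []
    for name, scores in candidates.items():
        dominated = any(dominates(other, scores)
                        for other_name, other in candidates.items()
                        if other_name != name)
        if not dominated:
            front.append(name)
    return sorted(front)
```

For hypothetical (yield, quality) scores `{"V1": (9.0, 3.0), "V2": (7.0, 8.0), "V3": (6.0, 7.0), "V4": (4.0, 9.0)}`, variety V3 is dropped because V2 beats it on both traits, while V1, V2, and V4 remain as distinct yield-quality tradeoffs for a breeder to weigh.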

Ag production
Agricultural production requires substantial inputs (e.g. water, fertilizer, pesticides, energy, and labor) to maximize the output of food quantity and/or quality. Agricultural production is extremely diverse in terms of environmental conditions, crop traits, and management strategies.
The AIFS Agricultural Production cluster is focused on developing AI tools that enable agricultural producers to sustainably manage the diversity of horticultural crops, maximizing food yield and quality while minimizing resource consumption and waste. Specifically, we aim to address the following three challenges associated with agricultural production:
Highly variable production conditions: Crop monitoring, forecasting, and mechanization are highly site-specific due to variability in crop traits, pathogen pressures, environmental conditions, and management strategies, making technological generalization very challenging. We are developing crop-generalizable AI frameworks that integrate multimodal sensor data, mechanistic crop modeling, and robotic controls for precision agricultural management. We are building machine learning models that integrate the large existing knowledge base of plant biologists, crop modelers, and agricultural producers. First, we have focused on using 3D biophysically based crop models to generate a large number of synthetic datasets that form the basis of transfer learning for inference, or additional fine-tuning, on real sensed data. Second, we are building digital twin technologies, integrating 3D crop and robotic simulation models, to train deep reinforcement learning models for autonomous navigation and implement control (e.g., irrigators, fertilizers, pesticide applicators, pruners/thinners, and harvesters). Finally, we are developing novel deep learning architectures for predicting yield, quality, resource consumption, and waste generation capable of handling multi-modal model inputs with respect to signal type (e.g., pressure, visible/thermal/microwave radiation, electrical conductivity, etc.) and spatial-temporal scale (i.e., from mm to km).
Low/no internet connectivity: Agricultural production technology faces unique constraints, as production often occurs in remote areas with low to no internet connectivity, limited memory, and limited power supply. To overcome this challenge, we are advancing energy- and memory-efficient sensing hardware and algorithmic systems for high-performance edge AI in agricultural environments. We are working to engineer new agricultural sensor systems that integrate recent innovations in AI-dedicated microprocessors, such as vision processing units (VPUs), tensor processing units (TPUs), and other types of AI accelerators. As we develop agriculture-specific deep learning architectures, it is critical that we optimize them to run on AI-dedicated microprocessors, which can run in low-power, low-memory systems.
Producer confidence: Our AI system could provide an end-user a list of actionable factors, e.g., irrigation and nutrients, as well as their contributions to the yield. It can also map out the causal relationships among multiple variables of interest and allow the user to ask questions in terms of counterfactual scenarios, e.g., climate conditions and management practices not present in the training data set.
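As a toy illustration of what "contributions to the yield" and a counterfactual question might look like, the sketch below uses a plain linear yield model, where each factor's contribution relative to a baseline can be read off directly. This is a deliberate simplification assumed for illustration only: a linear model is not a causal model, and the coefficients, factor names, and units in the usage note are hypothetical rather than taken from AIFS work.

```python
from typing import Dict

Factors = Dict[str, float]


def predict_yield(coefs: Factors, intercept: float, values: Factors) -> float:
    """Linear toy model: yield = intercept + sum(coef * factor value)."""
    return intercept + sum(coefs[name] * values[name] for name in coefs)


def contributions(coefs: Factors, values: Factors, baseline: Factors) -> Factors:
    """Per-factor change in predicted yield relative to a baseline scenario."""
    return {name: coefs[name] * (values[name] - baseline[name])
            for name in coefs}


def counterfactual(coefs: Factors, intercept: float, values: Factors,
                   changes: Factors) -> float:
    """Predicted yield if some factors were set to different values."""
    scenario = {**values, **changes}
    return predict_yield(coefs, intercept, scenario)
```

For example, with hypothetical coefficients `{"irrigation_mm": 0.5, "nitrogen_kg": 2.0}`, the per-factor contributions sum to the difference between the predicted yield for the current season and for the baseline, and `counterfactual` answers "what if irrigation had been 140 mm instead?" by re-evaluating the model on the altered scenario.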

Food processing and distribution
The key challenges in food processing and distribution are food safety, food loss and spoilage, and process innovation/optimization. To address the challenges of food safety, we will develop AI models that can flexibly integrate existing food microbial ecology, chemometric, and physical data sets for comprehensive assessment of food safety risks from farm to retail distribution. These existing data sets will be supplemented with digital twin models of food processing operations, including sanitation, food handling, and transport, to simulate the transfer of pathogens between food and its environment, including humans and food contact surfaces; one example is an agent-based model we have developed previously (Zoellner et al. 2019). Together these data sets will create food safety scenarios for both training and validation of AI models. To develop human confidence in AI predictions, food-system-wide AI models will also be tested against prior food safety outbreaks using data sets collected by the National Outbreak Reporting System (NORS), FoodNet, and other public databases. To explain AI predictions, we aim to develop an interface that enables AI to explain the models and the output decisions using natural language sentences and data visualization approaches. In addition to predicting food safety risks, AI models will also be developed to optimize resource utilization (energy, water, and chemicals) and the efficiency of various operations designed to promote food safety and minimize risks of outbreaks, such as sanitation of food contact surfaces.
To address the challenges of food loss, we will develop AI models that flexibly integrate microbial, physico-chemical, and market data sets to predict food loss. The microbial and physico-chemical data sets will indicate spoilage risks, while the market data will capture consumer behavior and needs related to food loss. We intend to integrate market data with data generated using digital twin models, such as simulations of plant respiration and growth of spoilage microbes during storage and retail display.
To address the challenges of process innovation and optimization, we aim to develop AI models to predict outputs of food processing operations and to optimize input resources, including energy and water, for food processing. These AI models will integrate datasets from various mechanical, thermal, and chemical inputs during food processing and their influence on food products. These AI models will predict product quality outputs such as the texture, color, and flavor of a selected product. To generate datasets for AI models of complex processing operations, we will develop digital twin simulations of food processing operations. Digital twin models also enable simulation of variability and diversity in the input conditions, such as diverse fresh produce with variable farm residues for simulating washing and sanitation of fresh produce. These datasets simulating product and process variability will enable development of adaptive AI models. Furthermore, to reduce operating failures in the food manufacturing industry and enable instant responses with feed-forward controls of production operations, we will combine data sets from diverse sensors and develop AI-enabled predictive models to optimize process conditions and product quality in real time.

Nutrition
The endpoint of the food system is nutrition: the consumption of food to sustain human life and, preferably, to enhance health and well-being. AI technologies are advancing the field in several areas. For example, AI and machine learning (ML) have been used to assess diet via food photography. Many challenges remain. Large-scale controlled feeding studies are prohibitively expensive and burdensome. Instead of being required to specify everything eaten and the quantities, what if participants just took a picture of a plate of food? Our team is currently conducting the Surveying Nutrient Assessment With Photographs of Meals (SNAPMe) Study (ClinicalTrials.gov) to prepare a benchmark dataset, which can then be used to evaluate the application of computer vision algorithms to food photos for the purpose of dietary assessment.
Once a human participant's food intake is known, those foods are translated to nutrients using food composition tables. While it is not feasible to analyze the composition of every food item, it should be possible to build models from labelled data sets to predict the composition of new foods. Our team is currently preparing the labelled data sets necessary to build prediction models for the glycan composition of foods. Little is known about the glycans in foods even though they are the primary carbon sources for our gut microbes. The project is an essential step towards determining what people should eat to nourish the right gut microbes.
The overall framework can be extended to other molecules. Much of the nutrient content of food is currently "dark matter" (Barabasi, Menichetti, and Loscalzo 2020) that does not yet exist in the USDA food composition tables accessed by dietary intake apps. Meanwhile, the analytical technologies needed to completely characterize the "nutriome" (all the compounds in food) are rapidly progressing. Each food ingredient potentially contains thousands of small molecules quantified and catalogued in the FooDB database (FoodB.ca). Other "omes" of food constituents such as lipidomes (all of the fats), proteomes/peptidomes (all of the proteins), etc. have yet to be fully characterized, although such technologies exist today. When the complete molecular characterization of food is incorporated into food composition databases and integrated with data on the effects of these foods via cell models, animal models, or human feeding studies, this integrated data set can form the basis of clinical trials or of experiments with digital twins, or models. Results from experiments will then be aggregated into knowledge graphs, which enable scientists to interrogate the information to translate it into new dietary guidance. In the future, this guidance will be both personalized (pertaining to individual people) and precise (recommending particular foods or varietals, rather than general food groups).
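The step of translating known food intake into nutrients through a food composition table, described earlier in this section, amounts to a weighted lookup. The sketch below assumes a table keyed by food name that stores nutrient amounts per 100 g; the function name `nutrient_totals`, the table layout, and every number in the usage note are hypothetical and are not drawn from the USDA tables or FooDB.

```python
from typing import Dict

# Hypothetical layout: food -> {nutrient -> amount per 100 g of that food}
CompositionTable = Dict[str, Dict[str, float]]


def nutrient_totals(intake_g: Dict[str, float],
                    table: CompositionTable) -> Dict[str, float]:
    """Sum nutrient amounts over all foods eaten, scaled by grams consumed.

    Raises KeyError for a food missing from the table, which mirrors the gap
    the text describes: foods or compounds not yet in composition databases.
    """
    totals: Dict[str, float] = {}
    for food, grams in intake_g.items():
        for nutrient, per_100g in table[food].items():
            totals[nutrient] = totals.get(nutrient, 0.0) + per_100g * grams / 100.0
    return totals
```

With an invented table in which rice has 28 g of carbohydrate and 2 g of protein per 100 g and beans have 20 g and 8 g, an intake of 200 g of rice and 50 g of beans totals 66 g of carbohydrate and 8 g of protein. Adding new molecules such as glycans to the table would extend the totals without changing the calculation.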

Education, public engagement, and workforce development
Innovations in research are complemented by transformative and inclusive education and public engagement approaches to nurture the next generation of talent in a diverse workforce, as well as comprehensive initiatives to broaden societal engagement including knowledge transfer and collaboration. AIFS nurtures the next generation of talent to enable a more resilient and productive society. AIFS aims to improve access, awareness, and interest amongst K-14 audiences, including nontraditional and underrepresented student populations; increase the number of highly competent AI-trained and skilled new workforce entrants across food and agriculture sectors and disciplines; implement effective industry and government partnerships to accelerate market adoption of AI food and agriculture technologies; and incorporate AI into existing outreach programs that train students and postdocs to more effectively engage with the public.
To that end, AIFS launched its Career Exploration Fellowship program, which aims to prepare undergraduate students from diverse backgrounds for careers at the intersection of food, agriculture, and technology. This program pairs college students with companies, nonprofits, and AIFS-affiliated university labs to work on exciting projects that are addressing critical challenges in food and agriculture using technology.
Another significant component of nurturing next-generation talent is training for graduate students and postdoctoral fellows. AIFS actively engages academic, stakeholder, and public audiences through education, outreach, and broadening participation activities (e.g., roundtables, seminars, panel discussions) led by graduate students and postdoctoral fellows, who receive training on effective science communication. Additional training is offered through AIFS workshops led by UC ANR, a statewide UC network of over 1,500 academics and staff whose mission is to transfer science and technology to the people of California; these workshops inform and train industry professionals on the application of AI technologies.
To equip the next generation of students with the skills and knowledge necessary for high-tech agricultural innovation, AIFS has also developed 21 educational modules, with more on the way in future years. These modules include topics in data science, machine learning, modeling, and simulation technologies. The module offerings will be expanded to cover more disciplines and will comprise a curriculum that spans high school, community college, 4-year undergraduate programs, and graduate school through postdoctoral training.

Building a strong organization
We have established an institutional organization structure that ensures business continuity and coverage of the broad spectrum of interests within the institute, provides access to the advisory capacity of a panel of external experts, and adheres to the AIFS principles of inclusion, transparency, and meritocracy.
With an eye toward making a noticeable positive impact on the food system, AIFS is following a 5-year plan, which ultimately delivers scaled-up and translated AI technology to the food system, as shown in Fig. 2.
We see significant opportunities as well as challenges in the Institute activities. We present our SWOT (strengths, weaknesses, opportunities, threats) analysis as follows. Our key strengths are that we have multiorganizational connections already established and research projects that were able to hit the ground running. Additionally, there is considerable engagement among researchers and between staff and researchers. Among our weaknesses is the potential for spreading funding too thin for maximum effectiveness. All researchers also need to maintain focus on AIFS projects amid other competing interests. We see some opportunities amidst a challenging landscape. With water and labor shortages and food cost increases, demand for technology-based solutions, including those with AI foundations, will increase. Additionally, many businesses are already looking to AIFS for leadership and authoritative answers. The threats we have identified include the potential of competing against narrow bands of venture capital in some areas.

Discussion
AIFS aims to develop food system-centric AI solutions for transforming the productivity, sustainability, and safety of food systems as well as enhancing consumer health and wellness. These AI solutions will innovate algorithms and computational resources to model both the diversity and the biological complexity of food systems, address key knowledge gaps in ground-truth data, and create explainable and trustworthy predictions to engage humans in the loop. These innovations are significantly and intellectually distinct from the current scenario, where AI approaches in food systems are almost exclusively technological by-products of other industries. By investigating and creating food system-specific AI technologies, AIFS will accelerate AI's capacity to positively transform US food systems and impact stakeholders across the supply chain. AIFS has brought together researchers from six institutions with a proven record of excellence in AI, food system sciences, and engineering. The research plan investigates original and transformative concepts at the intersection of foundational and application research areas that span the entire food system. Critically, AIFS institutions represent leaders in AI innovation and agriculture and food systems research with significant resources, including state-of-the-art compute, molecular sequencing, analytical, greenhouse, crop production, and engineering facilities, as well as stakeholder engagement to enable success and transformative impact on society. Serving as a national nexus point for collaborative efforts spanning higher education institutions, federal agencies, industry, and nonprofits/foundations, AIFS will accelerate the translation of AI innovations into the food system and nurture the next generation of talent to enable a more resilient and productive society.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.