Global media as an early warning tool for food fraud; an assessment of MedISys-FF

Food fraud is a serious problem that may compromise the safety of the food products being sold on the market. Previous studies have shown that food fraud is associated with a large variety of food products and the fraud type may vary from deliberate changing of the food product (i


Introduction
The integrity of our food can be compromised by intentional food crime focused on deriving economic gain by manipulating food products in multiple ways, collectively referred to as food fraud (Manning & Soon, 2016). Food fraud may lead to food safety risks and may cause serious health problems to consumers (Mika et al., 2014; Pei et al., 2011; Stanciu, 2015). To combat food fraud, various approaches are applied, ranging from analytical testing to early warning systems, databases, and information exchange platforms (Bouzembrak & Marvin, 2016; Butler et al., 2021; Ulberth, 2020). An inquiry among representatives of the competent authorities of 20 EU Member States and from three European Commission services revealed that the development and implementation of early warning systems (EWS) was among the highest priorities to combat food fraud (Ulberth, 2020). The development of EWS is an important policy measure at individual business, national and international levels to mitigate against food fraud and to minimize potential public health impact (Manning & Kowalska, 2021). Media sources have been shown to provide early warning of food safety and food fraud incidents (Zhu et al., 2019).
Various databases that curate food fraud incidents have been developed, such as the European Union (EU) Rapid Alert System for Food and Feed (RASFF), HorizonScan, EMA, and MedISys-FF. These have the potential for trend analysis and to function as data repositories that can provide signals that inform EWS (Bouzembrak et al., 2018; Rortais et al., 2010; Ulberth, 2020). In addition, in Europe, a monthly overview of food fraud cases, as collected from the media, is reported by the European Commission's Joint Research Centre, and the EU Food Fraud Network publishes a yearly overview of food fraud cases considered by this network (Ulberth, 2020). Following its recent update, the online EU database RASFF no longer publishes open notifications on food fraud, making the MedISys-FF system the only publicly available tool that reports this information online, free of charge, together with information on current food fraud incidents in Europe and beyond.
Previously reported research has used a data mining approach with MedISys to gather data on food and feed-borne hazards (Rortais et al., 2010) and food fraud (Bouzembrak et al., 2018). In this study, the food fraud articles collected by MedISys-FF over a 6-year time frame (2015-2020) were analyzed to determine trends according to food product, country and region. Striking trends were observed for different food products, and differences were seen between countries and regions. This study confirms earlier suggestions that MedISys-FF is a suitable tool for providing early warning signals which allow actors in the food system to decide when and where to focus their checks to combat illegal practices associated with food and to ensure it is safe to eat.

MedISys-FF tool
The food fraud dataset presented in this article was created using the MedISys-FF tool developed by Bouzembrak and colleagues (Bouzembrak et al., 2018). This tool uses the MedISys portal of the European Media Monitor (EMM), a system that uses text mining to collect media articles worldwide. To find articles about food fraud, the MedISys-FF tool utilizes a list of 531 food fraud related keywords. These keywords have been carefully selected based on the existing scientific literature and other previously available food fraud databases (e.g., RASFF, EMA), after which they were validated by experts and then translated into 8 different languages. The performance of these keywords in filtering out media articles specifically about food fraud was then assessed, and iterative improvements to these keywords were made until a stable level of 80% relevant articles was reached.
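The keyword-based filtering described above can be illustrated with a minimal sketch. It is not the MedISys-FF implementation: the three keywords, the example headlines and the helper names (`compile_keywords`, `matches`, `share_relevant`) are hypothetical, and the real filter uses 531 keywords in 8 languages. The sketch only shows the general idea of matching article text against a keyword list and then measuring the share of collected articles that an expert marks as relevant.

```python
import re

# Hypothetical keyword list; the real MedISys-FF filter uses 531 keywords in 8 languages.
KEYWORDS = ["fake food", "counterfeit coffee", "food fraud"]


def compile_keywords(keywords):
    """Build one case-insensitive pattern that matches any keyword as a whole phrase."""
    escaped = sorted((re.escape(k) for k in keywords), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def matches(pattern, text):
    """Return the sorted, lower-cased keywords found in the text."""
    return sorted({m.group(0).lower() for m in pattern.finditer(text)})


def share_relevant(labels):
    """Fraction of collected articles that an expert judged relevant (True)."""
    return sum(labels) / len(labels) if labels else 0.0


pattern = compile_keywords(KEYWORDS)

# Invented headlines, for illustration only.
articles = [
    "Police seize counterfeit coffee in port raid",
    "Fake food displays brighten restaurant windows",
    "Local team wins cup final",
]

# Articles containing at least one keyword are collected by the filter.
hits = [a for a in articles if matches(pattern, a)]
```

The second headline is collected even though it concerns display models rather than fraud, which is why a keyword match alone cannot establish relevance and expert assessment of the collected articles is still needed.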

Automatic data collection
The MedISys-FF tool has been automatically collecting media reports from September 2014 onwards. Key information from each article is presented on the MedISys filter, such as a hyperlink to the original article, the country and date of publication, the title, and a short summary of the article. This information is only temporarily available (several days) on the EMM infrastructure; to make analysis over longer time periods possible, it is automatically stored in a database on a local server at WFSR. This database thus contains key information from all the media articles collected from the moment the MedISys-FF tool became operational in September 2014.

Expert classification
All the media reports collected by the MedISys-FF tool in this study were first assessed by an expert on food fraud. In addition, duplicates (i.e., 83 duplicates out of 9508 articles) were removed. The outcome was further discussed with 2 other food fraud experts. The expert first determined which articles were correct hits and relevant, and subsequently determined the fraud type(s), the product(s) involved, the country and the region. Correct hits are media articles in which the keywords have the intended meaning. For example, "fake food" is one of the keywords, and a media report in which this keyword is used in the context of dilution or substitution of food is a correct hit; see Spink & Moyer (2011) for a wider categorization of food fraud. Key types of fraud that were used to construct MedISys-FF (Bouzembrak et al., 2018) are synthesized in Table 1.
Media articles in which fake food refers to plastic foods used for display purposes (for example in restaurants) or reports on fraudulent food poisoning claims (food poisoning claims that were faked) were classified as incorrect hits. Media articles were defined as relevant when 1) determined as a correct hit, and 2) the article discussed a food fraud incident over the time frame of the search. Media articles on, for example, a conference about food fraud or unproven allegations of food fraud, were not considered relevant. Finally, for the articles determined as both correct hits and relevant, the type(s) of fraud and the product(s) involved were classified. The information from the MedISys-FF tool and the classifications added by the expert were subsequently stored in the database as well. Together this constructs the food fraud dataset presented and critiqued in this article.
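The two-step rule for relevance can be written down compactly. The following is only a sketch of the decision logic as described above; in the study the judgements were made by human experts, and the class and field names used here (`Assessment`, `correct_hit`, `reports_incident_in_period`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    """Expert judgement for one media article (hypothetical field names)."""
    correct_hit: bool                 # keyword is used with its intended food fraud meaning
    reports_incident_in_period: bool  # article discusses a food fraud incident in the search period


def is_relevant(assessment):
    """An article is relevant only if it is a correct hit AND reports an incident."""
    return assessment.correct_hit and assessment.reports_incident_in_period


# Examples mirroring the cases named in the text.
plastic_display = Assessment(correct_hit=False, reports_incident_in_period=False)
fraud_conference = Assessment(correct_hit=True, reports_incident_in_period=False)
expired_meat_seizure = Assessment(correct_hit=True, reports_incident_in_period=True)
```

Only the third example would proceed to classification of fraud type and product.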
MedISys-FF classifies the country of publication by a country flag connected to the media report at the MedISys-FF website. However, MedISys-FF did not provide the country of publication for all articles. Based on the source, the country variable which MedISys-FF extrapolates from the URL, we were able to determine the country of publication for most of the articles with a missing country variable. For the remaining articles, we looked at locations mentioned in the title and summaries to make a judgement on where the article was published. Using this approach, we were able to determine the country of origin for all food fraud articles except 8, for which the country of publication was left as unknown. It should be noted here that the country of origin of the article does not necessarily correlate with the country of origin of the food, as a media report from a given country may be reporting on imported food. Nevertheless, the study of Bouzembrak et al. (2018) shows that more than 95% of food fraud reports reported food fraud in the country where it originated.
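The fallback order described above (provided country, then the source URL, then locations in the title and summary, otherwise unknown) can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: how MedISys-FF derives a country from a URL is not specified here, so the sketch assumes a simple lookup on the country-code top-level domain, and the mapping `TLD_TO_COUNTRY`, the record fields and the function names are all hypothetical. In the study, the final step was a manual judgement.

```python
from urllib.parse import urlparse

# Hypothetical mapping from country-code top-level domains to country names.
TLD_TO_COUNTRY = {
    "eg": "Egypt",
    "uk": "United Kingdom",
    "fr": "France",
    "sa": "Saudi Arabia",
}


def country_from_url(url):
    """Guess the country from the last label of the host name, if it is a known ccTLD."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    return TLD_TO_COUNTRY.get(tld)


def country_from_text(text, known_countries):
    """Return the first known country name mentioned in the text, if any."""
    lowered = text.lower()
    for country in known_countries:
        if country.lower() in lowered:
            return country
    return None


def resolve_country(record):
    """Apply the fallbacks in order: given country, URL, title and summary, else 'unknown'."""
    text = record.get("title", "") + " " + record.get("summary", "")
    return (
        record.get("country")
        or country_from_url(record.get("url", ""))
        or country_from_text(text, TLD_TO_COUNTRY.values())
        or "unknown"
    )
```

A generic domain such as `.com` carries no country information, which is one reason a text-based or manual fallback remains necessary.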

Network visualization
The titles and descriptions of all food fraud media articles were translated into English using Google Chrome's automatic translation tool. The relevant articles were analyzed using the bibliometric networks method "network visualization". Network visualization has been shown to be a powerful approach to analyze a large variety of bibliometric networks, ranging from networks of citation relations between publications to networks of co-occurrence relations between words in a text (van Eck & Waltman, 2017). The network visualizations were built using VOSviewer, an open access tool for network analysis (see Bouzembrak et al., 2019, for a similar approach), and only the top 50 terms that were mentioned at least 50 times are presented. Furthermore, a network visualization (van Eck & Waltman, 2017) was developed for each of the most frequently identified fraudulent products (namely meat, milk, poultry, and wine) to determine the topics that are discussed about these products in the relevant food fraud articles. Thematic analysis is a method for identifying, analyzing, and reporting patterns (themes) within data (Braun & Clarke, 2006). The data was subjected to an iterative thematic analysis by compiling, disassembling, reassembling, interpreting, and concluding the analyzed data in order to draw out the findings and provide meaning (Castleberry & Nolen, 2018).

Table 1
Categories of food fraud (RASFF)
EMA database This category contains food fraud notifications classified in 6 different fraud types: (i) improper, fraudulent, missing, or absent health certificates, (ii) illegal importation, (iii) tampering, (iv) improper, expired, fraudulent or missing common entry documents (CED) or import declarations, (v) expiration date, and (vi) mislabeling.
H.J.P. Marvin et al.

Results and discussion
Within the MedISys portal of the European Media Monitor (EMM), a filter (MedISys-FF) was created in 2014 to collect publications in the media world-wide on food fraud in 8 different languages (Bouzembrak et al., 2018). The efficiency of the filter was assessed for the period 2014-2015 and it was concluded that this filter complemented other systems (i.e., RASFF, EMA and HorizonScan) and provided useful additional information on food fraud for quality managers and food safety authorities to inform their control programs (Bouzembrak et al., 2018). In the present study, which builds on the previous study, we assessed the performance of MedISys-FF since it was launched (i.e., from 2015 to 2020) and analyzed the content of the articles collected.
In the period 2015-2020, 9508 articles were collected, of which, following screening, 7416 were determined to be correct hits concerned with food fraud. A large portion of these publications (n = 4375) reported food fraud incidents, which was the target of this study. These reports are all considered relevant articles. The number of relevant articles is lower because the collected articles do not always address food fraud as intended in this study, even though they fit the filter design. The performance of the filter across the wider time frame is similar to that reported for the first year, i.e., 74% and 58% for correct hits and relevant articles, respectively (Bouzembrak et al., 2018). Although the filter contained keywords in 8 different languages, it collected correct hits in 41 different languages; the main languages of the articles were English (40.7%) and Arabic (40.6%), followed by French (5.1%), Spanish (4.8%), German (2.8%) and Portuguese (1.95%). The articles were published in 164 different countries, but the largest shares came from Egypt (14.2%), followed by the United States (10.6%), the United Kingdom (7.9%), Saudi Arabia (5.4%) and France (4.1%). When considering only articles addressing cases of food fraud (i.e., relevant articles), the largest shares came from Egypt (25.8%), Saudi Arabia (9.26%), Yemen (6.9%), Iraq (3.9%) and the United States (3.75%) (Fig. 1).
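For readers who wish to relate the counts above to percentages, the short calculation below derives the shares from the three reported counts. It is offered as a worked illustration only: the text does not state which denominator underlies the reported percentages, so all three plausible ratios are shown, and the variable and function names are chosen here for illustration.

```python
# Counts reported for 2015-2020.
collected = 9508      # articles collected by the filter
correct_hits = 7416   # articles in which the keywords had the intended meaning
relevant = 4375       # correct hits that report a food fraud incident


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


share_correct = pct(correct_hits, collected)            # 78.0
share_relevant_of_hits = pct(relevant, correct_hits)    # 59.0
share_relevant_of_all = pct(relevant, collected)        # 46.0

print(share_correct, share_relevant_of_hits, share_relevant_of_all)
```

Under this reading, the share of correct hits and the share of relevant articles among correct hits are close to the first-year figures of 74% and 58% cited above.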
The very high number of reports in Egypt is puzzling. The majority of the reports deal with meat and meat products (41.7%), followed by poultry meat products (11.8%), and most of the cases concerned the expiration date (83%). The reason for such high media attention in Egypt needs further investigation. On average, 726 relevant articles were collated per year, but differences between years were observed, with the highest number of food fraud articles in 2017 (n = 1118). This elevated level was caused by multiple incidents, and only a small proportion of the articles collected (n = 20) arose from the European Fipronil issue that occurred in that year (Nayak et al., 2022). Multiple articles discuss several products and types of fraud; hence, more food fraud cases are reported than there are articles. The products mentioned (n = 5022) are collated in Table 2.
Fraud with meat and meat products was most frequently reported (27.7%), followed by milk and milk products (10.5%), cereal and bakery products (8.3%), fish and fish products (7.7%) and poultry meat and poultry meat products (7.6%), which is similar to the findings reported earlier for the first year of MedISys-FF (Bouzembrak et al., 2018). A high level of fraud with meat and meat products has also been observed in an earlier study in which a holistic Bayesian network model was developed to predict food fraud from 1393 cases and 15 different data sources (Marvin et al., 2016). A possible explanation for this might be that meat products are highly vulnerable to fraud due to their high nutritional and market values combined with the many opportunities for fraud within the complex supply chain. Nowadays, meat is processed along the supply chain into various value-added products (e.g., sausages, burger patties, and deli meats), which has increased its vulnerability to fraudulent activities. For example, food processing techniques such as mixing or grinding, often applied to meat, can make it easier to manipulate these products (Chuah et al., 2016; Esteki et al., 2019; Lianou et al., 2021; Zhang & Xue, 2016).
Most fraud incidents were related to product sold past its expiry date (58.3%), followed by product tampering (22.2%) and mislabeling of country of origin (11.4%). Fig. 2 shows the variation of the type of food fraud over the years analyzed; generally, the order of reporting frequency remains the same. For all years, most of the food fraud cases reported in the media reflect problems with product expiration dates and tampering.
The other food fraud issues reported were absence of or inappropriate common entry document (CED) or health certificate (HC), illegal importation, incorrect origin labeling, tampering, theft, and resale. When analyzing the type of food fraud cases in the five countries with the greatest number of food fraud cases in the dataset, a high degree of similarity is observed. This is mainly because four of these countries are from the same region of the world (North Africa and the Middle East). Fraud in meat and meat products ranks first in four of these countries. Country differences are apparent, however, especially between different global regions. For instance, in France, Italy and the United Kingdom, fraud with wine and alcoholic beverages is more prominent in the dataset than in Middle Eastern countries. This seems logical due to cultural and religious differences between these two regions in alcohol consumption and in whether illicit activity with alcohol would even be reported (Manning & Kowalska, 2021). Many trends can be seen in the collected data which vary per region and country. For example, the global trend of reports on fraud with fruit and vegetables shows a steady increase from 30 articles in 2015 to 81 in 2020, but differences are visible between regions (see Fig. 3).
Fig. 1. Number of relevant food fraud reports per country.
Food fraud may lead to food safety risks and may cause serious health problems to consumers. Consuming food whose expiration date has been falsified may cause illness due to spoilage (pathogen contamination), but other fraud types may also cause illness. Some examples that have been collected by MedISys-FF are shown in Table 3.
To obtain an overview of the associations between the characteristics of parameters related to food fraud and the framing of the supply chain and associated governance structures in the reported media in the dataset, all titles and descriptions of the reports were translated into English and used to generate a network visualization, which is shown in Fig. 4. In the network visualization (Fig. 4), terms are represented by their label and by a circle. The size of a circle indicates the number of times a term has been mentioned: the higher the frequency of a term in the text, the larger the label and the circle of the term. The lines between the words represent links, and the distance between them in the figure indicates the relatedness in terms of co-occurrence. The colour of a word is determined by the group to which the word belongs (van Eck & Waltman, 2017).
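The two quantities that drive such a visualization, term frequency (circle size) and pairwise co-occurrence (links), can be illustrated with a minimal sketch. This is not VOSviewer's algorithm, which additionally normalizes link strengths, lays out the map and clusters the terms; the function name `cooccurrence`, the vocabulary and the example sentences are hypothetical. Note that this simplified version counts the number of texts in which a term appears rather than the total number of mentions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs, vocabulary):
    """Count, per term, the texts containing it and, per term pair, the texts containing both."""
    term_freq = Counter()
    pair_freq = Counter()
    for doc in docs:
        tokens = set(doc.lower().split())
        # Terms from the vocabulary present in this text, in a fixed order
        # so that each pair is always stored as the same tuple.
        present = sorted(t for t in vocabulary if t in tokens)
        term_freq.update(present)
        pair_freq.update(combinations(present, 2))
    return term_freq, pair_freq


# Invented example texts, for illustration only.
docs = [
    "police seize fake vodka",
    "fake vodka kills",
    "police inspect milk",
]
vocabulary = {"police", "fake", "vodka", "milk"}

term_freq, pair_freq = cooccurrence(docs, vocabulary)
```

In this toy example, "fake" and "vodka" co-occur in two texts and would therefore be drawn closer together than "milk" and "police", which co-occur only once.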
In Fig. 4, four different groups of words were thematically analyzed, with municipality (i.e., green group), security (i.e., yellow group), supply (i.e., blue group), and product (i.e., red group) as the central words within these 4 groups. However, there is an integration between food governance and food supply within each group, for example product-authority in the red group and supply-governor in the blue group. Some aspects appear in multiple groups in the network visualization (e.g., municipality-trade, supply-internal trade), but the figure shows a wider systemic interaction. Focusing on the product word group (i.e., red group), the products clustered in that group are milk (i.e., large circle) followed by fish, oil, honey, egg and fake alcohol, and they are associated with entities impacted by or involved in the food fraud such as authorities, police, company, consumers and persons. Terms that co-occur a lot tend to be located close to each other, such as fish, oil and bottle (Supplement 1, Fig. S1). A more detailed figure of the red group is provided in the supplement section (Supplement 1, Fig. S1).
In addition, for each of the main fraudulent products (e.g., meat, milk, poultry, fish, honey, alcoholic beverage, wine and oils) a detailed network visualization was developed to determine the topics that are discussed related to food fraud in these articles. The analysis of the networks of the main fraudulent products resulted in the identification of several topics, showing various levels of relatedness to food fraud (Supplement 1). To illustrate this, the network of alcoholic beverage is highlighted (Fig. 5). Alcohol is known as one of the main fraudulent commodities, and adulterants used or produced, such as methanol, can cause serious public health problems (Manning & Kowalska, 2021).
The network analysis based on the frequency of word co-occurrence identified 10 major groups, with the main nodes (death linked to counterfeit vodka) situated in the red group and highly connected with nodes from the other groups. Important terms in the yellow group are related to methanol poisoning, the risk of which is higher in countries where there is an illicit trade in illegal alcohol and alcohol substitutes such as bath essence, bath lotion, and bath oil, which are consumed by individuals who are unable to access alcohol in other forms (Manning & Kowalska, 2021). The common authorities and actions taken (food governance) are reflected in the purple and pink groups. Thematic analysis of the dataset highlights the type of food fraud reported, such as counterfeit vodka, fake vodka, fake booze, fake bottle, fake alcohol distributor, fake label, and counterfeit foreign brand, clustered in the different groups. More details on the words found for alcoholic beverage and in each group are summarized in Supplement 1. These terms align very well with the issues related to adulterated alcohol as reported in the scientific literature (Manning & Kowalska, 2021), demonstrating the usefulness of such analysis.
The networks of the main fraudulent products (Supplement 1) were analyzed iteratively (Table 4), initially using the first-level codes from Fig. 4, governance and food supply, and the second-level codes municipality-security and supply-product for the eight food groups dairy, fish, meat, poultry, honey, alcoholic beverage, wine and oils. In the municipality-security code, three tertiary codes emerged: policy, guardian/perpetrator/victim and action. In the supply-product code, four tertiary codes emerged: control, fraud/food safety issue, location/country/country of publication/continent and traceability. Themes that did not fit into these codes were placed in an 'other' category. Guardians are the people at national, supply chain or individual business levels with the knowledge, skills and understanding to implement procedures to prevent food fraud operating in the municipality-security sphere (Spink et al., 2015). Examples include Europol, Civil Guard, Police or Authorities. Hurdles are the control system components that reduce opportunity for food fraud either as a deterrent, formal control or means of detection, e.g., tests, audits, product sampling (Spink et al., 2015). For several commodities there was a hurdle gap, i.e., no hurdles were iteratively derived from the media reports.

MedISys-FF as early warning tool
Previous studies concluded that the MedISys-FF system is a useful tool that complements other systems such as RASFF, EMA and HorizonScan (Bouzembrak et al., 2018) and is used by the JRC in their activity to coordinate food fraud detection and controls within the EU as a means of prevention (Rortais et al., 2021; Ulberth, 2020). Besides showing trends, which may help authorities and food industries to focus their control activities, the tool may also be used to find new problems at an early stage by picking up media discussion anywhere in the world when the food fraud incident occurs. We demonstrate this with the case of Fipronil in eggs. In addition, network analysis of words that are mentioned together may yield new insights from these publications. We demonstrate the latter using COVID-19 as a case.

Use case example COVID-19 and Fipronil
Recently, Brooks et al. (2021) suggested there was a potential impact of the COVID-19 pandemic on the frequency of food fraud resulting from an increased demand for food products, often associated with increased prices, providing an economic opportunity for fraudsters to bring their illicit goods onto the market.
To add to this discussion, we analyzed media articles on food fraud in our database that also mentioned COVID-19. The number of COVID-19 related articles collected was 53, starting in February 2020 and peaking in May 2020. To reflect the content being discussed in these articles, a network visualization of the text in the description section (i.e., the expert generated summary of the report) was prepared and the results are shown in Fig. 6. Nine word groups became apparent for COVID-19 as associated with food fraud and food safety, but besides meat and illicit alcohol no other specific vulnerable food products were mentioned. Fraud with illicit alcohol as detected by MedISys-FF was confirmed recently by Manning & Kowalska (2021). During COVID-19, misleading publications appeared on social media claiming that alcohol may prevent or cure COVID-19 infections, and methanol-adulterated alcohol was consumed, leading, as one example, to almost 300 people dying in Iran (Soltaninejad, 2020). Many articles collected by MedISys-FF related to COVID-19 warned about a potential increase in food fraud due to less inspection and testing by governments or due to increased food prices driven by COVID-19, making food fraud more attractive. These findings are in alignment with the expectations mentioned by Brooks et al. (2021), but interestingly the words in the network visualization (Fig. 6) are linked more to product-supply than to municipal-security. Hence, network analysis of COVID-19 related articles could show relations with food fraud products (e.g., alcohol and meat) and could therefore be seen as a signal that would need further attention from the controlling authorities.
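The kind of time profile reported above (first appearance and peak month of articles mentioning a term) can be obtained with a simple count per month. The sketch below is illustrative only: the records are invented and do not reproduce the 53 articles in the dataset, and the function name `monthly_counts` and the record layout (publication date, text) are assumptions.

```python
from collections import Counter
from datetime import date


def monthly_counts(records, term):
    """Count, per calendar month, the records whose text mentions the term."""
    counts = Counter()
    for published, text in records:
        if term.lower() in text.lower():
            counts[published.strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


# Invented records (publication date, summary text), for illustration only.
records = [
    (date(2020, 2, 20), "Fake sanitizer and food fraud concerns as COVID-19 spreads"),
    (date(2020, 5, 3), "COVID-19: methanol in illicit alcohol"),
    (date(2020, 5, 17), "Reduced inspections during COVID-19 raise meat fraud risk"),
    (date(2020, 5, 21), "Expired dairy products seized"),
]

counts = monthly_counts(records, "COVID-19")
first_month = min(counts)                 # earliest month with a mention
peak_month = max(counts, key=counts.get)  # month with the most mentions
```

Applied to the real database, the same count would reproduce the start and peak months stated in the text, provided the term is matched in all relevant spellings and languages.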
Besides these COVID-19-related food fraud publications, retrospective analysis of the articles related to another big food fraud incident, Fipronil in eggs, showed that the MedISys-FF system had also picked up these media publications when this crisis was apparent (i.e., 17 and 5 articles in August and September 2017, respectively), showing the potential of MedISys-FF as an EWS as suggested earlier (Bouzembrak et al., 2018; Rortais et al., 2021). This would only be of value within an EWS where the search terms were already known. The value as an EWS is reduced for novel or emergent food fraud issues where terms may not be identified in the search process.
Being able to translate media into the language used in the search is also important to identify issues early in the timescale of the incident.

MedISys-FF limitations
The European Media Monitor (EMM) infrastructure on which MedISys-FF is constructed collects publications from >6000 locations/websites, ranging from official websites such as those of authorities to newspapers and blogs. Food fraud incidents discussed in newspapers and blogs may not fully reflect the reality of the market. However, a comparison with official food fraud reports in RASFF, EMA and HorizonScan revealed great similarity between MedISys-FF and these databases, although some differences were also observed (Bouzembrak et al., 2018).
The keywords used by MedISys-FF to find media reports on food fraud were carefully selected based on scientific literature and expert consultation (Bouzembrak et al., 2018). The keywords include known fraud types linked to a product (e.g., counterfeit coffee) but also generic terms such as intentional substitution of food, fake food, etc. (Bouzembrak et al., 2018). Nevertheless, it is clear that the design of the filter determines the type of publication retrieved, and there is room for future improvement. Interestingly in this regard, the filter picked up COVID-19 related food fraud without COVID-19 being part of the keyword setting.
It is apparent that the highest number of food fraud reports collected from the media came from Egypt. This was also the case at the development of the filter (Bouzembrak et al., 2018), and has continued over the following 5 years. The reasons for this are unclear. It may be that food fraud is a big concern in Egypt and that incidents are therefore reported more often. However, it may also be due to a bias of the filter which has not yet been determined. Further investigation is needed to clarify the reasons.

Conclusions
The food fraud filter in the European Media Monitor infrastructure, MedISys-FF, was assessed over a 6-year period (2015-2020) and it is concluded that the accuracy of finding relevant food fraud articles has remained similar to the accuracy when it was launched in 2015. However, the range of countries from which reports are found and the number of languages have increased considerably, adding complexity to the analysis. Many adulterated food commodities were reported, and these are mainly associated with problems with the expiry date and tampering. Many trends were observed which differ by country and region. Network analysis of words in the collected articles provides a good overview of the issues related to the fraud cases and allowed for iterative analysis and three-level coding of the words used in the media reports. This showed that mentions of hurdles for food fraud were limited, and absent for some food categories.
These findings demonstrate that MedISys-FF can be useful for regulators and the food industry to detect trends and, potentially, food fraud incidents occurring anywhere in the world, and thereby may support timely measures to articulate, analyze and amplify communication around food fraud incidents.
This article sheds some light on important developments of the global food fraud early warning system MedISys-FF, but it also raises several questions to address in future research. MedISys-FF is based mainly on global media records; enriching the database with other data sources and food fraud expert judgements will add value to the food fraud knowledge. Moreover, MedISys-FF data will be used to predict food fraud at the global level using a systems approach and machine learning.

Declaration of competing interest
Nothing declared.

Fig. 2. Frequency of the type of food fraud being reported in the dataset (2015-2020). CED: common entry document; HC: health certificate.

Fig. 3. Global frequency of food fraud in fruit and vegetables by continent as reported in the dataset (2015-2020).

Fig. 4. Network visualization of all relevant food fraud articles collected worldwide by MedISys-FF in the period 2015-2020.

Fig. 5. Network visualization of all relevant food fraud articles considering alcoholic beverages collected by MedISys-FF in the period 2015-2020.

Table 2
Overview of products mentioned in food fraud articles published world-wide (2015-2020).

Table 3
Examples of food fraud that lead to food safety risks.
Product | Cases
Fish | USA: "Restaurant in Florida sold fake tuna that in reality was escolar, a cheaper fish the FDA says can trigger food poisoning symptoms. A women was hospitalized. A further investigation found dozens of restaurants doing the same thing (selling tuna as escolar)". 2017-11-06.
Clams | Spain and Portugal: "Authorities in Spain and Portugal have uncovered what they are calling a criminal network involving contaminated clams that sickened up to 30 people". 2019-12-21.
Alcohol | Malaysia: "The recent incident where 19 people died due to alcohol poisoning is a national disaster. Fifty others are hospitalized with some of them being in serious condition". 2018-09-20. Russia: "55 people have been killed in Irkutsk region, Russia, with almost 26 others still lying in hospital bed without a chance of survival after they drank bath lotion or bath oil, masquerading as a safe alcoholic drink called Hawthorn". 2016-12-22.