Assessing gastronomic tourism using machine learning approach: The case of google review

This study aims to evaluate tourists' reviews of gastronomy tourism expressed in Google reviews according to the CAC model (Cognitive, Affective, and Conative), and to examine the inter-correlations between the CAC model components. The study was applied to traditional restaurants in downtown Amman. The research then extracts the main themes from the textual reviews as well as a sentiment score for the affective image of traditional downtown Amman restaurants. The results of the machine learning experiments suggest that the proposed approach can classify reviews of traditional restaurants in downtown Amman into CAC model components. The results also show that the Random Forest algorithm performed best in the cognitive and conative dimensions, whereas the Neural Network algorithm performed best in the affective dimension. The ML classifier revealed that most of the reviews were classified as cognitive (such as the type of food and services), while the remaining reviews were classified as affective (such as pleasure and arousal) and conative (such as intention to recommend and positive word of mouth), respectively. The highest-probability cognitive component was the traditional food topic, reflecting the unique image of Jordanian traditional food. The affective images formed by users were mainly positive emotions, indicating that the destination image spread well.


Gastronomy tourism
Making food has become a vital part of a nation's brand image and is an integral part of the history and character of many places. Gastronomy tourism offers the potential to revitalize and diversify the travel industry, spur regional economic expansion, involve numerous professional sectors, and introduce fresh uses to the primary sector. As a result, culinary tourism aids in the promotion and branding of destinations while also preserving and protecting regional diversity and customs and identifying and utilizing authenticity.
The UNWTO Committee on Tourism and Competitiveness (CTC) defines gastronomy tourism as a type of tourist activity distinguished by the visitor's encounter with food and related items and activities while traveling. In addition to authentic, traditional, and/or creative culinary experiences, gastronomy tourism may also involve associated activities like visiting local producers, taking part in food festivals, and taking cooking classes (World Tourism Organization website). According to Zain et al. (2018), Zelinsky described gastronomy tourism in 1985 as traditional travel to a particular location for consumption only at indigenous or local restaurants. However, they expanded the term and idea of gastronomy tourism to a purposeful, experimental involvement in foodways from an anthropological perspective. Foodways are the points where culture, tradition, and history meet with food (Rachão et al., 2019).
Many studies have demonstrated that an interest in food may strongly impact the motivation to travel to a destination (Dixit, 2019; Folgado-Fernández et al., 2017). According to Henderson (2009), spending on food and beverages while traveling accounts for around 25% of the overall travel cost. Food therefore significantly affects the tourism industry. Although some tourists might be curious to try the local cuisine, participating in food-related events is not their top priority. Food-related activities such as traditional restaurants and food festivals, on the other hand, particularly stimulate gourmet tourists and enhance the image of gastronomy tourism (Hall & Sharples, 2004; Wang, 2015). Consequently, several Destination Marketing Organizations (DMOs) have begun to promote local cuisine as a destination attraction (Dixit, 2019). Moreover, Hsu and Scott's (2020) research found that pleasant culinary experiences enhance the brand of a destination. A destination's brand is an identity that represents the distinctness and attractiveness of a location in the tourist environment, and much research indicates that destination image is an important aspect when marketing a destination. As a result of intense rivalry among destinations, DMOs have been compelled to adopt distinct techniques that generate unique experiences. Furthermore, destinations should emphasize their distinctive qualities in order to be chosen as a preferred vacation location, develop a successful destination image, and achieve sustainability with competitive advantages (Sio et al., 2021).

Destination image and User Generated Content (UGC)
Virtually all aspects of human activity depend on opinions and experiences, which also significantly influence how we behave. How other people view and interpret the world has a significant impact on our ideas, our perceptions of reality, and the decisions we make. So, when we need to plan, we frequently ask for other people's opinions (Serna et al., 2015). The key aspects of, as well as the flaws in, the gastronomy tourism experience can be understood by analyzing user-generated content (UGC), such as online reviews or "Google reviews," which can give tourism planners, business owners, destination marketing organizations, and the government crucial insights into how to enhance the quality of tourism goods and services in these tourist destinations (Liu et al., 2022).
Early literature has shown that visitors prefer UGC, for instance "Google reviews," over official sources, since such online interactions are driven more by personal experience and are less concerned with prospective business advantages (Pan & Li, 2011). This provides tourists with a more complete picture of the destination and its services. UGC and social networks have become great sources of information regarding the image of a destination. How a destination is seen, its goals, and its underlying discourses can be uncovered using a process called web discourse analysis. According to several studies, content analysis can aid individuals in understanding how locations, goods, and enterprises are positioned (Sun et al., 2021; Serna et al., 2015).
Despite the significance of tourism destination image (Martins, 2015), no widely acknowledged framework of destination image exists (Xu et al., 2018). However, the CAC model (cognitive, affective, and conative) has been the main focus of the majority of studies on destination image (Serna et al., 2015; Agapito et al., 2013; Xu et al., 2018; Sio et al., 2021). Since its introduction in 1994, the Gartner Cognitive-Affective-Conative Model has been widely utilized to evaluate destination images that show a person's awareness of a destination (cognitive), feelings toward it (affective), and actions and reactions to information (conative) (Sio et al., 2021; Liu et al., 2022; Xu et al., 2018). In terms of destination image construction and post-visit destination image, it is a hotly contested paradigm in the tourism literature. When making decisions about travel, customers can use the CAC model dimensions to create an overall global image that is more than the sum of its individual components (Liu et al., 2022). Affective images are ethereal and have more emotional components than cognitive images, which are physical (Michael et al., 2017). Affective destination images are a result of cognitive destination images, since the evaluative response depends on the customers' knowledge of the things being evaluated. As a result, their cognition might impact their positive or negative attitudes toward particular tourist spots. Customers' intentions to return to a location might be influenced by positive affective images of that destination. Pleasure, arousal, relaxation, excitement, and favorability are important components of affective images (Liu et al., 2022). The perceived value of a destination also influences a tourist's affective destination image, and both cognitive and affective destination images affect tourists' behaviors, according to Tosun et al. (2015).

Traditional food in Amman downtown
Amman's diverse population and architectural styles reflect the city's complicated history. The population of the city is multicultural and multireligious. A picture of the region's past may be seen in the vibrant souks, Roman ruins, cultural museums, and monuments that can be found even in the ultra-modern business zone (TripAdvisor). As a result, 85 percent of international visitors to Jordan come for cultural and historical tourism. Petra, Wadi Rum, the Dead Sea, and Amman are the key locations for this activity (MOTA, 2021). Amman's old center, known as "Al-Balad", is a popular tourist site in Jordan. It is notable for its rugged terrain and is situated in the valley of the old Amman River (Seil Amman). Furthermore, the city of Amman features a distinctive environment of hills and valleys linked by stairs and curvilinear pathways. It is now packed with stores, traditional restaurants, and historical sites. Amman's downtown is divided into seven neighborhoods, totaling 1,840 dunams. It was considered the heart of Amman's business district until the late 1970s or early 1980s (Jamhawi et al., 2015). Also, Amman's richness and diversity, notably its downtown, drew around 613,201 tourists in 2019, making it Jordan's most visited city (MOTA website).
Although downtown Amman is among the main destinations for cultural and historical tourism in Jordan, especially gastronomy tourism, there has been little research in this field. This paper is meant to bridge this gap as a first step. Additionally, this paper's goal is to present an additional assessment strategy that can be applied: 1) to evaluate tourists' emotions toward gastronomy tourism (traditional restaurants) expressed in Google reviews, 2) to discover sub-themes of the CAC model spread throughout tourists' Google reviews, and 3) to examine the inter-correlation between CAC model components. To accomplish these aims, the study compares four supervised machine learning models and selects the best-performing ML classifier. The research then extracts the main themes from the textual reviews as well as a sentiment score for the affective image of traditional downtown Amman restaurants. The research demonstrates how Google reviews may be used to uncover CAC model components and assess culinary tourism images.

Methodology
The major goal of this study is to determine how Google reviews reflect the cognitive, affective, and conative aspects of tourism destination image (TDI).
To achieve this, the study identifies an ideal method for analyzing Google Maps reviews (see Fig. 1). This research consists of six distinct primary phases: (1) Data Collection, (2) Data Preprocessing, (3) Ground Truth Data, (4) Data Vectorization, (5) Supervised ML models, and (6) Sentiment polarity analysis in Orange (3.29).

Data Collection
To determine the eligible traditional restaurants, a Google Maps search was first conducted using relevant keywords such as traditional restaurants, traditional cafes, local cafes, central Amman restaurants, and heritage restaurants. This yielded a total of 18 restaurants and cafes, with 31,733 reviews. Second, restaurants and cafes whose description is not traditional or not in downtown Amman were excluded, as well as cafes or restaurants with fewer than seven reviews or non-English reviews. As a result, 3430 reviews of 18 eligible restaurants were collected using the WebHarvy tool. Table 1 shows all restaurants and cafes included in the study and the corresponding number of reviews, ratings, and types.

Data Preprocessing
Next, the researchers carried out data pre-processing and filtering. The study applied transformation (i.e., converting characters to lowercase and removing URLs) and implemented tokenization and normalization. Moreover, the researchers removed punctuation, special characters, and extra whitespace. All of this was performed in the Orange 3.29 data mining software. After this preprocessing, the total number of reviews was reduced to 3064.
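The transformation steps listed above can be illustrated outside Orange with a minimal Python sketch. This is not the Orange pipeline itself: the function name `preprocess`, the URL pattern, and the sample review are assumptions made for illustration, and normalization steps such as stemming or lemmatization are omitted.

```python
import re
import string

# Assumed URL pattern for illustration; Orange's own URL filter may differ.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")

# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(review: str) -> list[str]:
    """Lowercase, remove URLs and punctuation, collapse whitespace, tokenize."""
    text = review.lower()
    text = URL_PATTERN.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    # str.split() with no argument also collapses runs of whitespace.
    return text.split()


# Hypothetical review text, not taken from the study's dataset.
tokens = preprocess("Best FALAFEL in Amman!!!  See https://example.com/menu , highly recommended.")
print(tokens)
```

Splitting on whitespace is the simplest possible tokenizer; it treats an apostrophe as a word break, which may or may not match the tokenizer configured in Orange.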

Ground Truth Data
Ground truth data is "data collected at scale from real-world scenarios to train algorithms on contextual information such as verbal speech, natural language text, human gestures and behaviors, and spatial orientation" (Q analysis, 2021). Based on this, the researchers prepared a ground truth dataset by annotating the reviews as either Cognitive, Affective, or Conative. The ground truth dataset consists of 1000 documents (reviews) that were automatically coded by the topic modeling widget within Orange (3.29), with each review labeled "1" if it includes a given TDI dimension or "0" if it does not. The auto-coded results were also sent to an "Amazon SageMaker Ground Truth Text Classifier" to sort the reviews into the labels predefined by the topic modeling widget. Amazon SageMaker did not produce significantly different classifications: the inter-rater agreement between the two classifications was 89.6%. The criteria in Table 2 were applied to automatically annotate the reviews. Table 3 presents the number of reviews per theme after annotation in the training data.
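The text does not state how the agreement between the two codings was computed. As a hedged illustration only, the sketch below shows two common choices, simple percent agreement and Cohen's kappa (which corrects for chance agreement), on hypothetical binary labels. The function names and the label vectors are assumptions, not the study's data.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Share of items on which the two codings assign the same label."""
    matches = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(labels_a)
    observed = percent_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each coder's marginal label proportions.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical labels: 1 = review includes the dimension, 0 = it does not.
topic_model_labels = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
second_coder_labels = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]

print(percent_agreement(topic_model_labels, second_coder_labels))
print(cohens_kappa(topic_model_labels, second_coder_labels))
```

Kappa is undefined when chance agreement equals 1 (both codings use a single label throughout), a case this sketch does not guard against.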

Data Vectorization
In the next phase, a Bag of Words (BOW) method was employed to extract unique words from the text corpus and vectorize each document (i.e., each review). Unigrams and routed documents were extracted using inverse document frequency (IDF) weighting, which considers both the frequency and the importance of concepts or words.
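To make the vectorization step concrete, the following is a minimal sketch of a unigram bag of words with IDF weighting. It assumes the textbook definition idf(t) = log(N / df(t)); the exact IDF variant used by Orange's Bag of Words widget is not stated in the text and may differ. The function names and the three toy documents are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return the sorted vocabulary and a document-by-term count matrix."""
    vocab = sorted({token for doc in docs for token in doc})
    counts = [Counter(doc) for doc in docs]
    matrix = [[count[term] for term in vocab] for count in counts]
    return vocab, matrix


def idf_weights(vocab, docs):
    """Textbook IDF: log(N / df), where df is the number of docs containing the term."""
    n_docs = len(docs)
    doc_sets = [set(doc) for doc in docs]
    return {
        term: math.log(n_docs / sum(1 for doc in doc_sets if term in doc))
        for term in vocab
    }


def tfidf_matrix(docs):
    """Weight each raw term count by the term's IDF."""
    vocab, counts = bag_of_words(docs)
    idf = idf_weights(vocab, docs)
    weighted = [[tf * idf[term] for tf, term in zip(row, vocab)] for row in counts]
    return vocab, weighted


# Hypothetical tokenized reviews, not taken from the study's corpus.
docs = [["tasty", "falafel"], ["cheap", "falafel"], ["friendly", "service"]]
vocab, weighted = tfidf_matrix(docs)
print(vocab)
for row in weighted:
    print([round(value, 3) for value in row])
```

A term that appears in many reviews (here "falafel") receives a lower weight than a term that appears in only one, which is the sense in which IDF reflects the importance of a word.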

Supervised ML models
Machine learning (ML) models were developed to categorize customer reviews into the three TDI dimensions and their sub-dimensions, as our goal is to identify the components of TDI (Cognitive, Affective, and Conative) in Google reviews of traditional restaurants. Our method is consistent with existing research that defines TDI as "a multidimensional overall impression formed by distinctly different but interrelated components, namely cognitive, affective, and conative" (Sanz, et al. 2016).
To achieve a higher degree of credibility, this paper applied four different classifiers based on supervised machine learning algorithms that are used to solve text classification problems, and used them to explore the experiences of tourists in the restaurants and cafes of downtown Amman: (1) a Neural Network based on a multi-layer perceptron (MLP) algorithm with backpropagation; (2) the logistic regression classification algorithm with LASSO (L1) or ridge (L2) regularization; (3) the kNN algorithm, which searches for the k closest training examples in feature space and uses their average as a prediction (Orange 3, 2021); and (4) Random Forest, an ensemble learning method used for classification (Breiman, 2001).
The researchers also applied under-sampling techniques, which remove examples belonging to the majority class from the training dataset to better balance the class distribution, for example reducing the skew from 1:100 to 1:10, 1:2, or even 1:1 (Brownlee, 2020). Under-sampling was applied using the imbalanced-learn v0.8.0 program code (Guillaume et al., 2017) to deal with the imbalance in the training dataset. Table 4 indicates the number of reviews for each TDI dimension before and after balancing the training set. Because the training dataset is balanced, the chance level for binary classification is 50%.
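The text does not say which under-sampling strategy from imbalanced-learn was used. As an illustration under that caveat, the sketch below implements plain random under-sampling to a 1:1 ratio in pure Python; it is a stand-in for the idea, not a reproduction of the library's code. The function name, the seed, and the toy data are assumptions.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop majority-class examples until all classes have equal size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # Every class is reduced to the size of the smallest class.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced


# Hypothetical 8:2 imbalanced training set (review ids with binary labels).
review_ids = list(range(10))
labels = [0] * 8 + [1] * 2
balanced = random_undersample(review_ids, labels)
print(balanced)
```

With a 1:1 class ratio, a classifier that guesses at random is right about half the time, which is why 50% serves as the chance baseline for each binary TDI dimension.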

Sentiment analysis
Sentiment analysis predicts a sentiment for each document in a corpus. It makes use of the Data Science Lab's multilingual sentiment lexicons as well as NLTK's Liu & Hu and Vader sentiment modules, all of which are lexicon-based. Multilingual sentiment supports several languages, whereas Vader works only in English. Because all reviews are in English, this paper used Vader sentiment (Hutto & Gilbert, 2014).
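To show what "lexicon-based" means in practice, the following toy sketch sums word valences from a small lexicon and squashes the total into the range (-1, 1) with the normalization used for Vader's compound score, x / sqrt(x^2 + 15). It is a deliberate simplification: the lexicon entries and their valences are hypothetical rather than Vader's, and Vader's handling of negation, intensifiers, punctuation, and capitalization is omitted. The ±0.05 cut-offs are the conventional thresholds for labeling a compound score.

```python
import math

# Hypothetical valences for illustration; NOT the actual Vader lexicon values.
TOY_LEXICON = {
    "great": 3.0,
    "tasty": 2.0,
    "friendly": 2.0,
    "nasty": -2.5,
    "crowded": -0.5,
}


def compound_score(tokens, lexicon=TOY_LEXICON, alpha=15.0):
    """Sum word valences, then normalize to (-1, 1) as x / sqrt(x^2 + alpha)."""
    raw = sum(lexicon.get(token, 0.0) for token in tokens)
    return raw / math.sqrt(raw * raw + alpha)


def polarity(compound, threshold=0.05):
    """Map a compound score to a polarity label using symmetric cut-offs."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical tokenized reviews.
for review in (["great", "tasty", "food"], ["hummus", "was", "nasty"], ["open", "daily"]):
    score = compound_score(review)
    print(review, round(score, 3), polarity(score))
```

Words missing from the lexicon contribute nothing, so a review with no sentiment-bearing words scores zero and is labeled neutral.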

Results and discussion
To evaluate and validate the ML classifiers that predict the TDI dimensions (cognitive, affective, and conative), the study uses the ROC curve along with ML model performance measures.

Validity of classifiers of TDI components
The Receiver Operating Characteristic curve (ROC curve) is a plot with the True Positive Rate (TPR), or sensitivity, on the y-axis and the False Positive Rate (FPR), or 1-specificity, on the x-axis, as shown in Fig. 2. In the ROC curve, the best-performing classifiers are those whose curves are closest to the upper left corner. To compare different classifiers or measures of predictive accuracy, however, it can be useful to summarize the performance of each classifier in a single measure, the area under the ROC curve (AUC). As shown in Fig. 2, the Random Forest and Neural Network classifiers provide the largest area under the ROC curve compared to the other classifiers, and they are excellent classifiers according to Mandrekar (2010).
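As a brief illustration of the single-number summary, the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes AUC through this pairwise formulation; the function name and the toy labels and scores are assumptions, and Orange computes the measure internally in its own way.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = review belongs to the dimension) and classifier scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(roc_auc(y_true, scores))  # 8 of the 9 positive-negative pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of examples, which is acceptable for a sketch but slower than the sort-based computation used in practice.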

TDI classification
After ensuring the validity of each classifier, the researchers applied the best-performing ML classifier for each TDI dimension (that is, Random Forest to predict the cognitive and conative themes) to classify the 2042 reviews that were not labeled.
Based on the prediction results and the ground truth dataset, 77.8% of the reviews were classified as cognitive, while 15.1% and 0.07% of the reviews were classified as affective and conative, respectively. Further, Table 6 shows the most important words or reviews that have the highest probability of belonging to each theme. For instance, the type of food, the services, and the place had the highest probabilities of predicting the cognitive theme, while pleasure and arousal had the highest probabilities of predicting the affective theme. Finally, intention to recommend and positive word of mouth had the highest probabilities of predicting the conative theme.

Cognitive image
To examine the cognitive themes in more detail, the researchers employed topic modeling with "Latent Dirichlet Allocation" to find abstract topics in the corpus based on clusters of terms discovered in each text and their corresponding frequencies. The aspects inside the cognitive image of food were also displayed using multidimensional scaling (MDS), as indicated in Table 7 and Fig. 3. As shown in Table 7, the topic modeling widget clusters the cognitive food image into five main topics according to marginal topic probability, namely traditional food, Amman atmosphere, restaurant service, traditional place, and pricing. Regarding the marginal topic probability of the cognitive components, the highest probability goes to the traditional food topic (21.1%), and this high probability comes from words such as Jordanian food, falafel, hummus, and Shisha. These words exceeded 15% in frequency weight, reflecting the unique image of Jordanian traditional food.
In addition, the topics of catering and pricing come second in the cognitive food image, with a probability of 20% for each topic. For the catering topic, many catering terms appeared frequently, and the terms with the highest weights were "Friendly" (28.9%), "Great Service" (22.8%), and "Fast Service" (19.6%). In the pricing topic, words like "Low price" (29.3%), "Expensive" (21.4%), "cheap" (17.6%), and "average price" (15.4%) have the most importance, which means that pricing policy is an important topic in creating the cognitive food image from the tourists' perspective.
As shown in Table 7, besides food flavor and catering, the tourists' reviews mainly addressed restaurant atmosphere and traditional places, with terms such as "Atmosphere" (22.5%), "Old City" (18.2%), "Crowded place" (17.8%), and "Authentic rest." (16%). Tourists' interest in the tradition and history of the restaurant, the location of the restaurant, and the landscape around it is evidence that tourists care about more than just the authenticity of the food. To find a low-dimensional structure in the cognitive topics, multidimensional scaling (MDS) was used. As depicted in Fig. 3, three topics related to the cognitive food image appear closely related, namely traditional food, catering, and pricing. In other words, these topics represent one dimension, which we call "food and related service". The restaurant atmosphere and traditional place themes, however, did not show any significant association with each other. Thus, each of them forms an independent dimension within the cognitive food image.

Affective image
Based on the ML model prediction output, we used the sentiment analysis widget in the Orange 3 software to analyze the emotional attitude of tourists' reviews. As reported in Table 8, positive reviews account for the highest proportion (76%), while negative comments account for the lowest proportion (6.1%). Users' comments on Google Maps about restaurants evoked mostly pleasant feelings, suggesting that the destination's reputation spread well.

Table 8
Affective food image subcategories corresponding to marginal topic probability

To perform sentiment analysis, the researchers used the Orange3 Text Mining software to apply lexicon-based approaches with the Vader module, which yields a positive score, a negative score, a neutral score, and a compound (combined) score. They used the Heat Map widget to visualize the data. The heat map is also an important tool for discovering relevant features in the data. By removing some of the more pronounced features, new information that was hiding in the background will appear (Hutto and Gilbert, 2014). Before a heat map is used, however, the corpus should be pre-processed to delete meaningless modal particles, stop words, etc.
As shown in Fig. 4, the heat map uses k-means to merge restaurant comments on Google Maps with the same polarity into one line. It then uses clustering by rows to create a clustered visualization in which similar comments are grouped. Furthermore, Fig. 4 provides the color scheme legend. Low and High are thresholds for the color palette ("blue" for negative emotion at -62.5% and "red" for positive emotion at 95.8%). The heat map also uses a diverging palette, which has two extreme colors and a neutral color (white) at the midpoint; the meaningful midpoint value (0 by default) here represents neutral emotion.
It can be seen from Fig. 4 that the high-weight words of positive emotions are mainly focused on two elements, "traditional place" and "traditional food": first, the perfect place to enjoy a lovely meal and chat with friends; second, many restaurants in downtown Amman offer traditional food, and the tourists love it.
In contrast, the words of great importance for negative emotions are mainly concentrated in two components, "traditional food" and "negative word of mouth": first, "Humus was nasty"; second, "it's not like it used to be" and "Low class".

Conative image
As shown in Table 9, the intention to recommend topic has the highest probability in the conative food image with 58.5%, and words like "recommended", "highly recommended", or "must go" have the highest weight in this topic. An example from the reviews for the recommendation topic is "Jafra one of the best places you could be in, highly recommend".
In line with this framework, the conative food image includes re-visit attributes: "visit", "downtown Amman", and "location" were keywords in tourist reviews about the intent to re-visit, based on word weights (12.1% to 18.3%).
It is also noted from Table 9 that, on the topic of "positive word of mouth", words such as "Experience" (14.1%), "Great" (12.9%), and "Tasty" (11.8%) tell other tourists how good the products or services are. The multidimensional scaling (MDS) test could not find a higher dimension in the conative food image (Fig. 5). Therefore, the three subcategories explored through topic modeling remain the highest-level dimensions of the conative food image, namely intention to revisit, positive word of mouth, and intention to recommend.

Conclusion and implication
The findings of this study are provided in two parts. The first concentrates on the machine learning (ML) models that were constructed to classify tourist evaluations into the three aspects of the CAC model. At this phase, the ML models are tested using ground truth data based on the destination image components (Cognitive, Affective, and Conative). The second part concerns the classification of the CAC model dimensions into sub-dimensions using the topic modeling technique known as "Latent Dirichlet Allocation" to uncover abstract topics in the corpus.
The results of the machine learning experiments suggest that the proposed approach can classify reviews of traditional restaurants in downtown Amman into CAC model components. The results also show that the Random Forest algorithm performed best in the cognitive and conative dimensions, whereas the Neural Network algorithm performed best in the affective dimension.
The results imply that different destination image components can be derived from the image attributes expressed by Google reviewers. While travelers pay the most attention to the cognitive aspect of a location, reviewers evaluate only a few destination features.
Although the most common cognitive image components among Google reviewers have already been discovered, the significance of the various sub-components appears to vary in this study. Previous studies on how people use social media to find out about travel options online have shown that some terms (like nightlife and restaurants) are more likely to produce social media search results than others (like attractions). Furthermore, Gretzel argues that online communities have a strong connection to "core" tourism businesses like attractions, activities, and lodging, while consumer review sites like Google Reviews have a strong connection to social networking, hotels, and shopping, and blogs and photo/video sharing sites have a weak connection to events, nightlife, and parks (Kladou & Mavragani, 2015). According to the analysis of Google reviews of Amman's historical restaurants, reviewers deem the cognitive components of traditional food, restaurant setting, catering, traditional place, and pricing significant enough to be essential. The component with the most mentions was the qualities of traditional cuisine. The reviewers' concentration on such characteristics was perhaps justified by the fact that they were reviewing the attraction labeled "traditional food in Amman downtown."
In the marketing literature, affective linkages exhibited through emotional judgments are referred to as attitudes toward commodities. Additionally, the various attitudes that the customer forms toward the characteristics of the product are compensatory, so a negative attitude toward one feature may be offset by positive feelings about others, and vice versa (Kladou & Mavragani, 2015). By balancing these attitude combinations, a customer creates an overall attitude toward a product (Leisen, 2001). Similarly, a particular tourist destination may have natural attractions, traditional places, traditional food, and other aspects (San Martin & Rodriguez del Bosque, 2008). Considering that the overall attitude toward a destination is determined by the "balanced" outcome of a perceived experience and the considered relevance of the destination attributes, it is reasonable to assume that online reviewers are generally favorable about their visit to downtown Amman. Moreover, gastronomy serves as a significant incentive for travelers to experience cultural heritage by transforming culinary delights into deciding factors for destination selection and, at the same time, serving as a major driver of traveler happiness (Mora et al., 2021).
From an academic standpoint, this work is noteworthy largely in terms of methodology. According to Zeng and Gerritsen (2014), social media sources should be used carefully in data collection and analysis for research. Despite the significance of understanding tourists' perceptions of a location and the growing significance of online information sources and social media, research analyzing destination images in an online context is scarce (Kladou & Mavragani, 2015). As a result, this study applies a previously established destination image framework by reading and categorizing genuine Google reviews.
The present study contributes to the body of knowledge by examining the three image components included in Google reviews by individuals who choose to share their opinions with potential tourists, and it emphasizes the critical significance of the three destination image components. The main goal of this investigation was to focus on the cognitive, affective, and conative image elements as they manifest in Google reviews. The study also sought to provide qualitative data that would be helpful to academics and practitioners alike, as well as to critically examine the reviews.
Previous research has emphasized the value of capturing the "niche" image held by only a small number of travelers, particularly in the context of internet marketing (Pan & Li, 2011). The variety of comments in Google reviews, on the other hand, underlines the relevance of more general destination food and the overall restaurant environment. Tourists, for example, may remark more on cognitive elements than on other features, but their opinions encompass a wide range of traits (e.g., pricing, traditional food, and atmosphere). Thus, destinations are assessed in terms of "gastronomy," and a favorable attitude toward a destination appears to be linked to more than one feature (Hanna et al., 2020).
The findings of this study have significant implications for marketing strategies that consider travelers' perspectives, affect how local cuisine is seen in destinations, and pinpoint the factors that motivate visitors to return to downtown Amman. This is one of the earliest empirical studies of the CAC model of tourists' perceptions of local cuisine in Amman, Jordan. It also contributes to the limited empirical research on the image of gastronomy and its application to traditional places. It was determined that the constituent aspects of a conative gastronomy image were "intention to revisit," "positive word of mouth," and "intention to recommend." Most research, according to Choe and Kim (2018), has concentrated on potential visitors' views of the destination image during the pre-travel stage. According to Ryu and Jang (2006), exceptional culinary experiences at a tourist site can boost the national cuisine image, raise visitor satisfaction, and motivate tourists to return. The current study's theoretical and empirical findings indicate that attitudes toward local food influence the destination food image, which is consistent with previous research (Choe & Kim, 2018). In keeping with the findings of earlier studies, and taking downtown Amman as a culinary destination, this research suggests that delicious local cuisine from Jordan might affect how visitors perceive the area. Because it serves as a vital symbol of a tourist destination, food may be an essential topic in marketing a nation's tourist attractions (Seo et al., 2017).

Limitations and Future Research
Although this study adds to the literature on gastronomic tourism in downtown Amman, it has certain shortcomings that may be addressed in future studies. First, this study relied solely on Google reviews, ignoring those from other travel sites such as TripAdvisor and Booking, which should be incorporated into future research. Second, the UGC gathered from Google reviews did not contain demographic information or gourmet tourist categories. Third, this study gathered online evaluations solely from overseas tourists; Jordanian tourists' online reviews were not included in the analysis. Fourth, the timing of the research is a limitation: to overcome any temporal biases, the study would need to be extended to all months of the year. Accordingly, an in-depth examination of Jordan's gourmet offering aimed at local and foreign tourists is proposed as a future path of research.
In conclusion, employing domain-oriented text mining algorithms to analyze user-generated data creates a new way to understand the destination as people talk about and view it. Future research will also incorporate more sources and languages to better understand their relationship to the perceived destination image.

Table 1
Restaurants and cafes within central Amman, the number of reviews, ratings, and types

Table 2
Criteria for Annotating Reviews Based on Customers Reviews in Ground Truth Data

Table 3
Thematic Analysis and the Corresponding Number of Reviews in Ground Truth Data

Table 4
Number of reviews for TDI dimensions before and after balancing the training set.
Table 5 demonstrates that all four machine learning classifiers outperformed the 50% chance baseline. The Random Forest algorithm achieved the best performance for the Cognitive and Conative dimensions according to the area under the curve (AUC) (85.1% and 87.3%, respectively) and F1 scores (0.821 and 0.923), whereas the Neural Network algorithm achieved the best performance in the Affective dimension according to AUC (88.5%) and F1 score (0.870). Additionally, Table 5 displays the breakdown of each ML algorithm's overall performance. The Random Forest and Neural Network algorithms achieved high precision (0.827-0.923) and recall (0.818-0.923) for all TDI dimensions. Thus, the Random Forest and Neural Network algorithms were able to correctly predict the themes of the reviews with a very low error rate.
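For readers less familiar with the measures reported in Table 5, the following minimal sketch computes precision, recall, and F1 for one binary TDI dimension from true and predicted labels. The function name and the toy label vectors are assumptions, and the text does not state whether the reported figures are per-class values or averages over classes.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = review belongs to the dimension, 0 = it does not.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
print(precision_recall_f1(y_true, y_pred))  # 3 true positives, 1 false positive, 1 false negative
```

Precision penalizes false positives and recall penalizes false negatives, so reporting both alongside F1 shows whether a classifier's errors are one-sided.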

Table 5
Performance of Classifiers ML algorithm predictors

Table 6
TDI classification based on best performance classifier

Table 7
Cognitive food image subcategories corresponding to a marginal topic probability

Table 9
Conative food image subcategories corresponding to marginal topic probability