An explainable ensemble of multi-view deep learning model for fake review detection

Online reviews significantly impact consumers who are purchasing or seeking services via the Internet. Businesses and review platforms need to manage these online reviews to avoid misleading customers through fake ones. This necessitates developing intelligent solutions to detect fake reviews and prevent their negative impact on businesses and customers. Therefore, many fake review detection models have been proposed to help distinguish fake reviews from genuine ones. However, these techniques depend on a limited perspective of features, mainly review content, to detect fake reviews, leading to poor performance in discovering new patterns of fake review content and the dynamic behaviour of spammers. There is therefore still a need for new solutions that detect these emerging patterns. Hence, this paper proposes an explainable multi-view deep learning model to identify fake reviews based on different feature perspectives and classifiers. The proposed model extracts essential features from different perspectives, including review content, reviewer data, and product description. Moreover, we employ an ensemble approach that combines three popular deep learning algorithms, Bi-LSTM, CNN, and DNN, to enhance the performance of the fake review detection model. Results on two real-life datasets demonstrate the efficiency of our proposed model, which outperformed the state-of-the-art methods with improvements ranging from 1% to 7% in terms of the AUC metric. To provide visibility into the outcomes of our proposed model and demonstrate the trust and transparency of the obtained results, we also offer a comprehensive explanation of the model results using the Shapley Additive Explanations (SHAP) method and attention techniques. The experimental results show that our proposed model can provide reasonable explanations that help users understand why specific reviews are classified as fake.


Introduction
Online reviews are becoming increasingly significant for product producers and customers. In e-commerce, positive or negative online reviews significantly affect customers' purchasing decisions and business development (Anderson and Magruder 2012, Luca 2016). These positive and negative reviews also give scammers great incentives to scam the system. According to an analysis using the ReviewMeta platform, the Washington Post reported that 50% of the 32,435 reviews of the top 10 Amazon Bluetooth headphones were fraudulent. Such false reviews, i.e. fake reviews, may damage the entire review system and ultimately cause a loss of consumer trust (Kauffmann et al., 2020). Based on information from the Bazaarvoice website, 50% of consumers stop shopping and lose their trust in products after they find fake reviews posted on them. Therefore, detecting these fake reviews has become a significant issue, arousing considerable research attention (Kauffmann et al., 2020).
Many fake review detection methods have been proposed during the last ten years. Most existing models rely heavily on machine learning- and deep learning-based algorithms, which have achieved excellent results in natural language processing (Garcia & Berton, 2021; Malla & Alphonse, 2021). Compared to traditional machine learning approaches, deep learning techniques effectively extract latent data representations, which may improve fake review detection models' performance, and they are widely used in the literature due to their ability to capture semantic meaning automatically (Zhang et al., 2018). However, existing models have key limitations. They use linguistic features or reviewer behaviour (Lim et al., 2010; Mukherjee et al., 2013) while ignoring product-level reviewer information when learning the semantic review representation, as well as the relationships among reviewer data, review content, and product description. They also depend heavily on a single classifier to distinguish fake reviews, which reduces the detection model's robustness and reliability against reviewers' dynamic behaviour (Jindal & Liu, 2007; Mohawesh et al., 2020). In addition, existing detection models are black boxes that cannot explain the logic behind their decisions about reviews (fake or not), lacking transparency and trust in their outcomes (Zhang et al., 2018; Albahri et al., 2023). Therefore, considering multiple views of features and classifiers would be a better solution for detecting fake reviews, and it is essential to adopt a responsible approach to machine learning- and deep learning-based detection models, using explainability techniques to overcome the limitations of trust and transparency in the decision process (Albahri et al., 2023). Further, while machine learning- or deep learning-based models can enhance decision-making for customers and businesses, determining confidence in their individual predictions is a significant problem; for example, such models cannot be trusted in sensitive areas like medicine, defence, and finance due to their complexity. Consequently, there is a need to explain machine learning models to make them more reliable and responsible, so that both customers and businesses have visibility into how a fake review detection model reached its final output and can therefore trust the model and its decisions. This is the focus of this research, which presents the first explainable fake review detection model.
The contributions of this paper can be summarised as follows:
1. We propose a multi-view feature approach that combines the implicit and explicit features of review content, reviewer data, and product description.
2. We propose a hybrid feature extraction method to extract the implicit and explicit features from review content. This hybrid method combines word- and sentence-level feature extraction with attention.
3. We propose an ensemble classifier that combines three popular deep learning algorithms based on different feature perspectives. We use a convolutional neural network (CNN) for reviewer information, a deep neural network (DNN) for the product level, and Bidirectional Long Short-Term Memory (Bi-LSTM) for the review level.
4. We use two explainable techniques, the SHAP method and the attention mechanism, to interpret the model and understand its behaviour. To the best of our knowledge, our model is the first explainable fake review detection model.
5. We tested our model on two public real-world datasets to prove its effectiveness in detecting fake reviews.
The paper is structured as follows: Section 2 presents the literature survey on detecting fake reviews. Section 3 provides preliminaries on the deep learning models used. Section 4 describes the feature engineering and methodology in detail. Section 5 presents the ensemble model. Section 6 provides the experimental settings, detailed descriptions of the real-world datasets, and the dataset pre-processing. Section 7 presents the experimental results and discussion. Section 8 describes the interpretability of our model. Section 9 presents the conclusion and future work.

Literature survey
This section reviews the existing detection models, grouped by the features they use, and describes the algorithms they employ.

Fake review detection based on textual features
Many researchers have used textual content and/or language characteristics to detect fake reviews and improve detection models' performance. These features include term frequencies, word meanings, used terms, and language patterns (Mohawesh et al., 2020). Bag-of-words (BOW), stylometric, term frequency, semantic, n-gram, POS, sentiment, and LIWC features are commonly used for feature extraction. Asghar (2016) used four contextual features, latent semantic indexing, trigrams, bigrams, and unigrams, to identify fake reviews; of the 16 machine learning methods tested, logistic regression achieved the highest result of 64% accuracy. Shahariar et al. (2019) used word embedding, term frequency (TF), and n-grams to detect fake reviews; their method using Long Short-Term Memory (LSTM) achieved 94.56% accuracy on the Deceptive Spam Corpus created by Ott et al. (2011), showing that deep learning is a good approach for fake review detection. Li et al. (2015) used deep learning to learn document representations to distinguish fake reviews from genuine ones, demonstrating the effectiveness of a convolutional neural network (CNN) across multiple domains on the Deceptive Spam Corpus; further, in the combined domain, CNN outperformed LSTM. Saumya and Singh (2018) introduced a model using review sentiment with other features for classifying fake reviews, employing the Random Forest (RF) algorithm, which achieved an F1 measure of 91%. Yilmaz and Durahim (2018) used the Doc2vec method to produce document embeddings from the review text content; they addressed the issue of variable-length reviews by producing a fixed-length embedding vector for each review and feeding it to the classifiers, although the proposed model was only good for long text. Cao et al. (2020) introduced a fake review detection model using features from the review text content; their model, using Latent Dirichlet Allocation (LDA) features with a Text-CNN model, obtained the highest accuracy on the Deceptive Spam Corpus.
Unfortunately, fake review detection using linguistic features is frequently misled by sophisticated spammers who try to produce opinions that look like real ones. Textual characteristics are also domain-specific, which makes establishing a unified cross-domain detection method challenging. For example, the words used to describe a restaurant's food, like "tasty" or "delicious", cannot be used for automobile repair businesses. Also, detecting fake reviews based on textual content alone has low accuracy (Cardoso et al., 2018; Patel and Patel 2018; Pasi et al., 2019; Fontanarava et al., 2017; Heydari et al., 2015; Crawford et al., 2015).

Fake review detection based on behavioural features
In previous studies, researchers have considered the most frequently used behavioural features, such as rating, early reviews, reviewer ID, and review content similarity (Jain et al., 2019). For example, Mukherjee et al. (2013) proposed a clustering methodology by analysing spammers' behavioural footprints, assuming that spammers and genuine reviewers have distinct behavioural characteristics, including maximum review number, content similarity, burstiness, early review ratio, etc. Hussain et al. (2020) used text and behavioural features to detect fake reviews, employing 13 behavioural features such as maximum review number, content similarity, and burstiness. They also used these behavioural features to produce a labelled dataset of fake and genuine reviews, which then served as input to a second model for detecting fake reviews. The results showed that the behavioural features achieved a high accuracy of 93%, and the review text achieved 88.5%, on the Amazon reviews dataset, though the proposed model used a limited number of behavioural features. Although behavioural features decrease computational costs and speed up the process, they cannot be used alone to detect fake reviews.

Fake review detection based on textual and behavioural features
Some studies have used metadata features together with review text features to identify fake reviews. For instance, Nilizadeh et al. (2019) used textual and metadata features to identify false reviews, applying OneReview techniques to determine the location of fake reviews. Wang et al. (2017) proposed an attention method to identify fake reviews using textual and behavioural features and improve classification prediction: the behavioural features were extracted using a multi-layer perceptron, the linguistic features were extracted using a CNN, and a feature attention technique then learned dynamic weights for the behavioural and linguistic features. Their model achieved good results on the Deceptive Spam Corpus but poor results on the Yelp dataset. This suggests that combined features can improve classification performance.
Most existing fake review detection models focus on the review text or user information while ignoring the relationships among review texts, reviewers, and product features, which could improve detection performance. Therefore, we propose multi-view fake review detection that considers features from different perspectives, including review content, users, and products, to improve detection performance and make the model more robust against new patterns of fake reviews. We also present the first explainable model in fake review detection, using the SHAP method and the attention mechanism to build trust, transparency, and reliability in the decisions derived by deep learning models. (See Table 1.)

The proposed model
Our proposed model consists of multiple views of features and classifiers to identify fake reviews. As described in Fig. 1, it consists of multi-view features extracted from different perspectives, including review content, reviewers and spammers, and product description. As a decision engine, we also use multi-view classifiers, including a Deep Neural Network (DNN), Bidirectional Long Short-Term Memory (Bi-LSTM), and a Convolutional Neural Network (CNN). Each deals with a different feature perspective, enabling a deep understanding of the data patterns in each feature type. To provide better performance, we use an ensemble approach to combine the individual outputs and select the final result based on a voting approach. The details of the proposed model are explained in the following subsections.
To justify why using multiple views of features is essential, we provide the following examples from the Yelp Zip dataset (Rayana and Akoglu 2015). We selected and analysed several fake reviews from the Yelp Zip dataset, shown in Table 2. It can be observed from the table, and based on the ground truth, that there are abnormal patterns such as "Great food" and "Love this place" at user level 32,390 or product level 4431. Such review text can also appear in a legitimate review, so it cannot be used individually as a feature for detecting fake reviews. However, these patterns are tied to specific reviewers, spammers, and products; in other words, spammers usually repeat the same pattern of review content for particular products. Furthermore, these integral features at the product or user level are difficult to observe at the single-review level, posing enormous problems for detecting fake reviews via semantic review information only. Based on these observations, and since a review essentially involves three key elements, review text, reviewer, and product, we can visualise a review as a tripartite network; analysing the products targeted by spammers is therefore essential when evaluating fake review detection approaches. To address this issue, we propose a multi-view fake review detection ensemble that considers features from different perspectives, including review content, reviewers and spammers, and product descriptions. However, extracting features from review content at the word level only (e.g., "great" and "love"), as is common in the literature, also has challenges in capturing the implicit and explicit meaning of the review, degrading the performance of the detection model. Therefore, we propose a hybrid model for the review content, using word- and sentence-level attention to extract review text features.
In summary, we propose an ensemble classifier model to use these features effectively and to prevent the decision-making level from being biased toward a specific feature. We use different deep learning algorithms, each working with particular features; all algorithms work in parallel and then pass their outcomes to the ensemble model, as described in Fig. 1. On top of that, we also provide an explanation technique to create a responsible fake review detection model: it explains the proposed model's results and helps users trust and understand why certain reviews are classified as fake. The details of the proposed model are described in the following subsections.

Review Content-based feature extraction and classifier
Most existing studies use a traditional word vector model to extract features from review content (Chaturvedi et al., 2018). Although this method has achieved good results in the analysis of unambiguous and short sentences (Chaturvedi et al., 2018), processing such sentences in reality is a complex and time-consuming process. Therefore, researchers have recently developed new methods to extract these features, such as word embedding or neural network-based methods. These methods have two significant advantages: 1) they extract features automatically, and 2) their performance on text is generally much better than that of explicit feature-based methods. Therefore, we use this approach in our proposed model. In contrast to previous works (Ren and Ji 2017, Dong et al., 2018), our model uses features of different granularity in order to extract more of the implicit, hidden features in the review text. Our model consists of a hybrid method that extracts features from each review's content at the word and sentence levels. Therefore, words with the same representation can be grouped in one cluster, and reviews (entire texts) with the same representation are grouped in another cluster. Both word-level and sentence-level outputs are concatenated in one model. We use two parallel Bidirectional Long Short-Term Memory (Bi-LSTM) models to extract the word and sentence levels separately, obtaining the benefits of this hybrid feature level. Bi-LSTM is a sequence processing algorithm that contains two LSTMs taking the input in the forward and backward directions, which can effectively capture the past and future context (Liu et al., 2019).
LSTM is a type of Recurrent Neural Network (RNN) architecture and is currently a mainstream RNN structure. LSTM addresses the vanishing gradient problem by replacing hidden self-connected units with memory blocks. The memory block is used to learn long-range text sequences, and the memory unit tells the network what to learn and what to forget. LSTM consists of four components. First, the input gate $i$ manages the amount of new memory content added. Second is the forget gate $f$, which specifies how much of the memory must be erased. Third is the output gate $o$, which modulates the amount of memory content output. Fourth, the cell activation vector $c_t$ combines two components: the modulated new memory $\tilde{c}_t$ and the forgotten previous memory $c_{t-1}$, where $t$ identifies the $t$-th time step. The basic LSTM architecture can only exploit historical data; however, a loss of future context can lead to misunderstanding the problem's meaning. Thus, Bi-LSTM was introduced (Graves and Schmidhuber 2005) to capture the past and future context by combining backward and forward hidden layers. Bi-LSTM works similarly to the network's forward and backward passes, except that Bi-LSTM requires unfolding the backward and forward hidden states for all time steps.
The mathematical equations of LSTM (see Fig. 2) are:

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$\tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t \odot \tanh(c_t)$$

where $f_t$, $i_t$, $c_t$, and $o_t$ reflect the values of the forget gate, input gate, cell state, and output gate at time $t$, respectively; $W$ describes the hidden-layer weight matrices, $b$ represents the biases, and $\tanh$ and $\sigma$ represent the hyperbolic tangent and sigmoid functions, respectively.
Moreover, to extract the most important features at different levels, an attention technique is used. The details of feature extraction at the word and sentence levels are as follows.

Word level (word encoder with attention): Given a sentence with words $w_{it}$, $t \in [0, T]$, we first embed the words into vectors through an embedding matrix $W_e$. Then, we use a bidirectional LSTM to obtain word annotations by summarising information in both directions. Bi-LSTM contains a forward LSTM (denoted $\overrightarrow{LSTM}$) and a backward LSTM (denoted $\overleftarrow{LSTM}$). We obtain the annotation of a given word by concatenating the forward hidden state $\overrightarrow{h_{it}}$ and the backward hidden state $\overleftarrow{h_{it}}$, i.e. $h_{it} = [\overrightarrow{h_{it}}, \overleftarrow{h_{it}}]$, which summarises the information of the whole sentence centred around $w_{it}$ and implements the word encoding. Since not all words contribute equally to the representation of the sentence, we use an attention mechanism to extract the important features by assigning different weights to words (see Algorithm 1). Specifically, we feed the word annotation $h_{it}$ through a one-layer perceptron to get $u_{it}$ as a hidden representation of $h_{it}$:

$$u_{it} = \tanh(W_w h_{it} + b_w)$$

where $b_w$ and $W_w$ are the bias and weight of the neuron, and $\tanh$ is the hyperbolic tangent function. The model employs a word-level context vector $u_w$ and measures the significance of every word as the similarity between $u_{it}$ and $u_w$, normalised into a weight $a_{it}$ via the exponential function:

$$a_{it} = \frac{\exp(u_{it}^{\top} u_w)}{\sum_{t} \exp(u_{it}^{\top} u_w)}$$

The word-level context vector $u_w$ is a high-level representation of informative words; it is randomly initialised and learned during training. A weighted sum of the word annotations is then calculated to represent the sentence vector $F_i$, which forms part of the attention layer output:

$$F_i = \sum_{t} a_{it} h_{it}$$

Sentence level (sentence encoder with attention): Word embedding is simple, efficient, and surprisingly good because of its simplicity and its word-independence assumptions. However, it does not consider word order or the relationships between words in a sentence (Reimers and Gurevych 2019). Consequently, we use the sentence level to extract the explicit features from the review text. Given the embedding $s_i$ of each sentence in the review, we learn the representation of the sentence by feeding the list of sentences through a bidirectional LSTM and combining $\overrightarrow{h_i}$ and $\overleftarrow{h_i}$ to obtain the annotation $h_i$ of sentence $i$. The annotation $h_i$ summarises the neighbouring sentences around sentence $i$ while still concentrating on sentence $i$.

Again, just as for words, not all sentences contribute equally to the representation of a review; we therefore use a sentence-level attention mechanism to learn the weight of each sentence. Specifically,

$$u_i = \tanh(W_s h_i + b_s), \qquad a_i = \frac{\exp(u_i^{\top} u_s)}{\sum_{i} \exp(u_i^{\top} u_s)}$$

where $b_s$ and $W_s$ are the bias and weight of the neuron, and $u_s$ is the sentence-level context vector, a high-level representation of informative sentences that is randomly initialised and learned during training. A weighted sum of the sentence annotations then represents the review vector:

$$s = \sum_{i} a_i h_i$$

Finally, following the literature (Xiong et al., 2020), simple word- or sentence-level attention is limited when considered in isolation. Sentence-level attention extracts coarse-grained characteristics, providing a good understanding of the global features of the text, while word-level attention extracts fine-grained features, providing a good understanding of the local features of the text. Thus, to capture both fine-grained and coarse-grained features, we merge sentence-level attention with word-level attention into a fully connected layer to obtain the final text classification. The whole process of attention at the word and sentence levels is presented in Algorithm 1 (parallel Bi-LSTM with attention for fine-grained and coarse-grained features); a sketch follows below.
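To make the attention pooling concrete, the following minimal Keras sketch implements the word-level branch described above (the equations for $u_{it}$, $a_{it}$, and $F_i$). It is an illustrative reconstruction rather than the authors' released code; the vocabulary size and sequence length are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

class AttentionPooling(layers.Layer):
    """Additive attention: u_t = tanh(W h_t + b); a_t = softmax(u_t . u_ctx);
    output = sum_t a_t h_t (weighted sum of the Bi-LSTM annotations)."""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.proj = layers.Dense(units, activation="tanh")  # hidden representation u_t
        self.context = layers.Dense(1, use_bias=False)      # learned context vector

    def call(self, h):                          # h: (batch, T, dim)
        scores = self.context(self.proj(h))     # (batch, T, 1)
        a = tf.nn.softmax(scores, axis=1)       # normalised attention weights a_t
        return tf.reduce_sum(a * h, axis=1)     # (batch, dim)

# Word-level branch; the sentence-level branch is built analogously over sentences.
words = layers.Input(shape=(200,), dtype="int32")              # 200-token cap (assumed)
x = layers.Embedding(input_dim=50_000, output_dim=300)(words)  # FastText dim = 300
h = layers.Bidirectional(layers.LSTM(150, return_sequences=True))(x)
word_branch = Model(words, AttentionPooling(150)(h))
```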

Reviewer-based feature extraction and classifier
The reviewers' features play a crucial role in detecting spammers; thus, they can be employed to detect fake reviews (Akram et al., 2018). The proposed model uses a rich source of features related to reviewers' behaviour, and in particular the behaviour of spammers. We extracted these features using statistical measurements and equations, as described in Table 3. (See Table 4.) We used a Convolutional Neural Network (CNN) as the classifier, since it has shown good performance in different tasks, including fake review detection (Guo et al., 2021). CNN is a type of neural network commonly employed in image processing (Shang et al., 2020), but it has also proved efficient with tabular and text datasets. As shown in Fig. 1, a CNN has three kinds of layers: a convolutional layer, a pooling layer, and a fully connected layer, and it can include more than one of each. In the convolutional layer, filters slide over the one-dimensional sequence data and extract features; the extracted features are aggregated into new features known as a feature map. The pooling layer is then used to reduce the dimension of the optimal features. Lastly, the pooling layer passes the features to a fully connected layer for classification.
In our proposed model, after extracting the user features, the input data $x$ is passed into the convolutional layer, composed of three one-dimensional filters. We use one dimension because the CNN input is numerical data. The convolutional layer performs a convolution on the input $x$ and passes the results to the pooling layer. The process can be expressed as:

$$a_j = f(W_j \cdot x_{i:i+h-1} + b)$$

where $h$ is the filter size, $b$ is the bias, $f$ is a non-linear function, and $W_j$ is the convolutional kernel weight vector of the $j$-th filter. The pooling layer then takes the maximum value from each cluster of neurons in the preceding layer, $\hat{a} = \max\{a\}$, which also filters out zero-padding. We concatenate the pooling layer's results to obtain the final representative vector and use the sigmoid activation function in the last fully connected layer. The final vector is used to predict the review label. The specific steps of the CNN are described in Algorithm 2.

Product-based feature extraction and classifier

The product's features also play a crucial role in detecting fake reviews, because spammers usually repeat the same pattern of review content for particular products. To extract the product features, we used the same behavioural features described in Table 3, since they are also applicable at the product level; here we consider all reviews written on a product and exclude the reviewer level. One feature that cannot be applied is the first review ratio, because a product has only one first review. We used a deep neural network (DNN) as the classifier. DNN is a high-level network that can learn more complex and abstract features than general neural networks (Al-Hawawreh et al., 2019). As shown in Fig. 3, a DNN consists of an input layer, several hidden layers $n$, and an output layer $o$. The main reason for using DNN is that it has proved computationally effective with tabular data (Al-Hawawreh and Sitnikova 2019). In our work, after extracting the product features, the input data $x$ is fed to the hidden layers, where the rectified linear unit (ReLU) is used as the non-linear activation function to reduce the error gradient problem (Glorot et al., 2011). We then use the sigmoid activation function in the last fully connected layer. The specific steps are described in Algorithm 3; a sketch of both classifiers follows below. (See Fig. 4.)
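The following Keras sketch illustrates the two classifier branches (Algorithms 2 and 3). The number of behavioural features and the CNN filter count are assumptions for illustration; the DNN hidden dimensions 64-32-16 follow the configuration reported later in the experiments.

```python
from tensorflow.keras import layers, models

N_FEATS = 20  # assumed number of behavioural features per reviewer/product

# Reviewer branch: 1-D CNN over the behavioural feature vector (Algorithm 2 sketch)
u = layers.Input(shape=(N_FEATS, 1))
c = layers.Conv1D(filters=32, kernel_size=3, activation="relu")(u)  # convolution
c = layers.GlobalMaxPooling1D()(c)              # max pooling over each feature map
cnn = models.Model(u, layers.Dense(1, activation="sigmoid")(c))     # spammer prob.

# Product branch: DNN with ReLU hidden layers of 64, 32, and 16 units (Algorithm 3 sketch)
p = layers.Input(shape=(N_FEATS - 1,))          # the first-review ratio is dropped
d = layers.Dense(64, activation="relu")(p)
d = layers.Dropout(0.5)(d)
d = layers.Dense(32, activation="relu")(d)
d = layers.Dense(16, activation="relu")(d)
dnn = models.Model(p, layers.Dense(1, activation="sigmoid")(d))     # targeted prob.

for m in (cnn, dnn):
    m.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
```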

Table 3
Behavioural features used at the reviewer and product levels. The original equation column was lost in extraction; where a formula or threshold survives it is reproduced, otherwise the feature is described in words.

Reviewer Deviation (F1): A spammer provides a distinct rating rather than a genuine one, because the spammer's objective is to create a false impression of a product in either a negative or positive way. Spammers generally give a high rating to a product, enabling us to identify fake reviews (Li et al., 2017).

F2: This feature catches the behaviour of a spammer who spams a product's evaluation shortly after it is released. This kind of spam would probably attract the attention of other spammers, who would then take advantage of the views of subsequent spammers (Rastogi et al., 2020). We set $r_{ij} = i$ for the $i$-th rating of a product.

Maximum Rating Deviation (F3): Spammers usually give a high rating to a product, which can assist in identifying fake reviews (Mukherjee et al., 2013, Li et al., 2017).

Burstiness (F4): Posting too many reviews in a short time is seen as an abnormal activity and could identify the user as a spammer (Mukherjee et al., 2012). This feature analyses the number of reviews that the user created in the preceding 24 h; the reviewer could be a spammer if the number of reviews reaches a threshold value (X = 28).

Early Review Ratio (F5): Researchers have noted that spammers often write reviews early, to plant spam, since an early review can significantly influence people's opinions about a product (Rastogi et al., 2020). A product or reviewer is considered targeted or a spammer if the ERR value is close to 1. Here, ETF describes the early time frame of a review $r$ (Mukherjee et al., 2013); $ETF(r_a, p)$ describes how early user $a$ evaluated product $p$; $\delta$ is a threshold value indicating earliness, set to 7 months; and $\beta_3 = 0.69$.

Maximum Number of Reviews (F6): Previous studies have shown that 75% of spammers produced over 5 reviews on certain days, while 90% of genuine reviewers posted one review in a single day (Li et al., 2017). Consequently, the number of reviews posted by each reviewer can be used to detect whether a reviewer is genuine or a spammer.

Ratio of Positive Reviews (F7): The ratio of positive reviews, with a rating of 4 or 5, characterises spammers who seem to promote specific companies (Li et al., 2011).

Percentage of Negative Reviews (F8): We filtered users who are more likely to disparage companies by computing the percentage of negative reviews, i.e. the proportion of reviews with a rating of 1 or 2 (Rayana and Akoglu 2015).

First Review Ratio (F9): Spammers attempt to become the first reviewers in order to have a more substantial influence on misleading purchasers (Ong et al., 2014).

Extreme Rating Behaviour (F10): Spammers attempt to give low or high ratings to intentionally praise or criticise certain products (Mukherjee et al., 2013). A rating of one star or five stars in a five-star rating system is identified as extreme rating behaviour (Mukherjee et al., 2013).

Top Ranked Reviews Percentage (F11 and F12): As discussed previously, writing a review early is an indication of a fraudulent review. If a reviewer's reviews are mostly top-ranked reviews, their behaviour can be deceptive (Rastogi et al., 2020). Here, $f_{TRR}$ describes the top rank of reviews, and $\gamma_2$ is a threshold value that identifies whether or not a review is top ranked.

Bottom Ranked Reviews Percentage (F13 and F14): Legitimate reviewers generally evaluate a product or service once they have used it, and they usually take time to evaluate products compared to spammers, who pre-empt consumer expectations early (Rastogi et al., 2020). Here, $F_{BRR}$ describes the bottom rank of a review, and $\gamma_2$ is a threshold value that determines whether or not a review is bottom ranked.

Maximum Content Similarity (F15): Spammers usually post the same reviews for distinct products to promote them, because they do not put effort into producing new fake reviews (Akram et al., 2018). Therefore, it is crucial to determine the similarity of an author's contents to identify spammers. In our work, we used the cosine similarity method to calculate the highest similarity between the contents of reviews $r_i, r_x \in R_a$, $x < y$ (see the sketch after this table).

Multiple reviews on the same product: For genuine reasons, some reviewers post multiple reviews on the same product; some divide their reviews into smaller ones (Lim et al., 2010).

Personal pronouns: Rastogi et al. (2020) found that first-person pronouns such as "we", "my", and "I" and second-person pronouns such as "your" and "you" are an indicator of a fake review. Therefore, we used their counts as real-valued features.

Average word and sentence features: We calculated these features by taking the average of the AOW and ASW features separately over all the reviews of that product or reviewer (Rastogi et al., 2020).

Review length: It has been found that 90% of genuine users write reviews of more than 200 words on average [16]. Reviews are identified as fake if their length is less than a threshold X; we calculated this feature with X = 135.
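As an illustration of how the maximum content similarity feature (F15) can be computed, a minimal sketch using TF-IDF vectors and cosine similarity is shown below; the exact vectorisation used is not specified in the text, so TF-IDF is an assumption.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def max_content_similarity(reviews):
    """F15: the highest pairwise cosine similarity among one author's reviews."""
    if len(reviews) < 2:
        return 0.0
    vectors = TfidfVectorizer().fit_transform(reviews)
    sims = cosine_similarity(vectors)
    np.fill_diagonal(sims, 0.0)      # ignore self-similarity on the diagonal
    return float(sims.max())

print(max_content_similarity(["Great food, so delicious.",
                              "Great food! Very delicious.",
                              "Terrible parking."]))
```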

Aggregation level: an ensemble technique
At the last level, we use an ensemble approach that combines multiple learning algorithms to provide more accurate predictions (Reddy et al., 2020). Specifically, the above deep learning-based models, i.e., DNN, Bi-LSTM, and CNN, are trained and combined into a single strong predictive model in an ensemble approach. We used a voting approach to aggregate the outputs of these different models and improve the classification performance (Ekbal and Saha 2011). We mainly used soft voting due to its reliability; it does not require parameter tuning, as the individual models have already been trained (Ekbal and Saha 2011). In soft voting, as described in Eq. (20), the ensemble averages the predicted probabilities for the class label $\hat{y}$:

$$p = \frac{1}{K} \sum_{k=1}^{K} p_k \qquad (20)$$

where $p_k$ is the probability predicted by the $k$-th base model. If the probability $p$ is greater than or equal to a predefined threshold ($th = 0.5$), the input observation is classified as genuine; otherwise, it is classified as fake.
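A minimal sketch of the soft-voting rule in Eq. (20), assuming each base model outputs a probability of the genuine class:

```python
import numpy as np

def soft_vote(p_bilstm, p_cnn, p_dnn, th=0.5):
    """Average the three base models' predicted probabilities and threshold (Eq. 20)."""
    p = (np.asarray(p_bilstm) + np.asarray(p_cnn) + np.asarray(p_dnn)) / 3.0
    return np.where(p >= th, "genuine", "fake")

print(soft_vote([0.9, 0.2], [0.7, 0.4], [0.8, 0.1]))  # -> ['genuine' 'fake']
```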

Experiments
In this section, we present a description of the datasets used and our model results, and then compare the model with state-of-the-art fake review detection models.

Dataset description and model structure
To evaluate the effectiveness of our model, we used two large real-world datasets, Yelp NYC and Yelp ZIP, from Yelp.com (Rayana and Akoglu 2015). These datasets are extremely unbalanced, as only about 10% of the total reviews are fake. The Yelp NYC dataset has 322,167 restaurant reviews from New York City, and the Yelp ZIP dataset includes 608,598 restaurant reviews from New York, New Jersey, Vermont, and Pennsylvania. The reviews comprise product information, user information, timestamp, rating, and review text (Rayana and Akoglu 2015).
Moreover, we used the Keras library to implement our models, with three architectures extracting different features from different views. For Bi-LSTM feature extraction, we used the publicly available pre-trained FastText method (Mikolov et al., 2017) for word embedding, which outperformed the GloVe and word2vec embedding methods in the deception detection task (Nam et al., 2020) and has been used extensively in text classification tasks (Lai et al., 2015). Furthermore, FastText is a computationally efficient algorithm (Nam et al., 2020); it works similarly to the Continuous Bag of Words (CBoW) method but uses a bag of n-grams instead of a bag of words to calculate embeddings, which allows it to capture some information about word order (Rakhlin 2016). In this work, we used FastText vectors for 2 million words trained on Common Crawl data, with the embedding dimension set to 300. The training batch size for all datasets was set to 64, the dropout to 0.5, and the regularisation coefficient to 0.001. The learning rate was set to 0.001, using the Adam algorithm (Kingma and Ba 2014) to train the parallel Bi-LSTM model; the number of Bi-LSTM hidden layers was set to 2, with a hidden dimension of 150. For DNN training, we used 3 hidden layers with dimensions 64, 32, and 16, respectively, a dropout of 0.5, and a learning rate of 0.001 with the Adam algorithm. The final architectures are described in Table 5. It is worth mentioning that we performed many experiments to find the best parameters for our models based on the following performance metrics.
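As a sketch of how the pre-trained FastText vectors can be wired into the embedding layer (the file name and the toy vocabulary are assumptions; the paper only states that the 2-million-word Common Crawl vectors of dimension 300 were used):

```python
import numpy as np
import fasttext

ft = fasttext.load_model("crawl-300d-2M-subword.bin")  # pre-trained Common Crawl model (assumed file)
word_index = {"great": 1, "food": 2, "delicious": 3}   # toy vocab; in practice a Keras Tokenizer's word_index
vocab_size, dim = len(word_index) + 1, 300

emb_matrix = np.zeros((vocab_size, dim))
for word, idx in word_index.items():
    emb_matrix[idx] = ft.get_word_vector(word)         # subword-aware lookup handles OOV words

# embedding = layers.Embedding(vocab_size, dim, weights=[emb_matrix], trainable=False)
```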
Precision describes the proportion of correctly predicted positive reviews among all reviews predicted as positive:

$$Precision = \frac{TP}{TP + FP}$$

Recall shows the proportion of positive reviews correctly retrieved from all actual positive reviews:

$$Recall = \frac{TP}{TP + FN}$$

The F1 score is the harmonic mean of precision and recall:

$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall}$$
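These metrics can be computed directly, for example with scikit-learn (toy labels shown for illustration):

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0]                 # 1 = fake, 0 = genuine (toy labels)
y_pred = [1, 0, 0, 1, 1]
print(precision_score(y_true, y_pred),   # TP / (TP + FP) = 2/3
      recall_score(y_true, y_pred),      # TP / (TP + FN) = 2/3
      f1_score(y_true, y_pred))          # harmonic mean = 2/3
```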

Data Pre-Processing
Products and reviewers with few reviews (e.g., fewer than 3 reviews) do not have distinguishable behavioural features (Rastogi et al., 2020). Therefore, we filtered out both products and reviewers with fewer than three reviews, repeating this step until at least three reviews were available for every reviewer and product. Further, it was essential to assign labels to both products and reviewers, since our model is based on supervised learning at three levels (i.e., product, review, and reviewer). We assigned labels to reviewers depending on the reviews they posted: a reviewer with no filtered reviews was labelled a non-spammer, and a reviewer with filtered reviews was labelled a spammer. We assigned product labels following the same method, labelling products with no filtered reviews as non-targeted and the rest as targeted. In addition, the datasets were pre-processed to remove noise such as URLs, stop words, and emojis. The NLTK toolkit, a widely used open-source library, was used for pre-processing (Aghakhani et al., 2018). We applied the following pre-processing techniques:

Tokenization: divides the text into a list of tokens; it is the first stage in natural language processing prior to feature extraction (Asghar et al., 2020).

Stemming: reduces words to their root, with the main goal of decreasing the frequency of derived words. For instance, words like playing, played, and play are reduced to the word play. We used the most common stemming method, the Porter Stemmer algorithm (Willett 2006), a technique for deleting the more common morphological and inflexional endings from English words.

SMOTE (Chawla et al., 2002): an over-sampling process that eliminates the impact of an imbalanced dataset by producing samples of the minority classes. Instead of simply duplicating minority samples, SMOTE uses a K-nearest neighbours (K-NN) algorithm to generate synthetic minority samples, thus improving sampling stability (Sun et al., 2009). We used SMOTE to balance the data, as it has achieved good results in text classification (Sun et al., 2009). A pre-processing sketch follows below.
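A minimal sketch of this pipeline; the toy data stands in for the extracted feature matrix, and stop-word removal is omitted for brevity:

```python
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

nltk.download("punkt", quiet=True)
stemmer = PorterStemmer()

def preprocess(text):
    tokens = word_tokenize(text.lower())                       # tokenization
    return [stemmer.stem(t) for t in tokens if t.isalpha()]    # stemming; drops punctuation/emojis

print(preprocess("Playing played plays"))                      # -> ['play', 'play', 'play']

# Balance an imbalanced feature matrix with SMOTE (k-NN based synthetic oversampling)
X_train, y_train = make_classification(n_samples=200, weights=[0.9], random_state=0)
X_bal, y_bal = SMOTE(k_neighbors=5, random_state=42).fit_resample(X_train, y_train)
print(sum(y_bal == 0), sum(y_bal == 1))                        # classes now balanced
```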
After performing the pre-processing steps described above, we created the final datasets; Table 6 shows the statistics of the datasets after pre-processing.

Our proposed model results
This article proposes an ensemble model comprising parallel Bi-LSTM with attention, CNN, and DNN, for several reasons. First, we used Bi-LSTM, which contains two LSTMs taking the input in the forward and backward directions, and can therefore effectively capture the past and future context; to focus on the most important information and improve classification accuracy, we combined Bi-LSTM with an attention mechanism, which also captures the sequential structure of reviews. Second, we incorporated the CNN algorithm in our ensemble model because it has shown good performance in detecting fake reviews (Guo et al., 2021). Lastly, we used DNN because it has proved computationally effective.
The experiments on the Yelp NYC and Yelp Zip datasets were performed on the UTAS cluster and Colab Pro, using Python 3.7. For both datasets, 80% of the data was used for training and the remaining 20% for testing.
To evaluate the proposed model, we discuss the performance of the parallel Bi-LSTM, CNN, DNN, and ensemble models on the Yelp NYC and Yelp Zip datasets in terms of recall, precision, and F1-score. As shown in Table 7, the recall, precision, and F1-score of the parallel Bi-LSTM were 77.59%, 63.61%, and 69.91% on the Yelp NYC dataset and 63.88%, 62.80%, and 63.34%, respectively, on the Yelp ZIP dataset. On the Yelp NYC dataset, the CNN achieved a precision, recall, and F1-score of 66.77%, 79.64%, and 72.64%, respectively, and the DNN achieved 83.56%, 80.26%, and 81.88%, respectively. On the Yelp Zip dataset, the CNN achieved a precision, recall, and F1-score of 67.90%, 71.60%, and 69.70%, respectively, whereas the DNN achieved 84.74%, 71.67%, and 64.98%, respectively. It is clear that combining multiple views, in terms of both features and classifiers, yields better results than these models separately, which confirms the importance of considering multi-view features for fake review detection.
Furthermore, to evaluate the classification algorithms' performance, the Receiver Operating Characteristic (ROC) curve was used (Carrington et al., 2022). The ROC curve shows the relationship between the False Positive Rate (FPR) and the True Positive Rate (TPR), which can be calculated as follows:

$$TPR = \frac{TP}{TP + FN}, \qquad FPR = \frac{FP}{FP + TN}$$

Figs. 5 and 6 show the performance of parallel Bi-LSTM, CNN, and DNN based on the ROC curve for each dataset. As presented in Fig. 5 (a.4) and Fig. 6 (b.4), the proposed model achieved 85.51% and 81.49% for the ROC curve on Yelp NYC and Yelp Zip, respectively. As shown in Fig. 5 (a.2) and Fig. 6 (b.2), product features outperform the textual and user features on both datasets by maintaining a higher true positive rate relative to false positives. Further, we found that behavioural characteristics are vital for both datasets. This indicates that the metadata information contains more powerful fake signatures than the contents, across all three contexts of spamming: spammer, fake review, and targeted product. This contrast is intuitive: a spammer can usually be thought of as "conscious" about the content they post, so that their review seems "true". For example, a spammer would seek to simulate a genuine reviewer by reducing their use of imaginative words (Ott et al., 2011) or keeping their review contents more subjective (Li et al., 2011). However, they may not be "conscious" of the behavioural traces they leave when posting reviews.
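For reference, a minimal sketch of computing the ROC curve and AUC from predicted probabilities (toy values shown):

```python
from sklearn.metrics import roc_curve, auc

y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]            # predicted probability of the positive class
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(f"AUC = {auc(fpr, tpr):.2f}")        # 0.75 for this toy example
```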
Comparison with the state-of-the-art models
To demonstrate the effectiveness of our novel model in detecting fake reviews, we compare its performance with recently developed fake review detection models, namely TensorD (Melleng et al., 2019), SWNN (Li et al., 2014), ABNN (Lai et al., 2015), AEDA (Mikolov et al., 2017), and SAE. We used the Area Under the Curve (AUC) as the evaluation metric, as it is the key performance metric in the fake review detection literature (Ekbal and Saha, 2011).
Table 8 shows the results achieved by our proposed model for both the Yelp NYC and Yelp ZIP datasets. It is clear that our proposed model obtained the best AUC on both datasets compared with the other detection models, achieving 85.51% and 81.49% for the Yelp NYC and Yelp ZIP datasets, respectively. The baseline models also achieved good performance in terms of AUC. For example, TensorD automatically generates 11 relations from the perspective of users and products based on two fundamental rules; it uses tensor decomposition to map reviewers and products to a vector space and an SVM to classify the integrated review representation. TensorD achieved only 79.05% and 80.97% for the Yelp NYC and Yelp ZIP datasets. However, as we explained earlier in this paper, utilising features related to products and reviewers only is not sufficient to capture the full range of fake review patterns.

Table 8
Comparison among the existing fake review detection models on the Yelp NYC and Yelp ZIP datasets.
Moreover, SWNN used a CNN to extract information from the review, a KL-divergence method to extract the importance weights of words, and a weight-pooling layer to convert the sentence vectors to a document vector. Although SWNN achieved good performance in detecting fake reviews on the Yelp ZIP dataset (81.25%), it performed worse on the Yelp NYC dataset (78.57%). The key reason for this unstable performance is its reliance on review content (textual features) only; these features are dynamic and very noisy, so further analysis is required to maintain performance across diverse datasets.
Similarly, the Attention-Based Neural Network (ABNN), which jointly embeds behavioural and linguistic features for fake review detection, showed uneven performance across the two datasets. AEDA performed comparably to the previous two models, gaining 78.92% and 81.32% for the Yelp NYC and Yelp ZIP datasets; it used an embedding neural network to extract features from different domains and an SVM to identify fake reviews. In addition, SAE used a sentiment- and emotion-based representation of reviews to detect fake ones; it depended on a classical machine learning algorithm (Random Forest), which performs worse on complex review patterns, achieving 77.86% and 79.33% for the Yelp NYC and Yelp ZIP datasets, respectively.
Our proposed model performs better than the above models by relying on review content features together with statistical features of product and reviewer behaviour. It uses the sentence and word levels to extract the explicit and implicit features from the review content with Bi-LSTM, which is powerful in dealing with complex data sequences, and it depends on CNN and DNN deep learning algorithms to detect fake reviews based on reviewer and spammer behaviour. The model relies on multi-perspective features, which combine and correlate various features to create the fake review profile, and uses deep learning in the classification task, improving the detection of hidden patterns of fake reviews. On top of that, it uses the ensemble approach to further improve fake review identification. It works automatically, without human intervention in selecting features, which suggests it would be a successful real-world solution.

Explainability
Machine learning- and deep learning-based models can enhance the decision-making process for customers and businesses, but, as discussed in the introduction, determining confidence in their individual predictions is a significant problem, and explaining these models makes them more reliable and responsible. Therefore, in this work, we provide an explainable fake review detection model. We use a post-hoc explainability technique to explain the results obtained by our models and why they achieve the best performance in detecting fake reviews. Specifically, we use the Shapley Additive Explanations (SHAP) method (Lundberg and Lee 2017), as it has a solid theoretical foundation providing local accuracy and consistency in real-world scenarios (Lundberg et al., 2018), and it is a model-agnostic interpretability technique that can explain any deep learning model. This method computes each feature's contribution to the prediction of a particular instance by computing the Shapley value, which measures each feature's impact as the average marginal contribution of that feature across all possible coalitions. For example, consider the CNN model, where a set $F$ of $M$ features is used to predict an output. In the SHAP method, the impact $\phi_i$ of each feature $i$ on the CNN model output is assigned based on its marginal contribution (Shapley 1953). Based on various axioms, Shapley values are computed as:

$$\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|! \,(M - |S| - 1)!}{M!} \left[ v(S \cup \{i\}) - v(S) \right]$$

where $F$ is the set of input features, $S$ is a subset of input features, and $M = |F|$ is the number of input features. Here, $v(S \cup \{i\})$ represents the output when the $i$-th feature is present, and $v(S)$ the output when the $i$-th feature is withheld; the Shapley value is the weighted average of this difference over all possible subsets $S$.

SHAP then defines an additive explanation model as a linear function of binary features $g$:

$$g(z) = \phi_0 + \sum_{i=1}^{M} \phi_i z_i$$

where $z \in \{0, 1\}^M$ equals 1 when a feature is observed and 0 otherwise, and $M$ is the number of input features. For the parallel Bi-LSTM model, we use the attention mechanism to extract the most important features affecting the model's prediction. A usage sketch of SHAP follows below.
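A usage sketch of SHAP on the behavioural-feature classifiers; the explainer choice and sample sizes are illustrative, and `cnn_model`, `X_train`, `X_test`, and `feature_names` are assumed to come from the earlier steps:

```python
import shap

background = X_train[:100]                       # background sample for the expectation
explainer = shap.DeepExplainer(cnn_model, background)
shap_values = explainer.shap_values(X_test[:500])
# Summary plot ordering features by mean |SHAP value| (as in Figs. 7-10)
shap.summary_plot(shap_values[0], X_test[:500], feature_names=feature_names)
```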
Feature analysis for CNN and DNN models using the SHAP method

Figs. 7, 8, 9, and 10 show SHAP summary plots of the feature importance ordering for detecting fake reviews on the Yelp NYC and Yelp ZIP datasets, with Shapley values on the X-axis and features on the Y-axis; low/high feature values are represented by colours. The features in a summary plot are organised in order of significance, with the top feature being the most important and the bottom feature the least important. Fig. 7 shows that the maximum number of reviews (MNR) feature has the most significant impact on the CNN model's prediction on the Yelp NYC dataset. This observation is most likely related to spammers usually creating more than five reviews in one day (Mukherjee et al., 2013). Furthermore, larger values of this feature give rise to greater SHAP values, which are more likely to drive the prediction. Figs. 9 and 10 show that the early review ratio (ERR) feature has the most significant impact on the DNN model's prediction on the Yelp NYC and Yelp ZIP datasets. This observation is most likely related to early product reviews having a major effect on customers' perception of a product; spammers usually post early reviews on newly released products to affect customers' decisions (Rastogi et al., 2020). Fig. 8 shows that the burstiness (RBST) feature has the greatest impact on the CNN model's prediction on the Yelp ZIP dataset, most likely because spammers usually post too many reviews in a short time (Mukherjee et al., 2013).
SHAP theoretically provides high accuracy, although this accuracy comes at a cost in execution time. A further advantage is that SHAP comes with pre-optimised techniques for explaining specific kinds of models (Lundberg et al., 2018).
As seen from Table 9, the processing time of SHAP ranges from 1.22 to 15 min for dataset sizes from 100 to 1000.
Considering the limited resources available for running our experiments, this time is acceptable; moreover, it could be greatly reduced with high-performance computing and parallelisation.

Feature analysis using the attention mechanism

The attention mechanism has proved effective at generating reasonable explanations in text classification by capturing the intrinsic explainability of long text (Shu et al., 2019). Therefore, in this subsection, we use attention weights to determine the most important words and sentences affecting the final decision. Table 10 shows sample words with their corresponding labels for the Yelp NYC dataset; the column "important words" lists the most important words captured by attention. Similarly, Table 11 shows sample sentences with their corresponding labels for the Yelp NYC dataset, where the column "important sentences" lists the most important sentences captured by the attention mechanism. Table 12 shows sample sentences and Table 13 sample words, each with their corresponding labels, for the Yelp ZIP dataset. As the explainability results show, our proposed model has the advantage of interpreting the classification process: the results demonstrate the most significant word and sentence features that affect the model's performance. Specifically, we observe that when significant words and sentences are correctly identified, the classification results are correct, demonstrating that the attention layer is effective for fake review detection. The highlighted words and sentences (for example, "Great burgers, reasonable prices and small quaint atmosphere" or "Had lunch here and had the Bronte burger and it was cooked perfectly and was very savory") are considered the most significant features by the attention mechanism layer. Explainability can positively influence researchers in this area, allowing them to visualise and analyse machine learning algorithms' performance closely and with better understanding. A sketch of extracting attention weights follows below.
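As an illustration of how attention weights can be read out for explanation, the sketch below assumes the attention layer has been modified to also expose its normalised weights as a second model output (a common pattern); `attn_model`, `x_review`, and `index_to_word` are illustrative names, not the authors' code:

```python
import numpy as np

# 'attn_model' is assumed to output the softmax-normalised weights a_it, one per token
weights = attn_model.predict(x_review)            # shape (1, T): one weight per token
top_idx = np.argsort(weights[0])[::-1][:5]        # the five most-attended tokens
print([index_to_word[i] for i in top_idx])        # hypothetical id-to-token lookup
```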

Conclusion and future work
Online reviews are crucial in consumer purchasing decisions, and there is increasing concern about fake reviews posted to mislead clients. Several fake review detection methods have been developed. However, many models still suffer from low accuracy because they focus on a single perspective, such as review texts or reviewers. Additionally, most models ignore some implicit user expression patterns and the influences among reviewers, products, and texts, leading to failures in detecting fake reviews. Therefore, developing new models that identify fake reviews reliably is of utmost importance. This paper proposed a new model that combines features from different perspectives, including review content, reviewers, and products. Our model is an ensemble multi-view deep learning model that consists of 1) parallel word- and sentence-level Bi-LSTM architectures with an attention mechanism to extract the implicit and explicit features from the review text; 2) a convolutional neural network (CNN) to handle the reviewer-level information; and 3) a deep neural network (DNN) to handle the product-level information. Our experiments on two real datasets from Yelp.com showed that the classification performance is significantly enhanced. Our proposed model outperformed the state-of-the-art techniques for the fake review detection task on both datasets. Finally, we introduced the explainability module, which helps in understanding the underlying logic of the model and provides hints if it is "unfair". The results show that our proposed model is reliable and correctly identifies fake reviews.
Despite the comprehensive experimental research described in this article, there is still potential for improvement. In future work, we plan to evaluate the proposed method's effectiveness on cross-domain datasets. Further, we will examine whether the knowledge gathered from SHAP and state-of-the-art deep learning architectures can be used to design an interpretable lexicon-based classifier. Moreover, we intend to study a novel concept involving the integration of multiple classifiers and multiple feature extraction methods to improve accuracy on larger datasets.
This study has some limitations that may restrict the generalizability of its findings and create opportunities for future study. First, the proposed model has been evaluated on only two distinct datasets, both drawn from a single platform (Yelp.com). Different outcomes might have been observed had the model been evaluated on other platforms, such as Walmart, Alibaba, eBay, or TripAdvisor. The results may not be directly transferable to other datasets because some features incorporated into our predictive models may be unavailable. To overcome these limitations, subsequent studies should examine the role of additional elements of fraudulent reviewers' behaviour, and the methodology should be validated on numerous datasets acquired from additional online platforms. If the same features are applied to new datasets without such validation, there is a substantial risk that the model's trustworthiness may be compromised. Nevertheless, we propose that the features identified in this study could generalise to other online e-commerce platforms, a topic deserving further examination in future work. In addition, we intend to conduct a more in-depth analysis of review-centric data to derive other innovative features that could increase the precision and explainability of machine learning models.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. The architecture of the Ensemble Multi-View Deep Learning Model.

Fig. 2. The architecture of LSTM. The weight matrices are described by arrowed lines.
Algorithm 2 (Training the CNN-based classifier for detecting fake reviews based on spammer behaviour).
Input: user-related features (u_r)
Output: spammer prediction report
Process:
1: begin
2: Extract the features
3: Set the network parameters: hidden units, batch size, and number of epochs
4: Add callbacks
5: for e = 1 to n do          // n is the number of epochs
6:     Train the CNN network with the Adam optimiser
7: end for
8: Test the model
9: Obtain the predictions
10: return the prediction report
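As a concrete reading of Algorithm 2, the following minimal Keras sketch trains a 1-D CNN on reviewer-behaviour features. The feature dimension, layer sizes, and the early-stopping callback are illustrative assumptions, not the tuned configuration reported in Table 5.

    from tensorflow import keras
    from tensorflow.keras import layers

    def train_cnn_classifier(u_r, labels, epochs=20, batch_size=64):
        # u_r: (samples, n_features) matrix of user-related features.
        n_features = u_r.shape[1]
        model = keras.Sequential([
            layers.Reshape((n_features, 1), input_shape=(n_features,)),
            layers.Conv1D(32, 3, activation="relu"),
            layers.GlobalMaxPooling1D(),
            layers.Dense(16, activation="relu"),
            layers.Dense(1, activation="sigmoid"),  # spammer / non-spammer
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=["AUC"])
        stop = keras.callbacks.EarlyStopping(patience=3,
                                             restore_best_weights=True)
        model.fit(u_r, labels, epochs=epochs, batch_size=batch_size,
                  validation_split=0.1, callbacks=[stop])
        return model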
3.3. Product description level
(Feature-description fragment continued from the preceding table: letters in the review; ranking order of the review; the decay rate for parameters greater than one; the current review; total reviews created by the user; the rating of product p given by user a; the average rating of product p given by all users other than user a.)
Algorithm 3 (Training the DNN-based classifier for detecting fake reviews based on product description).
Input: product-related features (p_p)
Output: targeted-product prediction report
Process:
1: begin
2: Extract the features
3: Set the network parameters: hidden units, batch size, and number of epochs
4: Add callbacks
5: for e = 1 to n do          // n is the number of epochs
6:     Train the DNN network with the Adam optimiser
7: end for
8: Test the model
9: Obtain the predictions
10: return the prediction report
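Analogously, a minimal Keras version of Algorithm 3 could train a small fully connected DNN on product-level features; the two hidden-layer sizes here are illustrative assumptions.

    from tensorflow import keras
    from tensorflow.keras import layers

    def train_dnn_classifier(p_p, labels, epochs=20, batch_size=64):
        # p_p: (samples, n_features) matrix of product-related features.
        model = keras.Sequential([
            layers.Dense(64, activation="relu",
                         input_shape=(p_p.shape[1],)),
            layers.Dense(32, activation="relu"),
            layers.Dense(1, activation="sigmoid"),  # targeted product or not
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=["AUC"])
        model.fit(p_p, labels, epochs=epochs, batch_size=batch_size,
                  validation_split=0.1)
        return model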

Fig. 5. Performance of the parallel Bi-LSTM, CNN and DNN based on the ROC curve for the Yelp NYC dataset.

Fig. 6. Performance of the parallel Bi-LSTM, CNN and DNN based on the ROC curve for the Yelp ZIP dataset.

Fig. 7. SHAP summary plot of the CNN model on the Yelp NYC dataset.

Fig. 8. SHAP summary plot of the CNN model on the Yelp ZIP dataset.

Fig. 9. SHAP summary plot of the DNN model on the Yelp NYC dataset.

Fig. 10. SHAP summary plot of the DNN model on the Yelp ZIP dataset.

Table 1
Summary of existing fake review detection methods.

Table 2
Integral features examples of product and user levels.
994: "This Please has the BEST PIZZAS around!"
1156 / 104535: "Great place! Always an enjoyable experience, good staff and great food. Love the fish."
4431 / 104541: "Great price, great service, and most of all delicious food!"
104539: "Great food, so delicious. Great atmosphere. Friendly service."
104537: "Great food! Especially the big fish for two, the pink snapper is very tasty and fresh and the lamb chops are delicious."

Table 3
Extracted behavioural features.

Table 3 (continued)
where r_x, r_y ∈ R_a and x < y. The proposed ensemble model is presented in Algorithm 4.

Table 5
An optimal-performance architecture of the proposed models.

Table 6
Statistical information about the datasets after pre-processing.

Table 7
Classification report for the Yelp NYC and Yelp ZIP datasets.

Table 9
SHAP processing time for different dataset sizes using a single processor.

Table 10
Example of explanations for correct predictions of three review texts on the Yelp NYC dataset. We show the top important words in each review. Label 0 indicates a fake review, whereas label 1 indicates a genuine review. Attention values indicate to what extent these words are important for the prediction.
"went here with a my coworkers for lunch and this small cafe instantly won my heart and stomach. The food is delicious (we shared appetizers which were just great), the staff was friendly, service quick, music from great Pandora station."
"this early this year with my colleagues after work. very busy place but great services. All foods are just awesome!"
"This place is great. 3 people - 8-10 tapas plates. Every single one was a homerun. Sitting and watching the food being cooked made it that much better. Don't like eating at cramped tight spaces, but it made the experience more personal. Had to wait for about 1 h which was great because we were sent around the corner to Bar Jamon. Another GREAT experience."

Table 11
Example of explanations for correct predictions of three review texts on the Yelp NYC dataset. We show the top important sentences in each review. Label 0 indicates a fake review, whereas label 1 indicates a genuine review. Attention values indicate to what extent these sentences are important for the prediction.
"Don't miss this excellent italian restaurant if you are in town. Located in a post industrial building with redbrickstone, simple decor and glass see through kitchen, this place is a must be. Menu is rich and full of options and changes periodically. We had as a starter the green salad with a sort of feta cheese and oil and lemon temper... simply to die for. Pasta dishes are spectacular. Carbonara was just perfect. Well cooked al dente with the right amount of sauce and top quality tasty ingredients. Finally not to be missed their grilled chicken simple but excellent. Overall a special italian restaurant in Manhattan" (important sentences: this excellent italian restaurant; Well cooked al dente with the right amount of sauce and top quality tasty ingredients; Finally not to be missed their grilled chicken simple but excellent)
1: "Is the burger here delicious? Absolutely. Would I say it's the best burger in NY? No way. Sorry, but my heart still belongs to Shake Shack... or even Greasy Nicks up in Westchester. As a testimony, I'll say that I've lived by ... have to be? It's not like people will stop flocking to Corner Bistro because a waiter was curt. But for me personally, grouped with all my aforementioned qualms, I would prefer a more full dining experience than the one I've had at Corner Bistro. Of course, when it's 4 am on a weekend and I'm hungry, everything I've said goes totally out the window... oh well" (important sentences: Is the burger here delicious? Absolutely. Would I say it's the best burger in NY)
1: "An Australian Hamburger Joint! Great burgers, reasonable prices, and small quaint atmosphere. Had Lunch here and had the Bronte burger and it was cooked perfectly and was very savory."

Table 12
Example of explanations for correct predictions of three review texts on the Yelp ZIP dataset. We show the top important sentences in each review. Label 0 indicates a fake review, whereas label 1 indicates a genuine review. Attention values indicate to what extent these sentences are important for the prediction.
"Nice place, nice staff, very good place, a real place to go for fine food!"
"This place is pretty legit. The food was great (and actually made spicy!) and the ambiance is what one can expect in these tiny Thai places along 9th and 10th Ave. Seating is a bit awkward if you get placed at the small bar like area instead of at a table so make sure you're ok with having to turn to face your eating companion to talk. The set up seems particularly ideal for people dining alone so I may just try that next time!"

Table 13
Example of explanations for correct predictions of three review texts on the Yelp ZIP dataset. We show the top important words in each review. Label 0 indicates a fake review, whereas label 1 indicates a genuine review. Attention values indicate to what extent these words are important for the prediction.
"Mesob is ok. It has a great atmosphere and smells amazing, plus it's such a unique place to visit (where else do you eat with your hands besides a sports game?). But I've been to other Ethiopian restaurants, even in NJ, particularly New Brunswick. And I have to say, Mesob just doesn't cut it - it's like Spanish food to me. Really, I could season ground beef and say it's Ethiopian too. It's a neat little place to take someone visiting Montclair, but don't make it a habit. Drive or take the train to New Brunswick and try Makeda Ethiopian!"
"This place is fun and relatively inexpensive. The atmosphere is incredible and the photo booth in the back can make for some great fun. Just don't forget your photos in the machine.. The wait staff will throw em on the wall for all to see. (Made that "mistake" a couple times) I recommend: De La Hoya guacamole, Blood Orange Margarita, Pollo Annatto. I will admit, it is one of my favorite places in Philadelphia"
Example of explanation for correct predictions of three reviews text on Yelp ZIP dataset.We show the top important sentences in three reviews.The Label 0 indicates a fake review, whereas label 1 indicates a genuine review.Attention values indicate to what extent are these sentences important for prediction.Nice place, nice staff, very good place, a real place to go for fine food!"This place is pretty legit.The food was great (and actually made spicy!) and the ambiance is what one can expect in these tiny Thai places along 9th and 10th Ave.Â Seating is a bit awkward if you get placed at the small bar like area instead of at a table so make sure you're ok with having to turn to face your eating companion to talk.The set up seems particularly ideal for people dining alone so I may just try that next time!"Table 13 Example of explanation for correct predictions of three reviews text on Yelp ZIP dataset.We show the top important words in three reviews.The Label 0 indicates a fake review, whereas label 1 indicates a genuine review.Attention values indicate to what extent are these words important for prediction.Mesob is ok.Â It has a great atmosphere and smells amazing, plus it's such a unique place to visit (where else do you eat with your hands besides a sports game?).Â But I've been to other Ethiopian restaurants, even in NJ, particularly New Brunswick.Â And I have to say, Mesob just doesn't cut it -it's like Spanish food to me.Â Really, I could season ground beef and say it's Ethiopian too.Â It's a neat little place to take someone visiting Montclair, but don't make it a habit.Â Drive or take the train to New Brunswick and try Makeda Ethiopian!"This place is fun and relatively inexpensive.The atmosphere is incredible and the photo booth in the back can make for some great fun.Just don't forget your photos in the machine..The wait staff will throw em on the wall for all to see.(Made that ''mistake" a couple times) I recommend: De La Hoya guacamole Blood Orange Margarita Pollo Annatto I will admit, it is one of my favorite places in Philadelphia"