Care more about customers: Unsupervised domain-independent aspect detection for sentiment analysis of customer reviews
Introduction
With the rapid growth of user-generated content on the internet, the number of customer reviews that a product or service receives is increasing quickly. Many websites, blogs and forums (e.g., www.amazon.com, rottentomatoes.com, epinions.com) allow customers to post opinions about a variety of products or services. This online word-of-mouth behavior introduces a new and important source of information for business intelligence and marketing. In other words, customer reviews are essential to other potential customers, retailers and product manufacturers (potential users) in their efforts to understand the general opinions of customers and to make better decisions. As the number of customer reviews expands, it becomes very hard for users to obtain a comprehensive view of previous customers' opinions about various aspects of products through manual analysis. Proper analysis and summarization of customer reviews can enable potential users to visualize previous positive and negative opinions about specific features or aspects of products. It is therefore highly desirable to produce an automatic analysis or summary of customer reviews.
For the past few years, sentiment analysis (or opinion mining) for online customer reviews has attracted a great deal of attention from researchers in data mining and natural language processing [1], [3], [5], [7], [8], [11], [9], [24], [25], [27], [33].
Sentiment analysis is a type of text analysis under the broad area of text mining and computational intelligence. Three fundamental problems in sentiment analysis are: aspect detection, opinion word detection and sentiment orientation identification [24], [27], [33].
Aspects are topics on which opinions are expressed. In the field of sentiment analysis, aspects are also called features, product features or opinion targets [3], [5], [7], [8], [6], [12], [24], [27], [33]. Aspects are important because, without knowing them, the opinions expressed in a sentence or a review are of limited use. For example, in the review sentence “after using iPod, I found the size to be perfect for carrying in a pocket”, “size” is the aspect for which an opinion is expressed. Aspect detection is likewise critical to sentiment analysis, because its effectiveness dramatically affects the performance of opinion word detection and sentiment orientation identification. Therefore, in this study we concentrate on aspect detection for sentiment analysis.
Existing aspect detection methods can broadly be classified into two major approaches: supervised and unsupervised. Supervised aspect detection approaches require a set of pre-labeled training data. Although supervised approaches can achieve reasonable effectiveness, building sufficient labeled data is often expensive and requires much human labor. Since unlabeled data are generally publicly available, it is desirable to develop models that work with unlabeled data. Additionally, due to the variety and wide range of products and services being reviewed on the internet, supervised, domain-specific or language-dependent models are often hard to apply. We therefore conclude that a framework for aspect detection must be robust and easily transferable between domains or languages.
In this paper, we present a novel unsupervised model which addresses the core tasks necessary to detect explicit and implicit aspects from review sentences in a sentiment analysis system. Our model differs from existing techniques in that it requires no labeled training data or additional information, not even initial seed information. The model can therefore easily be transferred between domains or languages. The proposed model is based on the observation that there is inter-relation information between the aspects in reviews. Inter-relation information is the probability of the co-occurrence of two aspects in a review. The model therefore explores the review dataset by using both frequency-based and inter-relation information to find the aspects. Furthermore, we have found that opinion words and aspects are themselves related in opinionated sentences. Finally, the model uses the extracted explicit aspects and opinion words to detect implicit aspects.
In the remainder of this paper, Section 2 gives a definition of aspect-level sentiment analysis, and Section 3 gives a detailed discussion of existing work on aspect detection. Section 4 describes the proposed aspect detection model for sentiment analysis, including the overall process and specific aspects of the design of the workflow. Subsequently, we describe our empirical evaluation and discuss the major experimental results in Section 5. Finally, we conclude with a summary and some future research directions in Section 6.
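The two statistics mentioned above, aspect frequency and inter-relation information, can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `aspect_statistics`, the toy reviews, and the choice to estimate the co-occurrence probability as the fraction of reviews containing both aspects are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def aspect_statistics(reviews):
    """Return (frequency, co_occurrence) for candidate aspects.

    `reviews` is a list of lists of candidate aspect strings,
    one inner list per review (a hypothetical input format).
    """
    n_reviews = len(reviews)
    frequency = Counter()    # number of reviews mentioning each aspect
    pair_counts = Counter()  # number of reviews mentioning both aspects of a pair
    for aspects in reviews:
        unique = sorted(set(aspects))  # count each aspect once per review
        frequency.update(unique)
        pair_counts.update(combinations(unique, 2))
    # Assumed estimator: share of reviews in which both aspects appear.
    co_occurrence = {pair: count / n_reviews for pair, count in pair_counts.items()}
    return frequency, co_occurrence


# Hypothetical toy data: candidate aspects already extracted per review.
reviews = [
    ["size", "battery", "screen"],
    ["battery", "screen"],
    ["size", "price"],
    ["battery", "price", "screen"],
]
frequency, co_occurrence = aspect_statistics(reviews)
print(frequency["battery"])                  # 3
print(co_occurrence[("battery", "screen")])  # 0.75
```

Pairs are stored as alphabetically sorted tuples, so each unordered pair has a single key. In this toy data "battery" and "screen" co-occur in three of the four reviews, which is the kind of signal that could support both being genuine aspects.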
Section snippets
Aspect-level sentiment analysis
Opinions can be expressed about anything, e.g., a topic, a product, a service, an individual, an event, an organization or any attribute of these. Hence we use the notion of aspect to denote the target object that has been evaluated. An opinion (as expressed by means of opinion words) is a positive or negative sentiment, attitude, emotion or appraisal about an aspect. Positive and negative are called sentiment or opinion orientations [10], [6]. In general there are two types of reviews:
Related works
Several methods have been proposed, mainly in the context of product review mining, across a broad range of study fields, from document-level to aspect-level sentiment analysis for standard, ironic or spam reviews [3], [7], [8], [6], [12], [16], [21], [27], [33], [18], [19]. In the review mining task, aspects usually refer to opinion targets and product features, which are defined as product components or attributes. Existing aspect and product feature extraction techniques use both supervised and
Aspect detection model for sentiment analysis
Fig. 3 gives the architectural overview of the proposed model used for detecting explicit and implicit aspects in sentiment analysis. The basic hypotheses of this model are that frequency-based and inter-relation information about the aspects should be used together, that the influence of an opinion word in the review sentence should be exploited, and that multi-word aspects should be given more importance. The model demonstrates that using these hypotheses together attains highly effective results for product aspect extraction.
The
Experimental results
In this section we discuss the experimental results for the proposed model and the presented algorithms. To report the effectiveness of our model, we first evaluate the results for each individual step of the model, and then compare the results with the benchmark results of Wei et al. [27] and Somprasertsri and Lalitrojwong [21]. Finally, we discuss the identification of implicit aspects. In the following, data collection, evaluation measures and important evaluation results will be
Conclusions
In this research we study sentiment analysis and opinion mining for online reviews. When mining online reviews, it is often expensive and time-consuming to construct labeled data for training purposes, so it is desirable to develop a model or algorithm that works without labeled data. In this paper we therefore proposed an unsupervised domain- and language-independent model for detecting explicit and implicit aspects from the reviews. The proposed model is able to deal with three
Acknowledgments
The authors thank Dr. Djoerd Hiemstra for his invaluable comments and suggestions and gratefully acknowledge the hospitality offered to the first author by the Human Media Interaction (HMI) group at the University of Twente. The research of the last author of this paper is partially supported by the Dutch National FES Program COMMIT.
References (35)
- Detecting implicit expressions of emotion in text: a comparative analysis. Decision Support Systems (2012)
- From humor recognition to irony detection: the figurative language of social media. Data & Knowledge Engineering (2012)
- Making objective decisions from subjective data: detecting irony in customers reviews. Decision Support Systems (2012)
- Developing corpora for sentiment analysis and opinion mining: the case of irony and Senti-TUT. IEEE Intelligent Systems (2013)
- An unsupervised aspect-sentiment model for online reviews
- Accurate methods for the statistics of surprise and coincidence. Computational Linguistics (1993)
- Multi-aspect sentiment analysis for Chinese online social reviews based on topic modeling and HowNet lexicon. Knowledge-Based Systems (2013)
- A statistical approach to star rating classification of sentiment. Management Intelligent Systems (2012)
- Mining opinion features in customer reviews
- Mining and summarizing customer reviews
- Opinion observer: analyzing and comparing opinions on the web
- A survey of opinion mining and sentiment analysis. Mining Text Data
- Weakly supervised joint sentiment-topic detection from text. IEEE Transactions on Knowledge and Data Engineering
- Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics
- ILDA: interdependent LDA model for learning latent aspects and their ratings from online product reviews
- Automatic term recognition based on statistics of compound nouns and their components. Terminology