Monitoring Events of Market Competitors: A Text Mining Method for Analyzing Massive Firm-Generated Social Media

: Understanding how competitors act in a market is a critical component of strategic decision-making. In this paper, we propose a method to extract ﬁrm events from the textual content generated by ﬁrms in the market and explore the competitive relationships among ﬁrms based on the spatiotemporal homogeneity of events of different ﬁrms. To this end, we ﬁrst introduce experts to deﬁne a series of business events based on the content of corporate-generated texts; then, we propose algorithms to extract and enrich the feature words (triggers) of these business events to form better event classiﬁers. We subsequently use these classiﬁers to identify the business events recorded in all online texts published by companies. Finally, based on these results, we can obtain a sequence of activities/events for each ﬁrm in the market, which can be used to identify the evolutionary patterns of ﬁrms’ behavior in the market, as well as their potential competitive relationships. Considering that competition between companies in the market appears to be continuous at the strategic level, but the implementation of competitive behavior is expressed through their “events” in the market, identifying whether companies are “competing” in the market requires timely observation of the information about “events” in the market. However, obtaining accurate market information is complex and costly. Therefore, this study provides a way to bridge the gap between social media data and market competition “events”.


Introduction
Understanding market competition is important for firms, consumers of products, and investors [1]. In the literature on management, firms have adopted two main streams of approaches to competitor identification, i.e., the supply-based [2,3] and the demand-based approaches [2]. The former is based on the firms' attributes to divide the competitor, while the latter is based on the customers' attributes. The growing complexity of organization and market structures in modern industries brings new issues for managers in identifying competition; for example, it has been found that not all firms will take the same competitive action simultaneously [4]. Thus, another new research stream has been developed to identify competitive relationships between firms by judging the similarity of their activities in the market [2,5]. In this approach, the term competitors refers to a set of firms in a market that share similar behaviors/activities [4,6]. However, previous methods rely on the data collected from traditional sources, such as market research [7], which may be outdated and time-consuming to analyze manually [2,8]. Particularly, such data may lead to "managerial myopia" in identifying competitive relationships [8]. Therefore, a critical step in analyzing market competition is to acquire and analyze relevant data that can effectively facilitate scanning the activities of firms. In this study, we obtain data publicly posted by companies on social media and introduce text analysis techniques to extract events from them, thus providing timely observation data for identifying potential competition between companies.
With the rapid development of Internet technology, social media has been widely adopted in business environments [9]. For example, firms can use social media to deliver information [10], communicate with followers [11], and build relationships with consumers [12] and other organizations [13]. All the messages posted by firms on their official social media pages are firm-generated content [14]. As a result, microblog platforms, such as Twitter.com and Weibo.com, have provided abundant and timely firm-generated content. These contents are usually related to firms' business events [15], recording a series of events that happened in firms, and showing great value in the form of business strategies [16]. Therefore, it is a feasible way to collect data from social media and extract further information about business events in a market [17].
However, extracting business event information from massive microblog data presents several challenges [18]. First, we know that retrieving business events from massive online texts is a computer-aided computing task; however, there is no formal or precise definition of business events in the field of text information retrieval [19]. Second, the textual data crawled from websites are very noisy, and terms may be varied and ambiguous [20]. These problems make it necessary for us to present a well-designed approach to explore event information from natural language [21]. Along this line, the following questions thus should be carefully addressed: RQ1: How can we identify a business event from massive online text data? RQ2: How can we explore the event revolution of competitors in a market?
In this paper, we focus on competition at the level of the market behavior of firms, i.e., firms take similar market actions in a defined market, targeting the same users at the same time, in what is called competition (events). To this end, we present a framework to extract business events from massive microblog texts, and then, based on the extracted information, we propose a novel method to explore the event information among competitors in a market. Our contribution to the existing literature is threefold. First, we propose a method for extracting business events from massive firm-generated text. The method consists of three tasks: expert annotation of events in the seed text, feature word extraction, and feature word enrichment, so that we can build a better event classifier to recognize the events recorded in the text. Second, we propose a method to follow the evolution of various events within a particular company, which can be used to observe the behavior of companies in the market from the perspective of time. Third, by defining competitors in a market as a group of firms with similar actions/activities [4], we propose a simple and effective method to explore the competition between companies at the level of market event.
The paper is organized as follows. Section 2 gives an overview of the literature on business event detection and market intelligence with social media. Section 3 details the methods for collecting data, extracting business events, and exploring market competition. Section 4 summarizes the results. Section 5 identifies the managerial implications and the limitations. Section 6 summarizes this work.

Social-Media-Based Market Intelligence
One early work in market intelligence (MI) is the BrandPlus platform [22], which explores consumers' buzzwords about brands and companies. An automatic method by Netzer et al. [23] identified which brands are discussed in consumer forums for the markets of sedan cars and diabetes drugs. Xu et al. [24] focused on extracting comparative relationships from Amazon customer reviews. Wu et al. [25] developed a recommendation system based on the historical weblog posts of users. Recently, many scholars have conducted studies of information extraction from social media for business reasons, and most studies have been focused on the commercial value of enterprise microblogs, such as brand analysis [26], marketing promotion [27], and customer engagement [28]. For example, Onishi and Manchanda [29] specified a log-linear system for market outcomes (sales) and the volume of blogs, and their results suggested that new and traditional media act synergistically. The analysis results in Colicev et al. [30] show that user-generated content has a stronger relationship with awareness and satisfaction, while firm-generated content is more effective for consideration and purchase intent.
Although monitoring business processes has become a major practical concern [31], scant attention has been paid to firms' market behavior from the perspective of firmgenerated content. Unlike previous work, this study focuses on firms' business events in the market, which differs greatly from unorganized personal or social events in terms of their aims, contents, and objectives [32]. Business events may constitute some observable actions or circumstances (e.g., a new competitor's product) [33]. In particular, due to the nature of our particular context, these social news-based topics frequently change according to real-world events [21]. However, the literature does not provide a method to represent business events in an easy-to-calculate way [33], which motivates us to fill this gap with text mining technology.

Market Competition and Firm Action in a Market
In both academic and commercial literature, the main tool for explaining rivals' behavior is game theory models [34], in which it is presumed that all players use the same basic principles to take strategic actions in a market [35]. However, in the real world, game theory models become unwieldy when a competitor has many options for management actions or when there are multiple competitors, each of whom might react differently [36]. Therefore, it is equally important for market managers to know the actions of competitors in a timely manner so as to know their strategies.
In the literature on market competition, one consensus is to use market segmentation and category management strategies to differentiate market competition [37,38].
However, it has been shown that a firm may not recognize a potential competitor even if its action appears obvious in the market [39]. This is primarily because most companies rely on incomplete data, such as market research [7], to evaluate changes in a market. Particularly, such data may lead to a "managerial myopia" [40] in identifying competitive relationships [8].
There is a wealth of research in the literature on analyzing competition in the marketplace, and studies related to "observable events" in the marketplace can be briefly categorized as competitor identification, competitive positioning, dynamic competitive monitoring based on resources and capabilities, and product competitiveness analysis. Some examples of typical studies are listed in Table 1. We can briefly summarize the following characteristics of the work on market competition. Firstly, managerial myopia in identifying competitive threats is a well-recognized phenomenon. Secondly, broad competitor identification is an increasingly important task for managers, but there is no particularly efficient solution. Finally, online data have attracted the attention of managers, but the literature on its application to market competition is still scarce. Managers must be mindful to incorporate new information proactively from many sources and to actively discard old, automatic maps in order to develop reliable maps for changing environments [11][12][13][49][50][51] Dynamic competition identification; resource-based competition Concept model Develop a market-based and resource-based framework for broad competitor identification [14][15][16][52][53][54] Product competitive advantage analysis Data mining Identify competitors from UGC In addition to analyzing competitors from the perspective of market strategy, in this study, we believe that a critical step in analyzing market competition is to acquire and analyze relevant data that can effectively facilitate scanning the activities of firms. Now that firm-generated content on social media platforms provides us with rich data to observe the behavior of the enterprises [8], the tasks in analyzing market competition, therefore, are to explore the events of competitive firms from these data and to analyze the market competition.

Research Problem Statement
It is generally believed that competition among firms is characterized by negative interdependence [55]; that is, one side attaining its goal will decrease the probability of the other side successfully attaining it [56]. Accordingly, in this study, the competitors in a market are referred to as a set of firms that share similar actions/activities [4,6]. Therefore, the basic problem we want to address is to identify the behavioral events of firms from textual data and measure the similarity (or relevance) of their behavior.
Suppose that there are L firms F = {F 1 , F 2 , . . . , F l , . . . , F L } in a competitive market, and, at the observation time t ∈ [1, T], firm F ∈ F published a piece of media b F t (such as online news, a blog, a report, etc.) about a business event via social media. Here, b F t can be deemed as a combination of a series of terms: where w s represents a semantic term in b F t . Furthermore, we assume that there are K types of business events, E = {E 1 , E 2 , . . . , E k , . . . , E K }, and each event E ∈ E can be triggered by a set of featured terms: Given two enterprises F i and F j in the market F, if they simultaneously conduct the same business event of E ∈ E, then we believe that they have a competitive relationship regarding business event E. This competitive relationship often occurs when the two companies' target markets are the same [57]. In this way, the task of exploring market competition is then transformed to determine whether b F i t and b F j t contain the same business event, E. To this end, the research question can be specified as the following two tasks based on business event analysis from massive firm-generated content:

•
Obtaining the complete representation of E; that is, obtaining the total triggers (features) for each event E ∈ E; • Determining the semantic similarity between b In this work, we propose a feature annotation and enhancement (FAE) method to address the first task and then deal with the second task by determining the relevance of the reported business events between any two companies in the market. Figure 1 presents an overview of the FAE system for exploring business events from massive firm-generated content in social media.

FAE Method: Extracting Business Event
In Figure 1, the left path is the learning process, which is used to obtain the formation of a collection of business events. The middle path is the word-embedding process. It is used to represent the words in the corpus in an easy-to-calculate form, for example, word vectors [58]. The right path is the inference process. Its main function is to retrieve information on business events in firm-generated content, based on which we can further explore the competition in a market.  Figure 1 presents an overview of the FAE system for exploring business events from massive firm-generated content in social media. In Figure 1, the left path is the learning process, which is used to obtain the formation of a collection of business events. The middle path is the word-embedding process. It is used to represent the words in the corpus in an easy-to-calculate form, for example, word vectors [58]. The right path is the inference process. Its main function is to retrieve information on business events in firm-generated content, based on which we can further explore the competition in a market.

Word-Embedding Process
The word-embedding training method in NLP is used to find word representations that are useful for predicting the surrounding words in a sentence or document. Mathematically, word embedding is a process to obtain a vector representation of for word in the corpus: where is the j-th dimension value of the vector for a word , and D denotes the dimension of the trained vector. The benefit of word vector representation is that when we calculate the semantic relationship between any two words and in the corpus, it can be equivalently performed on the corresponding and .

Learning Process
In this subsection, we propose a semi-supervised approach to learn the formation (i.e., trigger set) of each business event ∈ ., which consists of three parts: annotating seed microblogs, extracting seed triggers, and enriching event triggers.

Word-Embedding Process
The word-embedding training method in NLP is used to find word representations that are useful for predicting the surrounding words in a sentence or document. Mathematically, word embedding is a process to obtain a vector representation of vec(w i ) for word w i in the corpus: where w ij is the j-th dimension value of the vector for a word w i , and D denotes the dimension of the trained vector. The benefit of word vector representation is that when we calculate the semantic relationship between any two words w i and w j in the corpus, it can be equivalently performed on the corresponding vec(w i ) and vec w j .

Learning Process
In this subsection, we propose a semi-supervised approach to learn the formation (i.e., trigger set) of each business event E ∈ E, which consists of three parts: annotating seed microblogs, extracting seed triggers, and enriching event triggers.
In the process of annotating seed microblogs, we invited some experts to identify K types of business events in the market and annotate a fixed number of microblog posts as the seed texts for each event E k , k ∈ [1, K]. Such a set of microblog text is denoted by SeedBlog(E k ).
In online documents, an event trigger is a key term that most clearly expresses the exact meaning of a business event [59], and most of these key terms are verbs or nominalizations [60]. Accordingly, in the process of extracting seed triggers, we segmented all of the microblogs in SeedBlog(E k ) into terms and then focused on the verbal terms in SeedBlog(E k ) [14]. Along this line, we sorted the terms in SeedBlog(E k ), and asked experts to evaluate the verbs based on their contributions to event E k . As such, we can sift out a set of initial seed triggers from SeedBlog(E k ) for E k .
Based on the annotation results (i.e., the event E k and its initial seed triggers), we can further implement the part of enriching event triggers. To this end, we first obtained the word vector for each initial trigger v ∈ E k and then used the following method to represent the vector of E k : where t f v k j and vec v k j denote the term frequency and the vector representation of v k j , respectively.
Next, to diminish the biased nature of the experts' annotations, we also considered the selection of new words from outside the experts' annotations to enhance the features in the trigger. For this purpose, we considered that the words that can help characterize the event E k can be those that have a strong semantic similarity with E k (measured by COS) or those that interact strongly with the seed words in E k (measured by PMI). Given a candidate word w, we can calculate the similarity between w ∈ B(w / ∈ E k ) and E k based on the following semantic-contextual similarity: represents the semantic similarity between w and E k , in which vec(w) and vec(E k ) denote the vector representation of term w and behavior measures the co-occurrence relationship (contextual similarity) between in w and existing triggers in E k , in which is a smoothing coefficient. According to (5), if the value of SI M(w, E k ) is very high, then w is likely to be a new trigger for E k . In this way, the trigger set of E k is enriched to E k {w}.

Inference Process
Given a microblog text b t , its vector representation can be specified as: where t f w j denotes the term frequency of w j . Accordingly, we can introduce a function f (b t , E k ) to predict whether an event E k was reported in b t or not by calculating the similarity between vec(b t ) and E k . Therefore, b t can be considered to be a piece of host text for an event E k * if the following relationship holds: Here, the parameter µ is a predefined threshold. If the value of f (b t , E k ) is lower than the threshold of µ, we label b t as "NULL", which means there is no event contained in b t ; otherwise, b t will be labeled with the k * -th event in accordance with the maximum value of f (b t , E k ).

Method for Exploring Market Competition
To explore competition among companies in the market in the time period of t ∈ [1, T], we collected all the content generated by F j ∈ F, and then introduced the FAE method to identify the event information in each of the content. As a result, the event information identified from the contents published by all companies in the market can be represented along the timeline t as a firm-event matrix, as shown below: where the variable ψ E k (j, t) in Equation (8) denotes the occurrences of a given business event E k in firm F j at time t.
In the market, the sequence of an event E k conducted by a firm F j can be described as a . Then, the event-based competition between firms F i and F j can be measured by their Pearson correlation coefficient: Similarly, at time t, all the events conducted by firm F j can be described as a ran- Then the time-based competition between F i and F j can be specified as: Obviously, method comp E k F i , F j measures the competition between two firms regarding a business event E k , and method comp t F i , F j measures the competition between two firms at time t.
Further, along the timeline, we concatenate all the event sequences of firm F j as follows: In this way, we can propose a method to characterize the overall competition situation of F i and F j in the market as follows: In a competitive market, competition among enterprises is characterized by negative interdependence [55]; that is, one side attaining its goal will decrease the probability of the other side successfully attaining their goal [56]. Accordingly, Equation (11) can be used as a feasible metric to monitor the event-based market competition.

Data Collection
In this study, our focal market is the mobile phone production industry in China. We collected data from a leading online social media site, Weibo.com, to test the proposed method. Weibo.com is one of the most influential providers of microblogging services in China, and it is similar to Twitter, providing user services to create content and manage their accounts with access available from information devices (e.g., PCs and mobile phones). The dominance of Weibo in China's social media websites makes it a good choice for retrieving late-breaking news.
In the industry channel of Weibo.com, there are 121 mobile phone manufacturers in China that have established Weibo accounts. We crawled all of the microblog documents (denoted as "Weibo") posted by these manufacturers. Each crawled Weibo mainly contains the following information: account ID, authentication information of the firm, time, data source, Weibo text (main information), reading number, etc. Note that Weibos that contain only picture(s) or only forward link(s) were discarded as invalid messages. Overall, a total number of 436,310 Weibos were crawled online for the study.
Following the general method in the literature of NLP [61], we truncated the Weibo texts by the punctuations in sentences and then adopted a Chinese NLP tool named Jieba (https://github.com/fxsjy/jieba, accessed on 20 January 2020) to execute the tasks of word segmentation and POS tagging simultaneously.
Some statistics about the dataset are summarized in Table 2.

Business Events in Firm-Generated Content
Firstly, two experts in the field of business/marketing management were introduced to label the events in the Weibos. The annotation results were mainly based on the experts' understanding of market management, the verbal term(s) of a Weibo, and consistency with previous research results [33,62].
Experts labeled five main business events as Recruiting, Cooperation, Research, Promotion, and Sale in our dataset. In order to achieve a balanced distribution of data, the experts annotated 600 seed Weibos for each of the five events. Note that we also asked experts to label 600 NULL event data so that the Weibos containing business events can be significantly distinguished from those Weibos without any event information. Some sample triggers extracted for each event are given in Table 3. Apparently, the sample verbs gathered under the same event are semantically similar. Next, we carried out trigger enrichment to obtain a better set of event triggers. The calculation results are shown in Table 4 below.

Performance of FAE
This section evaluates the performance of our model in detecting business events from massive firm-generated content in online social media. In the experiments, the FAE method requires setting two parameters. The hyperparameter in Equation (5) is set to λ = 1; the threshold in Equation (7) is set to µ = 0.7. The other parameters are set to the default values of the Python experimental environment.
The following experiments are based on two basic aspects: firstly, FAE is a featureextraction-oriented approach, so comparative experiments were carried out between some classical feature representations in the NLP literature and FAE. These methods for feature representation are uni-, bi-and tri-grams [63], TF-IDF [61,64], and the state-of-art method BERT [65]. Secondly, a statistical exploration experiment found that the distribution of different types of business events in our dataset is uneven (see Figure 2). The latest research shows that the random forest (RF) algorithm performs well in the text classification task in unbalanced datasets [66], so we used RF as the classification algorithm in the experiment. In addition, in order to compare the efficiency of different algorithms, we invited three experts to label an additional 3600 tweets as a test set to evaluate the different models, and the differences between experts were resolved by voting.
Next, we carried out trigger enrichment to obtain a better set of event triggers. The calculation results are shown in Table 4 below.

Performance of FAE
This section evaluates the performance of our model in detecting business events from massive firm-generated content in online social media. In the experiments, the FAE method requires setting two parameters. The hyperparameter in Equation (5) is set to λ = 1; the threshold in Equation (7) is set to µ = 0.7. The other parameters are set to the default values of the Python experimental environment.
The following experiments are based on two basic aspects: firstly, FAE is a featureextraction-oriented approach, so comparative experiments were carried out between some classical feature representations in the NLP literature and FAE. These methods for feature representation are uni-, bi-and tri-grams [63], TF-IDF [61,64], and the state-of-art method BERT [65]. Secondly, a statistical exploration experiment found that the distribution of different types of business events in our dataset is uneven (see Figure 2). The latest research shows that the random forest (RF) algorithm performs well in the text classification task in unbalanced datasets [66], so we used RF as the classification algorithm in the experiment. In addition, in order to compare the efficiency of different algorithms, we invited three experts to label an additional 3600 tweets as a test set to evaluate the different models, and the differences between experts were resolved by voting. The feature settings regarding the comparison methods are defined as follows: (1) an n-gram is a sequence of n adjacent elements from a string of tokens, which typically are Chinese words or characters; (2) all the methods (except BERT) firstly represent the terms The feature settings regarding the comparison methods are defined as follows: (1) an n-gram is a sequence of n adjacent elements from a string of tokens, which typically are Chinese words or characters; (2) all the methods (except BERT) firstly represent the terms in the corpus into a high-dimensional feature space and then use feature selection [67] to select 4011 features; (3) the focus of the comparison in the experiments is on the representation methods of the text, so BERT does not consider the fine-tuning approach.
In addition, we adopted the three measures of precision, recall and F-value, which are widely adopted in the NLP literature, to evaluate the performance of the different representation methods in detecting business events:

Precision =
Number of correctly identified events Total number of identified events (12) Recall = Number of correctly identified events Number of true events labeled by experts (13) and Using the aforementioned representations as input in the RF algorithm, the performance of different representations is shown in Figure 3. in the corpus into a high-dimensional feature space and then use feature selection [67 select 4011 features; (3) the focus of the comparison in the experiments is on the repres tation methods of the text, so BERT does not consider the fine-tuning approach.
In addition, we adopted the three measures of precision, recall and , wh are widely adopted in the NLP literature, to evaluate the performance of the differ representation methods in detecting business events: Using the aforementioned representations as input in the RF algorithm, the per mance of different representations is shown in Figure 3.  Figure 3 illustrates the average performance of all feature representation methods our text dataset. Some general results can be summarized as follows. First, our semipervised method of FAE outperformed the comparison methods when identifying b ness events from textual social media. Second, the tri-gram representation method h the worst performance in the experiment. What is more surprising is the results of BE although its performance is not bad, it is still not as effective as our method.

Overall Competitive Landscape in the Market
Because the operating time span of the company's official Weibo account is not same, we selected firms that had published more than 1000 Weibos from January 2018 June 2019 (the time period for exploring competition) to carry out the experiment. Th were 44 Weibo accounts that met this requirement. We calculated the correlation of th  Figure 3 illustrates the average performance of all feature representation methods on our text dataset. Some general results can be summarized as follows. First, our semisupervised method of FAE outperformed the comparison methods when identifying business events from textual social media. Second, the tri-gram representation method had the worst performance in the experiment. What is more surprising is the results of BERT; although its performance is not bad, it is still not as effective as our method.

Overall Competitive Landscape in the Market
Because the operating time span of the company's official Weibo account is not the same, we selected firms that had published more than 1000 Weibos from January 2018 to June 2019 (the time period for exploring competition) to carry out the experiment. There were 44 Weibo accounts that met this requirement. We calculated the correlation of these 44 firms as follows. We first constructed a complete match on these 44 firms (each company must match with the remaining firms); then, we calculated the correlation coefficient of the business events of the two companies in the same pair. The calculation results are presented in Figure 4, where the values in color indicate that the correlation between business events between the two companies is above 0.7. Specifically, the values in the red, yellow and green backgrounds are in the intervals of (0.9, 1], (0.8, 0.9], (0.7, 0.8], respectively. 44 firms as follows. We first constructed a complete match on these 44 firms (each company must match with the remaining firms); then, we calculated the correlation coefficient of the business events of the two companies in the same pair. The calculation results are presented in Figure 4, where the values in color indicate that the correlation between business events between the two companies is above 0.7. Specifically, the values in the red, yellow and green backgrounds are in the intervals of (0.9, 1], (0.8, 0.9], (0.7, 0.8], respectively. We can see from Figure 4 that, in the targeted time interval, the competitive relationship between enterprises is still very fierce. After pairing the 44 firms, it can be seen that more than 25.7% of the paired relationships have a high correlation value (greater than 0.7) of their business events. Among them, the competitive behavior of the firm "IVVI" is quite incredible, and its business events have a very high correlation (greater than 0.7) with 48% of other firms' events. This is followed by KuNuo and AiJieMO; their business events are also highly related to the business events of more than 40% of other firms. On the contrary, social media information released by XiaoLaJiao suggests it is the least willing to compete in the market. The business events it expresses in the market are extremely low (less than 0.3) in comparison to 50% of other businesses' events. The company that followed XiaoLaJiao was ZUK, whose business events were unrelated to 45% of the other firms' events

Exploring Behavior-Based Competition
In order to better observe the micro-competitive relationship between firms, we selected 9 companies with the top market share among China's mobile handset manufacturers (i.e., Huawei, Xiaomi, Meizu, ZTE, Lenovo, Coolpad, OPPO, Vivo, and HTC) to show their business behaviors as well as their competition relationships on the market. The results are shown in Figure 5  We can see from Figure 4 that, in the targeted time interval, the competitive relationship between enterprises is still very fierce. After pairing the 44 firms, it can be seen that more than 25.7% of the paired relationships have a high correlation value (greater than 0.7) of their business events. Among them, the competitive behavior of the firm "IVVI" is quite incredible, and its business events have a very high correlation (greater than 0.7) with 48% of other firms' events. This is followed by KuNuo and AiJieMO; their business events are also highly related to the business events of more than 40% of other firms. On the contrary, social media information released by XiaoLaJiao suggests it is the least willing to compete in the market. The business events it expresses in the market are extremely low (less than 0.3) in comparison to 50% of other businesses' events. The company that followed XiaoLaJiao was ZUK, whose business events were unrelated to 45% of the other firms' events.

Exploring Behavior-Based Competition
In order to better observe the micro-competitive relationship between firms, we selected 9 companies with the top market share among China's mobile handset manufacturers (i.e., Huawei, Xiaomi, Meizu, ZTE, Lenovo, Coolpad, OPPO, Vivo, and HTC) to show their business behaviors as well as their competition relationships on the market. The results are shown in Figure 5 below.
The results in Figure 5 illustrate the proportion of the same action taken in different companies in the market. In each sub-figure, the value of the y-axis indicates the proportion of a particular event that happened in a company. Therefore, the larger the value on the y-axis, the greater the event of the corresponding company.  The results in Figure 5 illustrate the proportion of the same action taken in different companies in the market. In each sub-figure, the value of the y-axis indicates the proportion of a particular event that happened in a company. Therefore, the larger the value on the y-axis, the greater the event of the corresponding company.
Generally, if different firms adopt similar strategies, it can be expected that their behavior in the market should also be similar. The evolution lines in Figure 5a,b are relatively dense and show little variation, which indicates that the firms competed fiercely in the two behaviors of Promotion and Sale. This finding is interesting for the results of Research and Cooperation events: since mobile phone manufacturing is considered a high-tech industry in China, for example, the Huawei Company is likely to show the public that it is very active at Research and Cooperation events. Generally, if different firms adopt similar strategies, it can be expected that their behavior in the market should also be similar. The evolution lines in Figure 5a,b are relatively dense and show little variation, which indicates that the firms competed fiercely in the two behaviors of Promotion and Sale. This finding is interesting for the results of Research and Cooperation events: since mobile phone manufacturing is considered a hightech industry in China, for example, the Huawei Company is likely to show the public that it is very active at Research and Cooperation events. Figure 5d also highlights the fact that Huawei and ZTE had some confrontations when implementing their Cooperation events. As seen in the figure, the trend and direction of their Cooperation events moved in almost the opposite direction during the same time period (see Figure 5d; the blue line indicates Huawei events, and the green line indicates ZTE events).
In Figure 5b, the competition related to the variable Sale is more complicated. Overall, OPPO and Meizu have a strong preference for using the competitive behavior of Sale. Interestingly, Vivo, Lenovo, and Xiaomi also preferred the Sale strategy in the second half of 2018. HTC and Lenovo gradually reduced the proportion of Sale behavior after November 2018. In contrast, Meizu continuously strengthened its Sale behavior after January 2019. Figure 6 illustrates the evolving chain of the selected 9 companies on the 5 business behaviors (see Table 4). The observed time interval is from January 2018 to May 2019, with a total of 17 (continuous) time slices. In each sub-figure, the value on the y-axis indicates the strength of the corresponding event at a time slice, and the value on the x-axis indicates the persistence of the evolution of an event.

Exploring Time-Based Competition
J. Theor. Appl. Electron. Commer. Res. 2023, 17, FOR PEER REVIEW 13 Figure 5d also highlights the fact that Huawei and ZTE had some confrontations when implementing their Cooperation events. As seen in the figure, the trend and direction of their Cooperation events moved in almost the opposite direction during the same time period (see Figure 5d; the blue line indicates Huawei events, and the green line indicates ZTE events).
In Figure 5b, the competition related to the variable Sale is more complicated. Overall, OPPO and Meizu have a strong preference for using the competitive behavior of Sale. Interestingly, Vivo, Lenovo, and Xiaomi also preferred the Sale strategy in the second half of 2018. HTC and Lenovo gradually reduced the proportion of Sale behavior after 11-2018. In contrast, Meizu continuously strengthened its Sale behavior after 01-2019. Figure 6 illustrates the evolving chain of the selected 9 companies on the 5 business behaviors (see Table 4).  On the basis of the analysis of the events of these companies and their evolution chains, we identify some "strategic groups" [68]. First of all, with the exception of Huawei and ZET, the most positive business event for companies is Promotion. This shows that firms have attached great importance to the role of Weibo, a social media platform, in promotion. Second, the event lines of ZTE, Huawei, and Xiaomi were more concentrated, indicating that they were relatively balanced when implementing these five business events. Third, Huawei and ZTE were more willing to show strong and continuous Cooperation and Research behaviors in the market. Interestingly, Coolpad was quiet in the market for the time slices of January 2019 to March 2019. HTC suddenly became quite active in the market after September 2018, which was a remarkable phenomenon for a marketer.

Exploring Time-Based Competition
From the perspective of market competition, in addition to ZET and Huawei, the remaining firms used Promotion as the main event published on social media. In particular, Lenovo, CoolPad, Vivo, Xiaomi, and Meizu have been in strong competition in terms of market promotion.

Generalizability of the Study
We were also concerned with the generalizability of this study. We addressed this issue from the following two aspects. First, this paper presented a complete solution to the problem of organizational event detection from massive amounts of short texts (see Figure 1). The solution consisted of several steps, and in each step, we presented or adopted some ad hoc or heuristic techniques. Technically, these methods can be commonly used in NLP tasks. Second, we expanded our approach to another dataset, in which 48,595 abstracts of business articles were crawled from a website.
In particular, to impose our method on the new dataset, at first, experts labeled 3800 abstracts as the seed texts, and we annotated four types of business events, Marketing, Financing, Mergers and Acquisitions (MA), and Stock Market Performance (SMP). Then, we used the generated events to monitor the evolution of business events between January 2010 and February 2018 (in the Chinese market) for four Internet companies, namely Alibaba, JD, Amazon (China), and eBay (China). The results are shown in Figure 7a-d. It is particularly evident that the main marketing actions of the four companies listed, as major e-commerce platforms, are focused on marketing events. Furthermore, the competition events of Marketing, Financing, MA, and SMP are shown in Figure 8a-c, respectively. As can be seen in Figure 8a, Marketing events in the market were adopted primarily by Amazon, followed by Alibaba and JD. The Marketing events of Alibaba, JD, and Amazon had different trends during the same time period, which indicated that these three companies were trying to avoid positive confrontation. In addition, it is worth mentioning that eBay is far less active in the market for all events than the three aforementioned companies. This is also in line with the fact that eBay has faded out of the Chinese market due to its poor performance during the time period presented in the data.
In conclusion, these experimental results indicated that the proposed method, i.e., FAE, in this paper works well on the new dataset.

Research Implications
The value of the proposed method in this work lies in the use of efficient text mining methods to obtain the appropriate representation of a firm's events from massive data and in bridging the gap between massive social media and latent market information for decision-making, in particular in the marketing area. For example, managers can develop optimal marketing strategies based on the extracted events implemented by competitors.
Thus far, studies for exploring the business value of social media have been focused mainly on using intelligence methods for brand/product marketing [21,27], customer analysis [9], and their mixture [10]. Researchers, however, have also realized that MI based on social media analysis provides "competitive intelligence" by helping businesses understand their environments, competitors, and overall business trends [69]. Unfortunately, no definitive research results tell us how these wonderful things can happen. This work proposes a method to fill the application gap by revealing competitive information from massive firm-generated content in online social media.
Thus, the contribution of this paper is to provide intelligent methodological ideas on the application of large amounts of social data to management practices. In fact, for not only the competition problem but any behavioral correlation between firms that can be found in the data, the approach proposed in this paper can provide ideas for its analysis and discussion.
The prior literature is rich in studies of firm competition theory, and these studies have improved our understanding of how firms compete with each other in the marketplace. However, competition among firms in the market is very complex and usually changes and evolves over time, regions, user groups, etc. This leads to a tendency to bias (misidentification) and omission (ignoring competitors) in identifying firms' competitive relationships from traditional managerial dimensions, and knowledge about inter-firm competitive relationships is limited or even ambiguous if not analyzed from the perspective of firms' market actions (events). The research in this paper fills this gap by refining firms' market actions and constructing associations between these actions and competition. market promotion.

Generalizability of the Study
We were also concerned with the generalizability of this study. We addressed this issue from the following two aspects. First, this paper presented a complete solution to the problem of organizational event detection from massive amounts of short texts (see Figure 1). The solution consisted of several steps, and in each step, we presented or adopted some ad hoc or heuristic techniques. Technically, these methods can be commonly used in NLP tasks. Second, we expanded our approach to another dataset, in which 48,595 abstracts of business articles were crawled from a website.
In particular, to impose our method on the new dataset, at first, experts labeled 3800 abstracts as the seed texts, and we annotated four types of business events, Marketing, Financing, Mergers and Acquisitions (MA), and Stock Market Performance (SMP). Then, we used the generated events to monitor the evolution of business events between 01-2010 and 02-2018 (in the Chinese market) for four Internet companies, namely Alibaba, JD, Amazon (China), and eBay (China). The results are shown in Figure 7a-d. It is particularly evident that the main marketing actions of the four companies listed, as major e-commerce platforms, are focused on marketing events. Furthermore, the competition events of Marketing, Financing, MA, and SMP are shown in Figure 8a-c, respectively. As can be seen in Figure 8a, Marketing events in the market were adopted primarily by Amazon, followed by Alibaba and JD. The Marketing events of Alibaba, JD, and Amazon had different trends during the same time period, which indicated that these three companies were trying to avoid positive confrontation. In addition, it is worth mentioning that eBay is far less active in the market for all events than the three aforementioned companies. This is also in line with the fact that eBay has events of Alibaba, JD, and Amazon had different trends during the same time period, which indicated that these three companies were trying to avoid positive confrontation. In addition, it is worth mentioning that eBay is far less active in the market for all events than the three aforementioned companies. This is also in line with the fact that eBay has faded out of the Chinese market due to its poor performance during the time period presented in the data.

Managerial Implications
On the basis of our proposed FAE method, we effectively extracted the business events from massive firm-generated contents, which is useful information for marketers. The foremost application of our work is the exploration of competitive behavior through open social media, which enables marketers to summarize their competitors' online microblogs with salient competitive events. In particular, the method proposed in this work can be used to discover and visualize the temporal evolution of business behaviors as they overlap or can be discriminated from those of competitors. As such, marketers can further analyze the motivations for companies to adopt the same (or different) strategies in the same period with the help of organizational behavior theories. In this way, this framework can enable marketers to effectively represent the trend of corporate-industry behavior to avoid potential decision-making risks.
In addition to identifying firms' events, our approach also harvests rich event triggers. This information can be used by investors to subdivide the (semantic) differences at the event level when different companies follow the same behavior. In particular, by mining the evolution information of firms' events over a long period of time, investors can better identify major changes in the management of the business as well as false behaviors in the market. For example, some traditional enterprises may have exhibited unreasonable investment behaviors in the Internet industry during the Internet boom.

Limitations
We are aware of several limitations of this study. First, the selection bias between experts may have resulted in different annotations on the same seed microblogs. To reduce the impact of this difference on the final result, we should be more precise in the steps of labeling, such as improving the capabilities of experts, and we should design strategies to cross-validate the results of these different experts. Second, it would be interesting to extend the event detection method to characterize firm competition. We believe, however, that future research of theoretical analysis will be key to understanding why firms undertake the actions that we observed and how those actions may have affected their performance.
In addition, if the corpus is obtained from a limited number of websites, the method proposed in this paper may be faced with the problem of information completeness for identifying a firm's events. Future work should benefit from an enhanced corpus to address this limitation.
Finally, there is no theoretical depth in the discussion of corporate market competition. The initial idea of this paper is to use intelligent methods (big data processing) and new data sources (social media) to explore the competitive behavioral relationships between firms. However, while the "event" information from the large amount of data analyzed provides a clue for analysis, it requires further theoretical discussion as to whether it can truly account for the competitive relationships that exist between firms. This is the part of our future work that needs to be strengthened.

Conclusions
Social media data contain valuable information that can be used in decision-making. The challenges of a lack of definition, mixed topics, and various types of documents, however, have prevented the efficient extraction of business events from massive amounts of social media data.
To address these challenges, theoretically, in contrast to the traditional view of analyzing firms' competitive relationships based on market segmentation and strategy, we propose that firms' competitive market strategies may be continuously planned, but at the implementation level, they can only be reflected by their targeted "events" in the marketplace. Moreover, from a practical point of view, we demonstrate that the online social media content generated by firms is timely data reflecting their "behavior" in the marketplace. The proposed method has two main parts. The first involves crawling textual data online, labeling event types, and enriching event triggers. It results in a set of extracted events as well as their triggers, which can then be used as a classifier to identify event information from texts. The second mainly includes identifying business event information from massive firm-generated content and exploring the potential competitive relationship between any two enterprises in the market by organizing their extracted business events into a firm-event matrix. The experimental results with real data extracted from Weibo.com show that the method proposed in this work can efficiently extract business event information from social media, and such information may enable marketers to summarize their competitors' online microblogs with salient competitive events. Informed Consent Statement: Informed consent was obtained from all subjects involved in the study. Data Availability Statement: Not applicable.