Contextual polarity and influence mining in online social networks

Crowdsourcing is an emerging tool for collaboration and innovation platforms. Recently, crowdsourcing platforms have become a vital tool for firms to generate new ideas, especially large firms such as Dell, Microsoft, and Starbucks. Crowdsourcing provides firms with multiple advantages, notably rapid solutions, cost savings, and a variety of novel ideas that represent the diversity inherent within a crowd. The literature on crowdsourcing is limited to empirical evidence of the advantage of crowdsourcing for businesses as an innovation strategy. In this study, Starbucks’ crowdsourcing platform, Ideas Starbucks, is examined with three objectives. The first is to determine how crowdsourcing participants perceive the company when generating ideas on the platform. The second is to map users into a community structure to identify those more likely to produce ideas; the most promising users are grouped into the communities more likely to generate the best ideas. The third is to study the relationship between the sentiment scores of users’ ideas and the frequency of discussions among crowdsourcing users. The results indicate that sentiment and emotion scores can be used to visualize the social interaction narrative over time. They also suggest that the fast greedy algorithm is the one best suited for community structure, with a modularity of 0.53 and 8 significant communities for agreeable ideas when sentiment scores are used as edge weights. For disagreeable ideas, the modularity is 0.47 with 8 significant communities without edge weights. There is also a statistically significant quadratic relationship between the sentiment scores and the number of conversations between users.


Problem description
Crowdsourcing platforms have emerged as a method of outsourcing work to consumers not directly affiliated with the firm [8]. These platforms possess the capacity to overcome limitations in traditional business research, such as small participant sample sizes and narrow participant demographic backgrounds [9]. While the benefits of crowdsourcing platforms are salient to many businesses and academics, the research on methodological approaches to extract robust data from these platforms is somewhat polarized between qualitative and quantitative researchers. There is a salient paucity in business research on crowdsourcing that presents the benefits of multiple methodological approaches to effectively extract this data to benefit a firm's product differentiation strategy.
While generalized linear mixed models and social network analysis have become common methodological approaches in crowdsourcing research, some scholars fail to consider other methods, such as text mining and sentiment analysis, as robust methodological approaches. Both text mining and sentiment analysis are rooted in the disciplines of sociology, anthropology, and psychology. These methodological approaches are manifestations of theories of affective stance and appraisal, which focus on emotions shaping cognition [10]. The main objective of this study is to examine and explicate how companies can benefit from crowdsourcing platforms by using a multitude of empirical methods, such as text mining, sentiment analysis, social network analysis, and generalized linear mixed models, to generate new product ideas.

Research objectives
The research presented here has three primary objectives. The first is to examine the users' perception of the company. Viable ideas stemming from crowdsourcing initiatives are typically influenced by a user's experience or perception of the firm. When a user's experience or perception of a firm is positive, ideas tend to be more constructive and easier to implement [7]. The second objective is to identify which communities of users generate the best ideas and which communities generate the worst ideas. Crowdsourcing platforms allow users to either promote, through "likes", or demote, through "dislikes", the ideas of their peers [7]. Therefore, communities of users with the largest number of likes are deemed to have the best ideas, while those with the largest number of dislikes are deemed to generate the worst ideas. Poorly performing communities can be isolated so that greater attention is paid to better ideas, accelerating the evaluation. The third objective is to extrapolate the sentiment of discussions between two different groups of crowdsourcing users based on the frequency of their conversations. The frequency of word usage is a common method to characterize the type and degree of sentiment of users. Similarly, the frequency of interactions among users can help researchers determine both the type and degree of sentiment among crowdsourcing platform users [11].
These objectives were addressed with three empirical frameworks. In the first, sentiment analysis was used to calculate and categorize sentiment and emotion scores for successful and unsuccessful ideas to show that the platform's users had an accurate impression of the company. In the second framework, the sentiments and emotions were used to construct the communities using social network analysis. The communities were linked by users identified as idea launchers. The third framework was constructed to show the relationship between sentiment and the number of messages exchanged by users. Agreeable ideas are the top 5% while disagreeable ideas are the lowest 5%. The relationship was positive for agreeable ideas and slightly less so for disagreeable ideas.

Crowdsourcing
The concept of harvesting ideas from the crowd has been around for centuries, including the longitude prize, offered by the British government in 1714 to anyone who could solve the practical problem of measuring longitude at sea [12]. Both the physical and digital worlds are connected, and data are readily available to firms beyond their immediate workforce. Now, firms possess the ability to reach out to the masses for ideas that can be commercialized [13]. Web 2.0 technologies have radically changed the way people communicate on the Internet. Crowdsourcing is increasingly performed via the Internet, enabling a plurality of contributors from around the globe to harvest ideas [14].
With these technologies, several terms were invented, among them crowdsourcing, which was used for the first time by an anonymous user on an Internet forum 10 years ago [15]. The term was popularized by journalist Jeff Howe in 2006, in his article published in the online magazine Wired [16]. Howe [17] defined crowdsourcing as an act of outsourcing a task to a large and undefined group of people, coordinating the knowledge of the group with those who need it. Howe wondered if many solutions to our problems were already present in the wisdom of the crowd, just waiting to be uncovered. Later, Estellés-Arolas et al. [18] created a more comprehensive definition that conceptualized crowdsourcing as a participative online activity, in which individuals or firms propose a voluntary task to a group of individuals of varying knowledge, heterogeneity, and number. The resulting process is mutually beneficial: users receive the satisfaction of a given need, such as economic reward, social recognition, self-esteem, or skill development, while the crowdsourcer obtains work, money, knowledge, experience, and other advantages from the crowd.
Hosseini et al. [19] describe the taxonomy of features that characterize crowdsourcing into its four constituent parts, which are the crowd, the crowdsourcer, the crowdsourced task, and the crowdsourcing platform. The crowd consists of the people that participate in crowdsourcing activities. Crowds are characterized by diversity, anonymity, size, undefinedness, and suitability. A crowdsourcer is an individual, an institution, or a firm that seeks the inherent power in crowds to complete a specific task. Often, incentive provisions, open calls, ethicality provisions, and privacy provisions are put into place by crowdsourcers to create parameters of conduct to elicit useful ideas. The crowdsourced task is the outsourced activity that is provided by the crowdsourcer. Often, it is in the form of a problem, an innovative model, a data collection issue, or a form of fundraising. A crowdsourced task requires the expertise, experience, ideas, knowledge, skills, technologies, or money from the crowd. The features of a crowdsourced task are modularity, complexity, solvability, automation, and user-drivenness. Finally, the crowdsourcing platform is where the actual crowdsourcing task takes place. The platform is usually a website or an online venue where crowd-related interactions take place.
The literature describes many forms of crowdsourcing [20][21][22]. A general taxonomy of the modern forms of crowdsourcing was proposed by Doan et al. [23], distinguishing between explicit and implicit crowdsourcing forms. In the explicit form, companies ask for contributions directly (e.g., Ideas Starbucks, IdeaStorm, Amazon Mechanical Turk). In the implicit form, companies embed tasks to motivate users to participate. In the taxonomy created by Doan et al. [23], users' implicit participation can be categorized as standalone or can be piggybacked onto another platform. In standalone implicit crowdsourcing, companies use the input provided by the users to solve a problem that is related to the issue of the platform (e.g., the ESP game, the Peekaboom game, and reCAPTCHA). In the piggyback form, companies gather and retrieve information using third-party websites, such as search engines (e.g., Google, Yahoo, and Bing!). The content generated by users is used, for example, for product recommendation, spelling correction, and keyword generation [15,23]. There are other taxonomies offered for crowdsourcing: internal and external [15]; micro-tasks, macro-tasks, simple projects, and complex projects [24]; and integrative and selective tasks, routine tasks, complex tasks, and creative tasks [25].
Howe [17] proposed another taxonomy of crowdsourcing forms, distinguishing collective intelligence, crowd creation, crowd voting, and crowdfunding. Howe [17] conceptualizes collective intelligence as a group of individuals collaborating to create synergy. Ultimately, this synergy creates something more significant than its constituent parts. Crowd creation is the most common form of crowdsourcing and emphasizes solving a specific problem through satisfactory solutions in the form of tangible products. The most significant output from the crowd creation process is an end product, of either intellectual or physical form, that holds value for others [14,17]. Crowd voting is considered to be among the most popular forms of crowdsourcing, with the highest participation rate of all forms. It involves leveraging the community's judgment to elicit ratings of products and services. Crowdfunding is a form of alternative finance in which a project or business venture is funded by raising monetary contributions from a large number of people.
Since the adoption of Web 2.0, the majority of multinational companies have established a crowdsourcing platform for their business to obtain customers' opinions and ideas, to facilitate communication both with and between the customers, and to enhance the loyalty of customers, and increase product recognition [26].

Sentiment analysis
Opinions are critical influencers of behavior and central to most human activity. Beliefs, perceptions of reality, and decisions that we make are conditioned upon how others perceive and evaluate the world around them. An integral part of the decision-making process is seeking out the opinion of others. Not only is this true of individuals, but also organizations [27]. Sentiment analysis, or opinion mining, is a social science methodology that analyzes people's opinions, sentiments, evaluations, appraisals, attitudes, and emotions toward products, services, and organizations [27,28].
Research on sentiment analysis began as early as 2000 [27,29,30], but it was not until [31] used the term that it became widespread. Meanwhile, Dave et al. [32] introduced the term "opinion mining" for the same activity. As a methodological tool, sentiment analysis, or opinion mining, is a machine-learning approach that extracts opinions, sentiments, appraisals, attitudes, and emotions toward entities (e.g., products, services, and topics) and their attributes (e.g., picture quality, battery life, quality of service, and support) from a text [30]. This methodology is commonly seen as a subarea of natural language processing because it involves lexical semantics, co-reference resolution, word sense disambiguation, discourse analysis, information extraction, and semantic analysis [27]. Sentiment analysis is used in a vast number of domains, including marketing, finance, health, tourism, politics, and social science.
In addition to the business applications, sentiment analysis has research applications. For example, [33] and [34] show that positive sentiment is a better predictor of movie success than simple buzz; Liu et al. [35] use sentiment to predict box-office revenue; [36] conclude that negative emotion significantly affects innovation activities in the brand community (i.e., on a crowdsourcing platform); Lee et al. [37] extract idea content characteristics, such as subjectivity and negativity, using Starbucks crowdsourcing, which indicates that the subjectivity and negativity of ideas have a positive impact on user agreement and organization adoption. Pestian et al. [38] predict the suicide of patients based on anonymized clinical tests and annotated suicide notes based on the assignment of emotions to suicide notes. Zhang et al. [39] identify positive and negative public moods on Twitter and use them to predict the movement of stock market indices. The examples of sentiment analysis are prevalent in several business and nonbusiness-related literature, indicating the prevalence of the technique among academics from various fields.
Sentiment can be classified as linguistic-based, psychology-based, and consumer research-based [40]. The latter conceptualization is selected for simplicity [27,40], classifying sentiment as rational or emotional. Rational sentiment consists of rational reasoning, tangible beliefs, and utilitarian attitudes [27,40] and can be classified as positive, negative, or neutral. Emotional sentiments are non-tangible and originate in people's psychological states of mind [27,40]. While there are many classification systems for emotions, we used the one created by [41], who proposed eight evolutionary-created emotions: anger, fear, sadness, disgust, surprise, anticipation, trust, and joy. This classification system was chosen for its parsimony and seminal place in the literature stream. Several contemporary taxonomies of sentiment analysis maintain roots in [41].
There are three different levels on which sentiment analysis can be performed, depending on the study requirements. At the document level, each document (e.g., product review, idea, comment) is considered as a basic unit of information. Entities and aspects inside the document are not studied, and sentiments expressed about them are not determined. However, document-level sentiment analysis is less meaningful because the author's opinion may be positive about some entities and negative about others, for example, "Jane has used this camera for a few months. She said that she loved it. However, my experience has not been great with the camera. The pictures are all quite dark." [42]. At the sentence level, each sentence is treated as a short document and is assigned to one of two classes: subjective (i.e., an opinionated sentence) or objective (i.e., a non-opinionated sentence). At the feature level, also called the aspect level, each piece of text is identified as a feature of some product; this level is based on the idea that an opinion consists of a sentiment and a target. In our study, we used sentence-level sentiment analysis because it allows for the removal of objective sentences, which are assumed to imply or express no opinion or sentiment. Sentiment scoring is a process to calculate sentiment and polarity by matching words back to a dictionary of words flagged as "positive", "negative", or "neutral".
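The sentence-level scoring described above can be illustrated with a minimal sketch. The lexicon below is a hypothetical toy dictionary, not the one used in the study, and treating zero-score sentences as objective is an assumption made only for this illustration.

```python
import re

# Hypothetical toy lexicon; a real study would load a full polarity dictionary.
LEXICON = {"love": 1, "loved": 1, "great": 1, "good": 1,
           "dark": -1, "bad": -1, "terrible": -1}


def split_sentences(text):
    """Split on ., ! or ? followed by whitespace (a deliberately simple rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_polarity(sentence):
    """Sum the lexicon values of the words found in one sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(LEXICON.get(w, 0) for w in words)


def label(score):
    """Map a numeric score to the three polarity flags."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = ("Jane has used this camera for a few months. "
          "She said that she loved it. The pictures are all quite dark.")
sentences = split_sentences(review)
scores = [sentence_polarity(s) for s in sentences]
labels = [label(s) for s in scores]

# Assumption: sentences without any polarized word are treated as objective.
subjective = [s for s, score in zip(sentences, scores) if score != 0]
print(list(zip(scores, labels)))
```

A plain word lookup of this kind ignores negation ("not great"), which is one motivation for augmented dictionary methods.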

Social network analysis
The second objective establishes the discovery of community structure in the network of users proposing ideas. This process should be part of decision-making by stakeholders of any company. Crowdsourcing platforms allow interaction among users who are proposing ideas. These interactions can be investigated as a network using graph theory. The graphs generated by the analysis are interactive, showing the network structure and the links that interconnect the structure. Social network analysis is used to study relationships among individuals, families, households, villages, and firms. In social networks, the actors can be modeled by a network structure consisting of vertices and edges. Vertices represent actors (known as nodes), and the edges (known as ties) represent the relationships between the vertices. The strength of a connection indicates how strong a relationship is [43].
In social network analysis, a social network is conceptualized as a graph when the relationships have no direction and as a digraph in the presence of direction. In a digraph, the edges are presented as arcs. The main goal of social network analysis is to detect and interpret patterns of social ties among actors, with community structure being an important property. In the classification of nodes into groups, the within-groups connections are dense, but between-groups connections are sparse [44]. The community structure in social network analysis is closely related to clustering and graph partitioning concepts. The identification of the optimal number of communities is a difficult task that depends on the algorithms used.
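As a minimal sketch of how user interactions can be turned into an undirected graph, the function below aggregates pairwise discussions into weighted edges. It follows the rules the paper reports for its own networks (pairs with more than two discussions, multiple edges combined, isolated nodes removed), but the data layout, the function name `build_network`, and the choice of summing sentiment scores as the edge weight are assumptions for illustration. The study itself used the igraph package in R.

```python
from collections import Counter


def build_network(discussions, min_count=3):
    """Aggregate (user_a, user_b, sentiment) records into undirected edges.

    Pairs with fewer than `min_count` discussions are dropped, so the default
    keeps pairs with more than two discussions.
    """
    counts = Counter()
    sentiment = {}
    for a, b, score in discussions:
        if a == b:
            continue  # ignore self-replies
        pair = tuple(sorted((a, b)))  # undirected: (a, b) equals (b, a)
        counts[pair] += 1  # combine multiple edges
        sentiment[pair] = sentiment.get(pair, 0.0) + score
    edges = {
        pair: {"discussions": n, "sentiment": sentiment[pair]}
        for pair, n in counts.items()
        if n >= min_count
    }
    # Nodes are derived from retained edges, so isolated nodes disappear.
    nodes = sorted({user for pair in edges for user in pair})
    return nodes, edges


# Hypothetical records: four exchanges between ann and bob, two between cat and dan.
records = ([("ann", "bob", 1.0)] * 3
           + [("bob", "ann", -0.5)]
           + [("cat", "dan", 2.0)] * 2)
nodes, edges = build_network(records)
print(nodes, edges)
```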
A network-based perspective is frequently used to analyze the key users in crowdsourcing [45,46]. For example, Martínez-Torres [47] showed that social network analysis can be used in crowdsourcing to identify innovative users, defined as those users who post ideas that are potentially applicable to the organization [47]. Arenas-Marquez et al. [48] used social network analysis to identify influencers who can have a significant impact on the decision-making of other users based on the participation features of word-of-mouth crowdsourcing platforms [49]. Toral et al. [50] used social network analysis to identify the users who play the role of middleman among other users on a crowdsourcing platform [25]. Basole [51] stated that social network analysis can be used in the mobile ecosystem of crowdsourcing to help decision-makers to: (1) visualize the complexity of interfirm relationships and interactions among current and emerging mobile segment; (2) estimate how convergence influences ecosystem structure, and (3) evaluate the firm's position relative to its competitors. These results can then be applied to improve innovation strategies or business models.
Social network analysis is used in different disciplines, such as computational biology, where researchers study systems of interacting genes, proteins, chemical compounds, or organisms [52]. Researchers in the field of finance have used social networks to analyze the interplay among world banks as part of the global economy [53]. In marketing, researchers often assess the extent to which product adoption is induced as a type of contagion [54]. Engineering scholars can utilize social network analysis to establish best practice designs to deploy networks of sensing devices [55]. The field of neuroscience uses social network analysis to explore voltage dynamic patterns in the brain associated with epileptic seizures [56]. Political science researchers use social network analysis to examine the evolution of voting practices when groups are faced with varying internal and external forces [57]. Finally, public health scholars study the spread of infectious diseases in populations and formulate plans of action to address those infections by employing social network analysis [58,59].

Generalized least squares
The third objective examines the relationship between the ideas' sentiment scores and the discussion frequency of crowdsourcing users, for which we constructed a linear regression model. Sentiment scores measure the polarity of the text at the sentence level, following Hu et al. [60]. The parameters of the linear regression can be estimated using different approaches. The ordinary least squares method is widely used for the best linear unbiased estimation of the parameters of a linear regression. The validity of the inferences of a linear regression model via the ordinary least squares estimator depends on four assumptions for the residuals: zero conditional mean, independence, constant variance, and normality. Testing the assumptions is vital, and a violation can result in biased estimates of the relationships, confidence intervals, and significance tests [61]. The normality assumption is required for confidence intervals in small samples. For large samples, this assumption can be relaxed because of the Central Limit Theorem. If the conditional mean of residuals is non-zero, the relationship between the outcome and the predictors is nonlinear and the regression coefficient may be biased [61]. The assumption of constant variance is known as homoscedasticity. If the assumption of independence or homoscedasticity is violated, the estimates of standard errors, confidence intervals, and significance tests may be biased [61]. Nonlinearity can be reduced with a logarithmic transformation of variables. However, this linearization of the variables cannot eliminate the other assumption violations in ordinary least squares. In this study, any possible non-independence of residuals and non-constant variance of residuals is addressed by using the generalized least squares estimator [61,62] for the linear regression.
The generalized least squares method has the properties of being unbiased (the expected value of the estimator equals the real value of the parameter), consistent (the estimator converges to the real value of the parameter as the sample grows), and asymptotically normal (the probability distribution of the estimator converges to the normal distribution) [61,62]. When the covariance structure of the residuals is known, generalized least squares is the best linear unbiased estimator for the parameters. The parameters obtained with the generalized least squares approach fit a linear regression model [61].
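For reference, the textbook forms of the two estimators discussed above can be written as follows. The symbols are not used in the original text and are introduced here as assumptions: $X$ is the design matrix, $y$ the vector of outcomes, $\varepsilon$ the vector of residuals, and $\Omega$ the covariance matrix of the residuals.

```latex
\hat{\beta}_{\mathrm{OLS}} = (X^{\top}X)^{-1}X^{\top}y,
\qquad
\hat{\beta}_{\mathrm{GLS}} = (X^{\top}\Omega^{-1}X)^{-1}X^{\top}\Omega^{-1}y,
\qquad
\operatorname{Var}(\varepsilon \mid X) = \Omega .
```

When $\Omega = \sigma^{2}I$, that is, when the residuals are independent with constant variance, the two estimators coincide.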
When a database is too large for the available computing power, it is necessary to use the statistical technique of bootstrapping [63,64], in which a set of several random datasets resampled from the original data are adjusted with generalized least squares to find appropriate parameters. The final model estimated is calculated as the average of all simulations with its associated standard error and the statistical significance of the parameters [64]. With the existence of an exorbitant amount of unlabeled natural language data and the lack of labeled data, bootstrapping has become a common practice among scholars [65]. We chose this method because of its foundations and evidence-based efficacy in computational linguistics and business scholarship.

Data collection and description
My Starbucks Idea is an online social platform (mystarbucksidea.force.com) that serves as a crowdsource, where users propose, comment, and vote on ideas of other users. Each proposed idea on this site contains information about the author, a score based on the number of votes received, and the comments of users. Each user earns points corresponding to the idea's rating. Highly rated ideas are reviewed and may be implemented by Starbucks.
The database we use was collected from the "My Starbucks Idea" website in December 2016; it covers 2008 to 2016 and contains 17 attributes. We employed web-scraping techniques, using R with the rvest and xml2 packages, to harvest the data (Table 1). We selected only the top 5% of agreeable ideas and the lowest 5% of disagreeable ideas. The main reason behind this approach is to verify whether the same authors are active in both groups. We find that there are differences between the two groups in terms of the quantity and quality of ideas (Table 3).

Experimental setup
Each objective of this study requires a special design. For the first objective, determining the perception of the company by crowdsourcing participants when generating ideas on the platform, the database needs to be analyzed with special statistical tools. For the second objective, organizing users with community structure to locate users more likely to produce ideas, the statistical algorithm needs to be able to classify information in a way that meets this objective. The third objective, studying the relationship between the ideas' sentiment scores and the discussion frequency of crowdsourcing users, can be investigated once the classification of information is completed. The study is therefore split into two stages: classification and causation. The classification with community detection and popularity-sentiment approaches is performed first; the causation part is then analyzed using the generalized least squares technique.
The following sections describe the approaches used in this study. Section "Popularity sentiments" discusses popularity sentiments. A contribution of this study here is the utilization of three dictionary methods (natural language preprocessing, an augmented dictionary method, and a sentiment dictionary) to infer sentiments and emotions using their direction. Section "Community detection" presents the community-detection process. A contribution of this study in this section is the comparison of eight community-detection algorithms by their modularity to obtain the best community-detection approach. Section "Generalized least squares" presents the generalized least squares approach. Here, a contribution of this study is the utilization of a non-parametric approach in the calculation of the causality relationship between scores and discussion frequencies.

Popularity sentiments
The first objective requires the mining of opinions, since online platforms that allow users to express open-ended product reviews, ideas, and comments contain a huge amount of text. Some text may contain unknown words and abbreviations. This is where text mining analysis is used to discover patterns, connections, and trends [50]. Sentiment analysis [27,28,61,66] is used to evaluate opinions, sentiments, evaluations, appraisals, attitudes, and emotions. In this research, sentiment analysis or opinion mining is used. A text preprocessing technique is required for information retrieval, information extraction, and computational linguistics research; it transforms unstructured, original-format content into structured, formatted information [27,50,61]. The technique used in this study is natural language preprocessing [67]. Natural language preprocessing is composed of different tasks, including tokenization, part-of-speech tagging, syntactical parsing, and shallow parsing [27,50]. The orientation of an opinion (also called sentiment orientation, polarity of opinion, semantic orientation, or orientation score) is determined by dictionary methods. Those include the augmented dictionary method [68] for positive, negative, or neutral opinions and the Hu Liu sentiment dictionary for tagging polarized words in an opinion. The main advantages of a dictionary approach are faster processing of large datasets and higher processing accuracy [60]. The lookup dictionary method combined with the National Research Council Canada (NRC) sentiment dictionary [69] is used to calculate the frequency of emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, and trust) in an opinion.
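To make the two dictionary steps concrete, the following is a minimal sketch under stated assumptions. The word lists are hypothetical toy entries, not the Hu Liu or NRC lexicons, and the negation rule (a negator immediately before a polarized word flips its sign) is a simplified stand-in for the augmented dictionary method of [68], not a reproduction of it.

```python
import re
from collections import Counter

# Hypothetical toy dictionaries, for illustration only.
POLARITY = {"great": 1.0, "love": 1.0, "stale": -1.0, "slow": -1.0}
NEGATORS = {"not", "never", "no"}
EMOTIONS = {
    "love": {"joy", "trust"},
    "stale": {"disgust"},
    "slow": {"anger"},
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def augmented_polarity(text):
    """Sum polarized words, flipping the sign after a negator."""
    tokens = tokenize(text)
    score = 0.0
    for i, token in enumerate(tokens):
        if token in POLARITY:
            value = POLARITY[token]
            if i > 0 and tokens[i - 1] in NEGATORS:
                value = -value  # "not great" counts as negative
            score += value
    return score


def emotion_counts(text):
    """Count how often each emotion is triggered by a dictionary lookup."""
    counts = Counter()
    for token in tokenize(text):
        for emotion in EMOTIONS.get(token, ()):
            counts[emotion] += 1
    return counts


print(augmented_polarity("The service was not great"))
print(augmented_polarity("I love the marble cake"))
print(emotion_counts("I love the cake but the bread was stale"))
```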
Sentiment analysis is the computational study of opinions, sentiments, and emotions expressed in text. The polarity of a sentence is calculated from opinions represented as the quintuple

$$(o_j,\; f_{jk},\; oo_{ijkl},\; h_i,\; t_l).$$

Object $o_j$ is the target entity; in our case, the Starbucks products. $f_{jk}$ is a feature representing a component or attribute of the object $o_j$; in our case, for example, the marble cake. Sometimes the object itself can be seen as a feature. The opinion holder $h_i$ is the person (or organization) that expresses the opinion, and $t_l$ is the time at which the opinion is expressed. The polarity measure $oo_{ijkl}$ can take a positive, negative, or neutral value. The subscripts $i$, $j$, $k$, and $l$ are indexes. In the example in Fig. 1, the polarity is measured at the sentence level (two sentences) with the dictionary, and the resulting polarity measure is 0 (neutral).
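The opinion representation can be sketched as a small data structure. The field names and the two example opinions are hypothetical; they only illustrate how one positive and one negative sentence about the same feature combine into a neutral polarity of 0, as in the two-sentence example mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    target: str    # o_j, the entity being discussed
    feature: str   # f_jk, a component or attribute of the target
    polarity: int  # oo_ijkl: +1 positive, 0 neutral, -1 negative
    holder: str    # h_i, who expresses the opinion
    time: str      # t_l, when the opinion is expressed


def net_polarity(opinions):
    """Add up the polarity values of a list of opinions."""
    return sum(o.polarity for o in opinions)


# Hypothetical two-sentence idea: one positive and one negative statement.
idea = [
    Opinion("Starbucks products", "marble cake", +1, "user_1", "2016-12-01"),
    Opinion("Starbucks products", "marble cake", -1, "user_1", "2016-12-01"),
]
total = net_polarity(idea)
print(total)
```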

Community detection
The second objective of this study requires the identification of structures in idea proposals. The structures of many phenomena can be represented as networks [49,70,71,72]. A typical network has two components: the nodes (vertices, actors) and the collection of relationships between the nodes. Nodes can be regrouped together in clusters; community detection refers to regrouping similar nodes of the network together. Figure 2 illustrates the property of community detection, where the network is split into sets of nodes with high internal density but low external density. The internal density of a community is defined as

$$\delta_{\mathrm{int}}(C_x) = \frac{\#\,\text{internal edges of } C_x}{n_{C_x}\,(n_{C_x}-1)/2}$$

and the external density as

$$\delta_{\mathrm{ext}}(C_x) = \frac{\#\,\text{inter-community edges of } C_x}{n_{C_x}\,(n-n_{C_x})},$$

where $C_x$ is the community $x$, $n_{C_x}$ is the number of nodes in the community $x$, and $n$ is the total number of nodes in the network.
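A minimal sketch of the two densities follows, assuming the commonly used definitions: internal edges divided by the number of possible node pairs inside the community, and boundary edges divided by the number of possible pairs between the community and the rest of the network. The function name and the toy graph (two triangles joined by one edge) are illustrative assumptions.

```python
def densities(edges, community, n_total):
    """Return (internal density, external density) of one community.

    edges: list of (u, v) pairs of an undirected simple graph.
    community: the nodes of the community C_x.
    n_total: total number of nodes in the network.
    """
    members = set(community)
    n_c = len(members)
    if n_c < 2 or n_c >= n_total:
        raise ValueError("community needs 2..n_total-1 nodes")
    internal = sum(1 for u, v in edges if u in members and v in members)
    boundary = sum(1 for u, v in edges if (u in members) != (v in members))
    d_int = internal / (n_c * (n_c - 1) / 2)
    d_ext = boundary / (n_c * (n_total - n_c))
    return d_int, d_ext


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} connected by the edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
d_int, d_ext = densities(toy_edges, {0, 1, 2}, n_total=6)
print(d_int, d_ext)
```

In this toy graph the triangle is fully connected internally while only one of nine possible outgoing links exists, which is the pattern a good community is expected to show.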
The identification of the optimal number of communities is an open problem. Table 4 presents the list of the eight most common algorithms for community identification [73], the performance of which was compared in this study. However, every algorithm has its own limitations depending on the network topology. Thus, a quantitative criterion is needed to assess the quality of community structures. The modularity measure, as defined by Newman et al. [44], is the function used to assess the groups of nodes in the network that interact more with each other than with the rest of the network [37,73]. The Newman and Girvan modularity measure can be written as

$$Q = \frac{1}{2m}\sum_{i,j}\left[A_{ij} - \frac{k_i k_j}{2m}\right]\delta(c_i, c_j),$$

where, for two vertices $i$ and $j$, $A_{ij}$ is the entry of the adjacency matrix that indicates whether the pair of nodes is adjacent, $k_i$ and $k_j$ are the degrees of the vertices $i$ and $j$, $c_i$ and $c_j$ are the communities to which they belong, and $m = \frac{1}{2}\sum_i k_i$ is the total number of edges in the network. The $\delta$-function is defined as

$$\delta(c_i, c_j) = \begin{cases} 1 & \text{if } c_i = c_j, \\ 0 & \text{otherwise.} \end{cases}$$

For partitions with a meaningful community structure, the modularity measure takes values between 0 and 1 (it can be negative for poor partitions). Values closest to 1 indicate better community-structure quality. The modularity tends to increase with the size of the graph and with the number of well-separated communities [73]. The statistical significance of a community is calculated with the Wilcoxon rank-sum test [74]. This statistical test allows verification that the internal degree of density of nodes in the community is higher than the external degree.

The multilevel (Louvain) algorithm for finding a community structure is based on the multilevel optimization of the modularity measure and a hierarchical clustering approach [75]. The fast greedy algorithm is based on the locally optimal choice at each stage (greedy optimization) and a hierarchical approach [76]. This algorithm is fast and well adapted for networks with large numbers of vertices and edges. The Infomap algorithm is based on the probability flow of a random walk trajectory [77] and on minimizing the map equation [78] over possible network partitions.
The edge betweenness algorithm is based on the number of shortest paths that go through an edge in a network and a hierarchical approach [44]. This algorithm is very slow and is not recommended for very large networks. The leading eigenvector approach is based on the optimization of the modularity measure combined with a divisive approach [79]. This algorithm is not appropriate for degenerate networks. The Walktrap algorithm is based on short random walks and follows the same principle as the fast greedy algorithm, but it is slower than the latter [80]. The label propagation algorithm [81] is a fast algorithm based on the propagation of a small subset of a priori labeled data through unlabeled points in the network. The spin glass approach is derived from statistical mechanics and based on the Potts model [82,83].
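The modularity criterion and the greedy idea can be sketched in a few lines. The code below assumes an unweighted, undirected simple graph and uses the community-wise form of the Newman and Girvan modularity (internal edge share minus squared degree share, summed over communities). The merging loop is a naive illustration in the spirit of greedy modularity optimization; it is not the optimized fast greedy implementation available in igraph, and the toy graph is hypothetical.

```python
from itertools import combinations


def modularity(edges, membership):
    """Modularity Q of a partition; membership maps node -> community label."""
    m = len(edges)
    internal = {}    # edges with both ends in the same community
    degree_sum = {}  # sum of node degrees per community
    for u, v in edges:
        for node in (u, v):
            c = membership[node]
            degree_sum[c] = degree_sum.get(c, 0) + 1
        if membership[u] == membership[v]:
            c = membership[u]
            internal[c] = internal.get(c, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


def greedy_communities(edges):
    """Start from singletons and repeatedly apply the merge that raises Q most."""
    nodes = sorted({n for edge in edges for n in edge})
    membership = {n: n for n in nodes}
    best_q = modularity(edges, membership)
    while True:
        best_merge = None
        labels = sorted(set(membership.values()))
        for a, b in combinations(labels, 2):
            trial = {n: (a if c == b else c) for n, c in membership.items()}
            q = modularity(edges, trial)
            if q > best_q + 1e-12:  # keep only strict improvements
                best_q, best_merge = q, trial
        if best_merge is None:
            return membership, best_q
        membership = best_merge


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} connected by the edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
partition, q = greedy_communities(toy_edges)
print(partition, round(q, 4))
```

On this toy graph the procedure stops with the two triangles as communities, because merging them would lower the modularity.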

Generalized least squares
For the third objective, the study established that positive sentiment between two different users increases with the number of discussions. This relationship is modeled with generalized least squares, a linear model well adapted to variables that are not normally distributed, a setting in which non-parametric methods are otherwise used. The linear model with the generalized least squares estimator is preferable to ordinary least squares even if the independent variables are numerical [61,62]. The dependent variable is the total sentiment score in a discussion between two users. The independent variable is the number of discussions exchanged between two different users, transformed with the logarithmic function.
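For reference, the textbook form of the generalized least squares estimator is sketched below. The symbols $X$ (design matrix), $\beta$ (coefficients), $\varepsilon$ (errors), and $\Omega$ (error covariance structure) are notation assumed for this illustration; the source does not state which covariance structure was used.

```latex
y = X\beta + \varepsilon, \qquad
\mathbb{E}[\varepsilon] = 0, \qquad
\operatorname{Var}(\varepsilon) = \sigma^{2}\,\Omega

\hat{\beta}_{\mathrm{GLS}} = \left( X^{\top} \Omega^{-1} X \right)^{-1} X^{\top} \Omega^{-1} y
```

When $\Omega$ is the identity matrix, this expression reduces to the ordinary least squares estimator $(X^{\top}X)^{-1}X^{\top}y$, so generalized least squares differs only when the errors are heteroscedastic or correlated.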

Proposed approach
Given the large size of the collected data, 5% (randomly resampled 11 times) of the ideas and their comments with the highest and lowest scores were analyzed separately. Table 3 summarizes the frequencies of the 5% (randomly resampled 11 times) of agreeable and disagreeable ideas per user and the average likes or dislikes per idea.
The polarity sentiment scores were calculated with the augmented dictionary method. The Hu and Liu lexicon was applied to each idea, which was annotated as positive, negative, or neutral. Subjective information about the emotions was extracted with the lookup dictionary method. The NRC lexicon was used to obtain the frequency of the anger, anticipation, disgust, fear, joy, sadness, surprise, and trust emotions. Figures 3 and 4 show the trajectory plots used to understand and visualize how these sentiments and emotions were activated across the ideas and comments, and how the narrative is structured over time. The x-axis is time, and the y-axis represents the sentiment scores or the frequency of emotions. A Fourier transformation and low-pass filtering were used to reveal simple shapes and remove the noise from the trajectory plots.
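The two steps described above, dictionary-based scoring followed by Fourier low-pass smoothing, can be sketched as follows. This is a simplified illustration under stated assumptions: the word lists are a tiny hypothetical lexicon (not the Hu and Liu or NRC lexicons), the scoring is a plain lookup rather than the augmented dictionary method used in the study, and the function names `polarity_score` and `low_pass` are invented for the example.

```python
import cmath
import math
import re

# Tiny hypothetical lexicon for illustration; NOT the Hu and Liu word lists.
POSITIVE = {"love", "great", "good", "delicious"}
NEGATIVE = {"bad", "slow", "awful", "bitter"}

def polarity_score(text):
    """Count positive minus negative words (plain dictionary lookup)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

def low_pass(series, keep):
    """Smooth a series by keeping only DFT frequencies with index <= keep."""
    n = len(series)
    # Forward discrete Fourier transform.
    spectrum = [sum(series[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    # Zero the high frequencies; indices k and n - k carry the same frequency.
    filtered = [c if min(k, n - k) <= keep else 0
                for k, c in enumerate(spectrum)]
    # Inverse transform; the imaginary part is numerical noise for real input.
    return [sum(filtered[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

# Hypothetical posts, assumed to be ordered in time.
posts = ["I love the great coffee but the line is slow",
         "awful bitter taste",
         "good idea",
         "delicious and great",
         "bad slow service",
         "love it"]
scores = [polarity_score(p) for p in posts]
trajectory = low_pass(scores, keep=1)  # keeps the mean and the slowest wave
print(scores, trajectory)
```

With `keep=0` only the mean of the series survives, and larger values of `keep` retain progressively finer movements, which is the trade-off behind the "simple shapes" of the trajectory plots.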
For the social network analysis, the frequencies of discussions exchanged between two users were calculated with the constraint of pairing users with more than two discussions. This calculation was made separately for agreeable and disagreeable ideas. The resulting datasets contained only four attributes, as described in Table 5. The igraph package in R was used to build the undirected network graphs and to identify the community structures. The algorithms employed are listed in Table 4. Multiple edges were combined, and isolated nodes (i.e., nodes with degree centrality equal to zero) were removed. The nodes were arranged in an aesthetically pleasing manner using the Fruchterman-Reingold layout, which is based on force-directed graph drawing algorithms [84]. The adequate community structure was selected using the modularity measure and the Wilcoxon rank-sum test [74]. For the best modularity measure, two network graphs were created: one with sentiment scores as the edge weights and one without edge weights.

Fig. 3 Trajectory plots of the sentiment scores of 5% of the agreeable ideas and comments added to those ideas: red, best ideas; green, best comments; black, worst ideas; blue, worst comments

To investigate the relationship between sentiment scores and discussion frequency, the generalized least squares technique was complemented with the bootstrap technique. A set of 22 simulations was constructed, each with 5% of the sample (randomly resampled 11 times): 11 simulations for the best ideas and 11 simulations for the worst ideas. The records in each simulation were chosen at random. For each simulation, the parameters of the linear regression were calculated. The final parameters of the linear regression were calculated as the average value of each coefficient. The standard error and statistical significance were calculated from the 11 simulations.
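The resampling scheme for the regression can be sketched as follows: draw a 5% subsample 11 times, fit the regression on each draw, and average the coefficients. Several details are assumptions made only for this illustration: the fit uses ordinary least squares on a quadratic in place of the generalized least squares estimator, subsamples are drawn without replacement, the spread is reported as the standard deviation of each coefficient across runs, and the data and function names are synthetic.

```python
import random
import statistics

def fit_quadratic(xs, ys):
    """Ordinary least squares fit of y = b0 + b1*x + b2*x^2 (normal equations)."""
    # Power sums S_p = sum(x^p) and moments T_p = sum(y * x^p).
    s = [sum(x ** p for x in xs) for p in range(5)]
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    # Augmented 3x3 system, solved by Gauss-Jordan elimination with pivoting.
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

def resampled_fit(xs, ys, fraction=0.05, runs=11, seed=0):
    """Average the fitted coefficients over `runs` random subsamples."""
    rng = random.Random(seed)
    size = max(3, round(fraction * len(xs)))
    fits = []
    for _ in range(runs):
        idx = rng.sample(range(len(xs)), size)  # assumption: no replacement
        fits.append(fit_quadratic([xs[i] for i in idx], [ys[i] for i in idx]))
    mean = [statistics.mean(col) for col in zip(*fits)]
    spread = [statistics.stdev(col) for col in zip(*fits)]  # across-run spread
    return mean, spread

# Synthetic data (hypothetical): 400 user pairs with a noisy quadratic relation.
rng = random.Random(1)
xs = [rng.uniform(0, 4) for _ in range(400)]
ys = [1.0 - 2.0 * x + 0.5 * x * x + rng.gauss(0, 0.3) for x in xs]
coefficients, spread = resampled_fit(xs, ys)
print(coefficients, spread)
```

Averaging over runs keeps each fit small enough to compute cheaply, at the cost of a larger spread than a single fit on the full data would show.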

Sentiment analysis results
Sentiment analysis provides the insights for the first objective, which is to find out whether crowdsourcing users have an accurate impression of the company. By understanding the emotions of users when they post ideas and comments, the company can better assess the benefit of crowdsourcing for its operations.
The trajectory plots in Fig. 3 show that the narratives of the agreeable and disagreeable ideas, together with their comments, are structured. Comments on the best ideas (green) follow the shape of the best ideas (red): the sentiment of the best ideas is correlated with that of their comments. Meanwhile, comments on the worst ideas (blue) do not follow the shape of the worst ideas (black): the worst ideas' sentiment is not correlated with their comments' sentiment. The sentiment scores of the best ideas are always above those of the worst ideas (except for the year 2014). This signals a positive involvement of users with the company: the users feel comfortable with it. The parallelism of the best comments with the best ideas reinforces the good perception of the company. The comments on the worst ideas present a curvature opposite to that of those ideas. Users are not in the mood to attack or criticize the worst ideas. They give a low score to the worst ideas, yet their sentiment scores do not signal a negative attitude. The curves corresponding to best ideas and best comments never cross paths, and the situation is similar for worst ideas and worst comments. Ideas and comments are the main effects and have no interactions between their scores. This means that comments can explain both the best and worst ideas.

Sentiment was positive throughout the period between 2008 and 2016. Both the agreeable and disagreeable ideas began with the lowest sentiment scores and moved up slightly until they reached their maximum between 2010 and 2011. The positive sentiment of the language then dropped slightly until it reached a local minimum in 2014 to 2015 for the agreeable ideas and in 2012 for the disagreeable ideas. After 2015, the positive language used in the agreeable ideas increased until it reached the maximum in 2016. The shape of the agreeable ideas was sinusoidal, and that of the disagreeable ideas was a parabola.
The sentiment of the language used in the comments was approximately linear, without trend and with only negligible movements, except in 2015, when the positive language used in the comments added to the agreeable ideas increased significantly until it reached its maximum in 2016. Figures 4 and 5 show that the trust, joy, and anticipation emotions dominate the language used in both the agreeable and disagreeable ideas and their comments. However, there were no significant movements in the trajectory plots of the emotions, except in 2015. The trust, joy, and anticipation emotions increased significantly in the agreeable ideas after 2015 until they reached their maximum in 2016. Posted ideas and comments indicated that users were mainly in a good mood. The scores of ideas posted in a good mood are always above those of the comments. Figures 6 and 7 give two examples of positive and negative sentiment scores. In both cases, the score went up after cleaning the text. The score was calculated with the Hu and Liu lexicon.

For the users who are in a bad mood, ideas and comments are in the same range for both the best and worst ideas or comments. The curves of positive and negative emotions never cross paths: there are no interactions between good and bad moods. Within good moods and within bad moods, there are interactions throughout the study. In particular, trust and joy never interact, but anticipation interacts with both joy and trust. Users feel trust or joy towards the company, but anticipation is always present. The users have a positive perception of the company and they provide the company with good input. The number of users with bad moods and negative emotions is smaller than the number with positive emotions.

Network analysis results
The network analysis gives an analytic graphical solution to the second objective. Among all the participants, it is possible to group users into communities that are more likely to generate the best ideas. At the same time, the links between and within communities can be found. There are three sets of results in the network analysis used to identify the communities of users with the best ideas. First, the identification of communities needs to be optimal (Tables 6 and 7 for modularity). Second, for the best set of communities, the community structure is calculated and presented in Tables 8 and 9. Third, the graphical interactive representation of the network is presented in Figs. 8 and 9.

Table 6 shows the number of communities, the modularity measures, and the number of significant communities detected by each algorithm for the agreeable ideas. The fast greedy and multilevel algorithms have the highest values of the modularity measure with and without weights, but when the sentiment scores are not used as edge weights, the edge betweenness algorithm has the same modularity measure as the fast greedy algorithm. The Wilcoxon rank-sum tests show that the community structure based on the fast greedy algorithm has the highest number of significant communities. The top communities are significantly dense internally and significantly sparse externally, and one community is not significant by this measure (the fourth community in Table 8).

Table 6 (excerpt; modularity, number of communities, number of significant communities): Fast Greedy 0.53, 11, 9; Info Map 0.50, 33, 7; Edge Betweenness 0.53, 9, 7; Leading Eigenvector 0.51, 10, 7; Walk Trap 0.47, 35, 7; Label Propagation 0.45, 35, 4; Spin Glass 0. 21 11

Figures 8 and 9 show interactions among users in the community. The groups depicted in these figures can be detected and isolated. The graphs present the distance between communities, the spread of communities, and the location of communities. Each node displays its interactions between communities and within groups.

In the case of agreeable ideas, Fig. 8 shows the undirected network in which the nodes represent the users who posted ideas and/or comments, and an edge connects user A and user B if they frequently had discussions with each other. The widths of the edges indicate the frequency of discussions between users A and B, and the colors of the nodes indicate the communities found by the fast greedy algorithm with sentiment scores as edge weights. The network has a total of 254 users and 545 edges, where edges with positive, negative, and neutral sentiments are colored in blue, red, and grey, respectively. Most edges (88.4%) are blue, which means that the language used in discussions between users is positive. Table 8 [...] respectively. The 7th community contained the most active users with 481 posted ideas (13.4 ideas per user on average), 11,882.9 points earned per user on average, and 7.0 connections per user on average. The 1st and 3rd communities contain new members with an average of 1678.3 days and 1285.1 days per user, respectively, but the users of the 1st community are less active than those of the 3rd, with two ideas per user on average, 260.8 points earned per user on average, and 4.6 connections per user on average. The 9th community has the lowest number of posted ideas and connections per user on average. The 2nd and 6th communities contain occasional users, and the 5th community contains the second most active users.
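The edge construction behind Fig. 8 (undirected pairs, multiple edges combined, only pairs with more than two discussions, edges colored by sentiment) can be sketched as follows. The study used igraph in R; the record format, the function names `build_edges` and `edge_colour`, the handling of self-replies, and the sample records are all assumptions made for this illustration.

```python
from collections import defaultdict

def build_edges(discussions, min_count=3):
    """Aggregate (user_a, user_b, sentiment) records into undirected edges.

    Returns {(user, user): (frequency, total_sentiment)} for every pair with
    at least `min_count` discussions, i.e. more than two by default.
    """
    count = defaultdict(int)
    sentiment = defaultdict(float)
    for a, b, score in discussions:
        if a == b:
            continue  # assumption: a user replying to themselves is ignored
        key = tuple(sorted((a, b)))  # undirected: (A, B) equals (B, A)
        count[key] += 1              # combining multiple edges
        sentiment[key] += score
    # Users left without any qualifying pair never appear in the result,
    # which has the same effect as removing isolated nodes.
    return {k: (count[k], sentiment[k]) for k in count if count[k] >= min_count}

def edge_colour(total_sentiment):
    """Blue for positive, red for negative, grey for neutral sentiment."""
    if total_sentiment > 0:
        return "blue"
    if total_sentiment < 0:
        return "red"
    return "grey"

# Hypothetical discussion records: (author, addressee, sentiment score).
records = [("ann", "bob", 1.0), ("bob", "ann", 0.5), ("ann", "bob", 1.0),
           ("cy", "dee", -1.0), ("dee", "cy", -1.0), ("cy", "dee", -1.0),
           ("ann", "cy", 1.0)]
edges = build_edges(records)
share_blue = sum(edge_colour(s) == "blue" for _, s in edges.values()) / len(edges)
print(edges, share_blue)
```

The frequency stored for each pair corresponds to the edge width in the figure, and the share of blue edges is the kind of summary reported above as the percentage of positive edges.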
In the case of the disagreeable ideas, Table 7 shows that the fast greedy algorithm without sentiment scores as edge weights has the highest value of modularity measure, and that all communities are significantly dense internally and significantly sparse externally.
In the case of the disagreeable ideas, Fig. 9 shows that the language used in discussions between users is positive (81.5%). The graph contains 274 users and 647 connections. In Table 9, the 2nd and 5th communities had the least valuable ideas, with averages of 3612.7 and 3416.4 dislikes per idea, respectively, but the users of the two communities have different characteristics: the users of the 2nd community have the highest number of connections per user and the oldest accounts, with 2763 days per user on average. Meanwhile, the users of the 5th community have higher points earned per user on average (8415.8 points). The 3rd community contains the users who have the highest points earned per user on average (34,732.6 points) and the highest number of posted ideas per user on average (8.1 ideas). The 7th and 8th communities contain the newest users with less than one idea per user on average, with the users of the 7th community being the oldest of the two (1751 days on average) and having more [...]

Table 8 should be read together with Fig. 8. A stakeholder sees that community 5 (blue) generates 387 ideas and community 7 (green) generates 481 ideas. These are the most prominent communities with the best ideas. Figure 8 shows the stakeholder that those two communities interact with each other and with the other communities. For the stakeholder, it becomes clear that the users inside communities 5 and 7 are the most valuable ones. Focusing attention on the ideas proposed by communities 5 and 7 minimizes analysis time and improves operational efficiency for the company.
In the same way, Table 9 shows that communities 2, 3, and 5 produce the largest number of the worst ideas. The company can save time by bypassing all those ideas and the users who are more likely to continue generating bad ideas.

Linear models
A linear model was constructed to satisfy objective three, which is to determine the relationship between the ideas' sentiment scores and the discussion frequency of crowdsourcing users. Tables 10 and 11 show the coefficients with standard errors in parentheses for each simulation. The final column contains the coefficients and standard errors of all the simulations combined. The standard error of the final result is greater than that of any individual simulation. However, the coefficients remain statistically significant, and the standard error remains in the same order of magnitude. Table 10 presents the coefficients and their standard errors for agreeable ideas. Table 11 shows the coefficients and standard errors for disagreeable ideas. Figures 10 and 11 show a quadratic relation between the sentiment of the language used by users and the number of discussions exchanged in the case of agreeable and disagreeable ideas. This association is statistically significant, as seen in Table 12. This result confirms the third hypothesis. The generalized least squares equations can be written as:

$$\text{score}_{\text{agreeable}} = 3.05 - 3.54\,x + 1.29\,x^{2}$$

$$\text{score}_{\text{disagreeable}} = 2.76 - 2.98\,x + 0.98\,x^{2}$$

where $x$ is the discussion frequency variable. For agreeable ideas, with a frequency value of 3, the score will be 3.05 − 3.54 · 3 + 1.29 · 9 = 4.04. That is, the predicted score of the idea is 4.04 points. For disagreeable ideas, 2.76 − 2.98 · 3 + 0.98 · 9 = 2.64. The predicted score of the disagreeable idea is 2.64 points, which remains below the value for agreeable ideas. Given that this is a quadratic relationship, the higher the frequency, the bigger the impact on the ideas. For agreeable ideas, the relationship has a positive impact on the score. For disagreeable ideas, there is a positive impact, smaller than for agreeable ideas, that reinforces a bad perception and a bad evaluation of the score.
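The arithmetic of the two fitted curves can be checked with a few lines of code. The coefficients are those used in the worked example in the text; the function name `predicted_score` and the calculation of the turning point are additions made for illustration only.

```python
def predicted_score(x, intercept, linear, quadratic):
    """Evaluate intercept + linear * x + quadratic * x^2."""
    return intercept + linear * x + quadratic * x ** 2

# Coefficients taken from the worked example in the text.
AGREEABLE = (3.05, -3.54, 1.29)
DISAGREEABLE = (2.76, -2.98, 0.98)

for label, coef in (("agreeable", AGREEABLE), ("disagreeable", DISAGREEABLE)):
    at_three = predicted_score(3, *coef)
    # Vertex of the parabola: the value of x where the curve stops falling.
    turning_point = -coef[1] / (2 * coef[2])
    print(label, round(at_three, 2), round(turning_point, 2))
```

Because both quadratic coefficients are positive, each curve opens upward, so the statement that a higher frequency has a bigger impact holds for values of x beyond the turning point of the respective curve.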

Conclusion
Firms recognize the intrinsic value in the sentiments and emotions of users in response to prospective ideas, and their comments can be valuable for innovation at the initial ideation stage. Firms' increased use of crowdsourcing is a clear indication that they recognize the value of cooperation. While the use of crowdsourcing continues to grow, there is an evident paucity in the development of appropriate empirical methods to analyze that data effectively. The research and results presented here contribute to the advancement of both crowdsourcing scholarship and the empirical methods used to examine that phenomenon by illustrating that a plurality of methods yields a greater depth of understanding of the phenomenon. While some scholars choose one empirical methodology paradigm to analyze data from crowdsourcing, we decided to employ several different methods to achieve a more in-depth insight and benefit from a multitude of perspectives. Each technique used in this study added another meaningful dimension to the study of crowdsourcing. Sentiment analysis, a psychology-based analysis tool, yielded rich data that firms can use to commercialize users' ideas through a systematic analysis of language. Social network analysis provided valuable insight into the dynamics among and between users that can aid firms in determining appropriate parameters for both the crowdsourcing platform and the subsequent analysis of data. Finally, the generalized least squares method provided a robust statistical analysis that further solidified the validity of the results of the study.

Table 12 Generalized least squares estimated model with sentiment scores as outcome for agreeable and disagreeable ideas. *** p < 0.001, ** p < 0.01, * p < 0.05.

Social network analysis allows the interaction of users to be depicted in an interactive graph with users clustered into communities. The graphs provide a visual tool that helps researchers identify influential communities that are at the center of discussions on the platform. Communities that are isolated or have fewer interactions can be filtered out to target only the group of users who suggest valuable ideas. This finding can help firms implement suggestions for idea innovations and service improvements more efficiently, adding value to new product generation and innovation. Finally, we provide empirical evidence that positive language in a discussion between two individuals increases with the frequency of the conversations exchanged. This finding suggests that, over time, more frequent interactions can influence individuals' evaluation of ideas within the community.

Limitations
Like all empirical research, this study has some limitations. When evaluating idea polarity, only one layer of the wheel of emotions from [41] was used to simplify word polarization. Although this approach is commonly used to achieve parsimonious and generalizable results, additional layers might have produced more specific results. Also, the database used in this study is vast, so we employed bootstrapping to obtain accurate samples and avoid the use of supercomputers. Some scholars criticize bootstrapping as a time-consuming process that makes assumptions, such as independent samples, that could skew results. However, this commonly used approach is necessary when the use of supercomputers is prohibitive.