1 A Collective Picture of What Makes People Happy: Words Representing Social Relationships, not Money, are Recurrent with the Word ‘Happiness’ in Online Newspapers

The Internet allows people to freely navigate through news and use that information to reinforce or support their own beliefs in, for example, different social networks. In this chapter we suggest that the representation of current predominant views in the news can be seen as collective expressions within a society. Seeing that the notion of what makes individuals happy has been of increasing interest in recent decades, we analyze the word happiness in online news. We first present research on the co-occurrence of the word happiness with other words in online newspapers. Among other findings, words representing people (e.g., “mom”, “grandmother”, “you”/“me”, “us”/“them”) often appear with the word happiness. Words like “iPhone”, “millions” and “Google”, on the other hand, almost never appear with the word happiness. Secondly, using words with predefined sets of psycholinguistic characteristics (i.e., word-norms measuring social relationships, money, and material things) we further examine differences between sets of articles including the word happiness and a random set of articles not including this word. The results revealed that the “happy” dataset was significantly related to the social relationships word-norm, while the “neutral” dataset was related to the money word-norm. However, the “happy” dataset was also related to the material things word-norm. In sum, there is a relatively coherent understanding among members of a society concerning what makes us happy: relationships, not money; meanwhile, there is a more complex relationship when it comes to material things. The semantic method used here, which is particularly suitable for analyzing large amounts of data, seems to be able to quantify collective ideas in online news that might be expressed through different social networks.


Introduction
The Internet allows people to freely navigate through online news and use that information to reinforce or support their own beliefs (Tewksbury & Althaus, 2000; Althaus & Tewksbury, 2002). In line with this thinking, we have earlier suggested (Garcia & Sikström, 2013a) that current and predominant views in a society tend to perpetuate themselves through, besides inter- and intrapersonal conversations, narratives in newspapers, popular songs and books, movies, and television, and in recent decades, even in blogs and other online media (see Landauer, 2008). This representation can be seen as the vox populi (or the voice of the people) in a certain culture; a notion that becomes part of that culture's knowledge about the world (Giles, 2003). From a statistical point of view, because we share experiences with many others, there should be relatively good agreement among members of a society concerning different topics (Landauer, 2008). The semantic knowledge of specific topics and abstract ideas becomes the current and predominant view of society's collective picture of specific topics, recursively feeding on itself (Landauer, 2008). See Figure 1.1.
In the first part of this chapter we present how we tested our suggestions by analyzing the co-occurrence of the word happiness (lycka in Swedish) with other words in Swedish online newspapers (Garcia & Sikström, 2013a). The notion of happiness has for decades been of scientific and also of popular interest, and is thus a natural choice for our study. Specifically, we hypothesized that by investigating the frequency or infrequency of the word happiness in relation to other words in the same language, we would be able to quantify a collective picture of "what makes people happy". This picture might be a belief or notion shared by the many and the one, but not necessarily accurate about what really makes people happy (see, for example, Gilbert, 2007, who suggests that humans are actually inaccurate at imagining how happy we will be in the future or if we get things which we assume will make us happy). Although measurement of people's subjective experience of happiness using self-report is a cumulated or a collective result based on a large number of individuals (Gilbert, 2007), it is different from our proposed quantification of a collective theory of "what makes us happy". This is analogous to the difference between commuting to work by public transport and driving your own car, in which the former is a collective type of transport available for everyone. In other words, this representation of "what makes us happy" is collective in nature because it is a picture communicated by relatively few individuals to the masses.
In the second part of this chapter we expand our earlier research by using words with predefined sets of psycholinguistic characteristics (i.e., word-norms) to further examine differences between sets of articles including the word happiness ("happy" dataset) and a random set ("neutral" dataset) of articles not including this word.

The Co-Occurrence of the Word Happiness With Other Words in Online Newspapers
In our original article (Garcia & Sikström, 2013a), news articles were collected from the fifty largest daily newspapers in Sweden published online during 2010. These online newspapers are in most cases also published in printed format, making them representative of public media in Sweden. We randomly selected 3,000 of these articles that included the Swedish word "lycka" for the "happy" dataset, and 3,000 articles that did not include this word for the "neutral" dataset.
The data were analyzed with words as the basic unit of analysis. In total there were 1,065,429 words in the "happy" dataset, 493,927 words in the "neutral" dataset, and 93,093 unique words in both datasets. A frequency vector was generated consisting of the number of occurrences of each unique word in the "happy" dataset, and a similar vector was generated for the "neutral" dataset. For each unique word, a 2-by-2 chi-square test was conducted, consisting of four frequencies: the frequencies of the word in the two datasets and the number of remaining words in the two datasets. The total number of words in each chi-square test is thus the sum of the words in both datasets. The resulting p-values were corrected for multiple comparisons using the Bonferroni method (i.e., multiplying each p-value by N, the number of tests). Words that were significant (at the 0.05 level) were selected for further analyses. The resulting significant words were divided into different word classes. Due to the large number of significant words, we find it appropriate to only present the p-value, whilst omitting other data that are typically presented in chi-square analysis, such as the chi-square value and the number of occurrences in each cell. In Table 1.1, the words are ordered by increasing p-values, where these Bonferroni corrected p-values were in the range 0.00001 < p < 0.05.
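The per-word test described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used in the original study: the function names are hypothetical, the chi-square statistic is computed without a continuity correction (the chapter does not say whether one was applied), and the Bonferroni factor is taken to be the number of unique words tested.

```python
import math
from collections import Counter

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # degenerate table: nothing to test
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

def discriminating_words(happy_tokens, neutral_tokens, alpha=0.05):
    """Return {word: Bonferroni-corrected p} for words that differ between datasets."""
    happy, neutral = Counter(happy_tokens), Counter(neutral_tokens)
    n_happy, n_neutral = len(happy_tokens), len(neutral_tokens)
    vocabulary = set(happy) | set(neutral)
    significant = {}
    for word in vocabulary:
        a, b = happy[word], neutral[word]
        # Four cells: the word in each dataset, and all remaining words in each dataset.
        _, p = chi_square_2x2(a, b, n_happy - a, n_neutral - b)
        corrected = min(1.0, p * len(vocabulary))  # Bonferroni: multiply by number of tests
        if corrected < alpha:
            significant[word] = corrected
    return significant
```

Sorting the returned dictionary by value would give the ordering by increasing corrected p-value that the table uses.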
In Table 1.1, we present the results for pronouns, proper names, and nouns. Proper names associated with the "happy" dataset were almost exclusively names of people, where the Swedish Crown Princess' name "Victoria" was the most discriminative word, followed by proper names associated with sports, especially soccer, for example, Zlatan, Lagerbäck (the former coach of Sweden's national soccer team), Drogba, Argentina, and Nigeria. Proper names discriminative for the "neutral" dataset were almost exclusively company names, where the most significant companies were in the IT field. Although these results were obviously inflated by the overrepresentation of the Swedish royal wedding and the FIFA World Cup in the media during 2010, the results with regard to relationships are in accordance with current findings suggesting that happy individuals always report strong positive social relationships (Diener & Seligman, 2002, 2004). Moreover, research on widows (Lucas, Clark, Georgellis & Diener, 2003) and divorced people (Clark, Diener, Georgellis & Lucas, 2008) has shown great declines in happiness precisely before and after the loss of a significant other. Also in line with this, the results regarding pronouns show that almost all pronouns discriminated between the datasets, and all of the significant pronouns were associated with the "happy" dataset (e.g., I, you, mine, me, yours, and she). The nouns associated with the "happy" dataset were largely semantically related to love or people (e.g., people, hug, love, dad, grandmother, mom). In contrast, the "neutral" dataset was associated with nouns representing money or companies (e.g., crowns, millions, billions).
These results lead us to suggest that a collective theory of "what makes us happy" reflects research based on self-reports showing that people who put more value on love and relationships than on money are happy (Diener & Biswas-Diener, 2002). On a larger level, research has only found small correlations between income and happiness within nations; the correlations are larger in poor nations, and the risk of unhappiness is much higher for people living in poverty. Moreover, economic growth in most economically developed societies has been accompanied by only small increases in happiness levels (Diener & Seligman, 2002). In other words, as long as basic needs are met, money or material things do not seem to increase happiness levels. Accordingly, our results do not mean that money and material things make us unhappy, but rather that specific words representing money and material things are not associated with happiness in the media.
Our study was an addition to recent research on happiness using large datasets of texts (e.g., Dodds & Danforth, 2010; Dodds, Harris, Kloumann, Bliss & Danforth, 2011; Garcia & Sikström, 2013a, 2013b, 2014; Schwartz, Eichstaedt, Kern, Dziurzynski, Ramones et al., 2013) and also complemented self-reporting techniques by offering an approach to the investigation of how "what makes us happy" is presented through the mass media to large segments of a society at the same time.
Earlier theories of individual unconsciousness and consciousness have suggested that humans possess a collective level of awareness or knowledge. Carl Jung (1968), for example, proposed a collective unconscious consisting of memories accumulated throughout human history. These memories are represented in archetypes that are expressed in the symbols, myths, and beliefs found in many cultures, such as the image of a god, an evil force, the hero, the good mother, and the quest for self-unity and wholeness. Similarly, the French sociologist Émile Durkheim (1965) coined the term collective consciousness, which refers to the shared beliefs and moral attitudes that serve as a unifying force within a society. Determining whether the representation of happiness in our study is implicit (as in Jung's theorized collective unconscious) or explicit is beyond the scope of our current research. Nevertheless, although at a collective level people probably understand the influence of close and warm relationships on their own happiness, they might not be consciously aware that such relationships are necessary for happiness (Lyubomirsky, 2007). After all, the importance of social relationships to a happy life is indeed epitomized in a simplified and larger-than-life manner in the standard ending of many fairy tales: "…and they lived happily ever after". Likewise, most people seem to understand that money can't buy happiness, or love, as in the famous Beatles song "Can't buy me love". Moreover, we have suggested that the representation of a collective picture of "what makes us happy" in the media seems to be a notion that does not fit with theories of happiness focusing on individual differences (e.g., the theory of "Virtues in Action" by Peterson & Seligman, 2004) or with research determining whether focusing on intentional activities is related to a happy life (e.g., Diener & Oishi, 2005).
In sum, our findings seem to mirror a collective theory of "what makes us happy" or an agreement among members of a community about what makes people happy: relationships, not money or material things. This picture is presented to all members of the society through newspapers and other media, making it collective in nature. Because this information is accessible through the Internet, it might be used by readers to reinforce or support their own beliefs and express those beliefs when social networking. In the next part of this chapter, we present new analyses of the same dataset using quantitative semantics and words with predefined sets of psycholinguistic characteristics (i.e., word-norms) measuring social relationships, money, and material things. We use this approach to further investigate whether a collective picture of "what makes us happy" suggests that social relationships, rather than money, are of more importance in enabling happiness.

Measuring Happiness' Relationship to Social Relationships, Money, And Material Things: The Word-Norm Approach
The method we employed to quantitatively re-analyse the news texts and word-norms is called Latent Semantic Analysis (LSA; Landauer & Dumais, 1997). This method applies an algorithm to create semantic representations of text content. In short, the LSA algorithm assumes that words that occur close to each other in text can be used as a source of information, which is used to create multi-dimensional semantic representations. That is, the contexts in which a word occurs normally carry a meaning that more often than not corresponds to the meaning of the word (Landauer & Dumais, 1997; Landauer, 2008; Landauer, McNamara, Dennis & Kintsch, 2008). As a result, the content can be represented as a vector in a multi-dimensional semantic space. In turn, the semantic representations of single words can be used to summarize larger texts by adding the representations and normalizing the length of the resulting vector to one. The similarity between semantic representations can be measured by the cosine of the angle between the vectors, which, for vectors normalized to the length of one, is mathematically equivalent to multiplying the vectors dimension by dimension and adding the resulting products (i.e., the dot product). This similarity measure can then be used in standard statistical procedures such as correlations, regressions, t-tests, analysis of variance, etc. (for studies applying regression analysis using semantic representations see: Karlsson, Sikström & Willander, 2013; Garcia & Sikström, 2013a, 2013b, 2014; Gustafsson, Sikström & Lindholm, in press; Rosenberg, Sikström & Garcia, 2013; Roll, Mårtensson, Sikström, Apt, Arnling-Bååth & Horne, 2011). Semantic representations for the text content under investigation, which here includes the word contexts from the articles as well as the word-norms, were created using Semantic Excel, a web-based software developed by the last author of this chapter (S. Sikström). This software is specifically developed to create and analyse semantic representations and can be found at: www.semanticexcel.com.
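The two vector operations described above, summarizing a text by adding and normalizing word vectors and comparing texts by the cosine, can be illustrated with a small sketch. The function names and the two-dimensional toy vectors are assumptions made for illustration only; they are not taken from Semantic Excel.

```python
import math

def normalize(vector):
    """Scale a vector to length one (a zero vector is returned unchanged)."""
    length = math.sqrt(sum(x * x for x in vector))
    return [x / length for x in vector] if length > 0 else list(vector)

def summarize(word_vectors):
    """Add word vectors dimension by dimension, then normalize to length one."""
    dimensions = len(word_vectors[0])
    total = [sum(v[i] for v in word_vectors) for i in range(dimensions)]
    return normalize(total)

def dot(u, v):
    """Multiply dimension by dimension and add the products."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """General cosine of the angle between two vectors of any length."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Toy check: for unit-length vectors, the cosine and the dot product coincide.
text = summarize([[1.0, 0.0], [0.0, 1.0]])
norm = normalize([2.0, 0.0])
similarity = dot(text, norm)
```

Because both representations are normalized beforehand, `dot(text, norm)` and `cosine(text, norm)` give the same value, which is why the plain dot product suffices as the similarity measure.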
The use of word-norms to analyze the previously used news dataset allowed us to emulate the interaction between readers and the text in the online newspapers. Participants, seen as readers of news, were asked to generate words they associated with hypothesis-relevant key words: "social relationships" (the Swedish words: sociala relationer); "money" (the Swedish word: pengar); and "material things" (the Swedish words: materiella ting). These word-norms enabled us to examine whether they related more to the happiness contexts or to the random-word contexts. We hypothesized that the social relationships word-norm would relate more to the happiness contexts than to the random-word contexts, whereas the opposite pattern would occur for the money and material things word-norms.

Participants and procedure
We recruited participants using the social network Facebook. The sample consisted of 10 females and 5 males. The age ranged from 22 to 64 years with a mean age of 34 (sd = 12.7) years. Participants had a wide range of employment or study interests: 8 participants reported being students (2 philosophy, 2 law, 1 psychology, 1 ethnology, 1 graphic design, and 1 did not give an answer), 6 being employed (2 teachers, 1 PhD student, 1 economist, 1 carpenter and 1 dentist) and 1 being retired (from being a teacher). At the beginning of the study participants were informed about their right to withdraw at any time and that their responses would be kept confidential, after which they were asked to consent to the study. They were subsequently asked to spend approximately two to three minutes generating words that they associated with the key words that they were provided; these were introduced in random order between participants. The survey took approximately five minutes; at the end of the survey participants were thanked and debriefed.

Creating Semantic Representations
To create high quality semantic representations one typically needs a larger dataset than what is usually collected in experimental studies. Therefore we first created a word space using a very large text corpus, and then used this as a foundation for the analyses of the experimental data. The word space that comes with the Semantic Excel software was used (www.semanticexcel.com). This space was created using a Google N-gram database for Swedish text, which is based on 1 Terabyte of text data (see the Google N-gram project: http://ngrams.googlelabs.com). From this database, a co-occurrence matrix was built from 5-gram contexts; the rows consisted of the 120,000 most common words and the columns of the 10,000 most common words in the N-gram database. We did not use a stemming algorithm, mainly because inflected word forms contain additional information that might be lost during stemming, and also because the large size of the Google N-gram database makes the space large enough to provide sufficient data on words with unusual stems. Each cell in the co-occurrence matrix represented the frequency of the 4 context words in the N-gram on the columns relative to one target word on the rows. Finally, the cells were normalized by calculating the logarithm plus one.
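As a rough illustration of how such a co-occurrence matrix can be assembled, the sketch below counts context words within 5-grams and then applies a logarithmic normalization. Two points are assumptions on our part, since the text leaves them open: every word of a 5-gram is treated in turn as the target with the remaining four as its context, and "the logarithm plus one" is read as log(1 + count). The function name and the sparse dictionary layout are likewise hypothetical.

```python
import math
from collections import defaultdict

def cooccurrence_matrix(five_grams, row_vocabulary, column_vocabulary):
    """Build a sparse {(target, context): log(1 + frequency)} matrix.

    five_grams: iterable of (tuple_of_5_words, corpus_frequency) pairs.
    row_vocabulary / column_vocabulary: sets of admissible target / context words.
    """
    counts = defaultdict(float)
    for words, frequency in five_grams:
        for i, target in enumerate(words):
            if target not in row_vocabulary:
                continue
            for j, context in enumerate(words):
                # The other four words of the 5-gram form the context of the target.
                if j != i and context in column_vocabulary:
                    counts[(target, context)] += frequency
    # Assumed reading of the normalization step: log(1 + raw count).
    return {cell: math.log(1.0 + count) for cell, count in counts.items()}
```

A sparse dictionary is used here only because most cells of a 120,000-by-10,000 matrix would be empty; any sparse matrix structure would serve the same purpose.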
Singular value decomposition was applied to compress the matrix whilst preserving as much information as possible. Each row of the resulting matrix is the semantic representation of a word. A synonym test was used to establish the best number of dimensions; the highest score (and hence the optimal number of dimensions) was found at 256 dimensions. Accordingly, the current analysis yielded a semantic quantification that represents the most frequent words used in the Swedish N-gram database, where each word is described by a high-dimensional vector normalized to the length of one.
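The compression step can be sketched without any external library by extracting the largest singular components one at a time with power iteration and deflation. This is a toy stand-in, not the implementation behind the word space described here: it is only practical for small dense matrices, the function names are hypothetical, and a real analysis of a 120,000-by-10,000 matrix would rely on a dedicated sparse SVD routine.

```python
import math

def _mat_vec(matrix, v):
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]

def _mat_t_vec(matrix, u):
    return [sum(matrix[i][j] * u[i] for i in range(len(matrix)))
            for j in range(len(matrix[0]))]

def _unit(v):
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v] if length > 0 else list(v)

def top_singular_triplet(matrix, iterations=200):
    """Approximate the largest singular value and its vectors by power iteration."""
    v = _unit([1.0] * len(matrix[0]))
    for _ in range(iterations):
        step = _mat_t_vec(matrix, _mat_vec(matrix, v))  # (A^T A) v
        if not any(step):
            break
        v = _unit(step)
    av = _mat_vec(matrix, v)
    sigma = math.sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av] if sigma > 0 else av
    return sigma, u, v

def reduce_dimensions(matrix, k):
    """Return one k-dimensional, unit-length vector per row of the matrix."""
    residual = [[float(x) for x in row] for row in matrix]
    components = []
    for _ in range(k):
        sigma, u, v = top_singular_triplet(residual)
        components.append([sigma * x for x in u])
        # Deflation: remove the component just found before extracting the next one.
        residual = [[residual[i][j] - sigma * u[i] * v[j] for j in range(len(v))]
                    for i in range(len(u))]
    return [_unit([component[i] for component in components])
            for i in range(len(matrix))]
```

Choosing `k` would then correspond to the dimension selection step, where candidate values are compared on an external criterion such as the synonym test mentioned above.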

Creating Semantic Scales: The word-norms
The number of words associated with social relationships was 171, including words such as 'friend', 'support', 'enemies', and 'activities'. Money included 203 words, such as 'wealth', 'power', 'society', and 'possibilities'. Material things consisted of 156 words, such as 'joy', 'jealousy', 'craving', and 'abundance'. Words generated by more than one participant (for example 'friendship', 'value', 'consumption') were kept in the analyses, so that duplicated words were weighted more heavily than non-duplicated words. Words that were misspelt but where the meaning was clearly understandable were corrected prior to the analyses; 1 'word' within the material things dataset was unrecognizable and thus removed. The final three lists of words were separately summarized by aggregating their associated semantic representations, and then normalized to the length of one. This procedure created three word-norms related to social relations, money, and material things.
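The weighting of duplicated words follows directly from summing one vector per generated word, as the sketch below makes explicit. The function name, the toy word space, and the decision to silently skip words missing from the space are assumptions for illustration; the chapter does not state how out-of-vocabulary words were handled.

```python
import math
from collections import Counter

def build_word_norm(generated_words, space):
    """Aggregate the vectors of all generated words into one unit-length norm.

    generated_words: list of words across participants, duplicates included.
    space: dict mapping a word to its semantic representation (list of floats).
    """
    weights = Counter(w for w in generated_words if w in space)
    if not weights:
        raise ValueError("none of the generated words is in the word space")
    dimensions = len(next(iter(space.values())))
    total = [0.0] * dimensions
    for word, count in weights.items():
        # A word named by several participants contributes once per mention.
        for i, value in enumerate(space[word]):
            total[i] += count * value
    length = math.sqrt(sum(x * x for x in total))
    return [x / length for x in total]

toy_space = {"friend": [1.0, 0.0], "support": [0.0, 1.0]}
social_norm = build_word_norm(["friend", "friend", "support", "unknown"], toy_space)
```

In this toy case 'friend' is mentioned twice, so the resulting norm points more toward its vector than toward that of 'support'.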

Applying the semantic scales on articles including/not including the word Happiness
Two sets were created: articles including the word happiness (i.e., "lycka" in Swedish) and articles not including this word. This dataset included 1867 random documents and 4185 happiness documents. The words in each article were summarized by aggregating their associated semantic representations, and then normalized to the length of one. This procedure created a semantic representation for each article. We then measured the semantic similarity between the semantic representation of each article and the semantic representations related to social relations, money, and material things. The semantic similarity was measured by the cosine of the angle between the two associated semantic representations (i.e., vectors), which for unit-length vectors can be calculated as the dot product (Landauer, McNamara, Dennis & Kintsch, 2008). We call the resulting values the semantic scale or word-norms of social relations, money, and material things respectively. These scales were calculated for each article.
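Scoring every article on every scale then amounts to one dot product per article and word-norm. The following sketch assumes that articles arrive as lists of tokens and that the word-norms are already unit-length vectors; the names `article_vector` and `score_articles` and the toy data are hypothetical.

```python
import math

def article_vector(tokens, space):
    """Sum the vectors of all known words in an article and normalize to length one."""
    dimensions = len(next(iter(space.values())))
    total = [0.0] * dimensions
    for token in tokens:
        for i, value in enumerate(space.get(token, ())):
            total[i] += value
    length = math.sqrt(sum(x * x for x in total))
    return [x / length for x in total] if length > 0 else total

def score_articles(articles, space, word_norms):
    """Return, per article, a dict of semantic-scale values (one per word-norm)."""
    scores = []
    for tokens in articles:
        vector = article_vector(tokens, space)
        scores.append({name: sum(a * b for a, b in zip(vector, norm))
                       for name, norm in word_norms.items()})
    return scores

toy_space = {"mom": [1.0, 0.0], "love": [0.8, 0.6], "millions": [0.0, 1.0]}
toy_norms = {"social": [1.0, 0.0], "money": [0.0, 1.0]}
toy_scores = score_articles([["mom", "love"], ["millions"]], toy_space, toy_norms)
```

The per-article values in `toy_scores` are the quantities that would subsequently be compared between the happiness and random sets.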

Results
Social relationships. The one-tailed hypothesis that the value on the social relationships semantic scale is higher for happiness contexts (mean = 0.301; sd = 0.0443) than for random contexts (mean = 0.228; sd = 0.052) was supported: a t-test revealed a significant difference in the hypothesized direction (t = 56.233; p < .0001).
Money. The one-tailed hypothesis that the money word-norm is more related to the random (mean = 0.251; sd = 0.048) than to the happiness (mean = 0.249; sd = 0.0401) word contexts was also supported (t = 1.859; p = .032).
Material things. In direct contrast to the one-tailed hypothesis, the analysis revealed that material things were in fact significantly more related to the happiness (mean = 0.256; sd = 0.039749) than to the random (mean = 0.218; sd = 0.034) word contexts (t = -36.4455; p < .0001). Hence the hypothesis was rejected.
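For readers who wish to run this kind of group comparison on their own semantic-scale values, the sketch below computes an independent-samples t statistic from summary statistics. Several choices here are assumptions, as the chapter does not specify them: a pooled-variance (Student) test is used, the one-tailed p-value is approximated with the normal distribution (reasonable only for samples of this size), and the group sizes are taken to be the 4185 happiness and 1867 random documents. Because the reported means and standard deviations are rounded, the output should not be expected to match the published statistics exactly.

```python
import math

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's t for two independent groups, computed from summary statistics."""
    pooled_variance = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / standard_error

def one_tailed_p(t):
    """Upper-tail p-value under a normal approximation (large samples assumed)."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

# Social relationships scale: happiness contexts versus random contexts.
t_social = pooled_t(0.301, 0.0443, 4185, 0.228, 0.052, 1867)
p_social = one_tailed_p(t_social)
```

A positive `t_social` indicates a higher mean for the first group, so the direction of the one-tailed hypothesis is encoded by the order of the arguments.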

Discussion
Overall the results supported the hypothesis in that happiness contexts, as compared with random-word contexts, relate more to the concept of social relationships and less to the concept of money. However, the material things word-norm was, in contrast to the hypothesis, more related to happiness contexts than to random-word contexts, leading to some interesting enquiries. This result, for example, might reflect that substantial parts of public domains embrace and uphold a consumer society and that this 'way of life' is depicted as a happy life. It might also illustrate that the relationship between happiness and money versus material things appears to be rather intricate and complex.
A visual inspection of the words included in the material things word-norm appears to indicate that it might capture at least two rather separate aspects. In our original article (Garcia & Sikström, 2013a) we labelled as material things those words representing advanced technologies (e.g., iPad, computer) and company names (e.g., Windows, Sony and even 'company'). This is in accordance with some of the words in the material things word-norm presented here (e.g., computers, buy, fashion, commercial, cars, boats, diamonds, etc.), which clearly overlap with the money word-norm. Notably, the material things word-norm also included positive emotion words (e.g., joy, satisfying, fun, and wonderful), whilst also tapping into what perhaps could be categorized as a wider sense of basic needs for life and perhaps even happiness (e.g., life, need, choice availability, house, the home, identity, and fellowship). The results appear to tap into what Aristotle (trans. 2009) stated: "The life of moneymaking is one undertaken under compulsion, and wealth is evidently not the good we are seeking; for it is merely useful and for the sake of something else" (p. 7). Perhaps the material things word-norm taps into this 'something else', such as basic needs for a healthy and long-lasting happy life. In other words, it is what you do with the money that influences your level of well-being and happiness, such as spending it on safety and leisure time with important others, rather than consuming goods that lead to a never-ending increase in wants and desires (e.g., Easterlin, 1974; Lyubomirsky, 2007).

Limitations, Strengths, And Suggestions For Future Studies
It might be worth highlighting the choice of comparison contexts. That is, it might be useful to select more comparison contexts than the random-word contexts employed in this study. For example, as we here examined happiness in the context of datasets including this specific emotion and compared them with unspecific random-word datasets, it could be argued that our findings to some degree reflect emotions in general rather than happiness specifically. That is, emotions are inherently ascribed to humans as a characteristic and are not applied to material things or, for example, companies, as these do not experience emotions. It should be noted, however, that countries such as Argentina, Nigeria, and Cuba were associated with the happiness contexts. Future research could create several comparison contexts; for example, including random contexts constrained to emotion contexts that include various emotion words, or alternatively comparing specific emotions such as happiness and unhappiness contexts. This would add nuance and detail to the understanding of any concept under investigation. Nevertheless, it is worth pointing out that the word-norm analysis showed that, although emotions are not ascribed to material things, the material things word-norm was more related to the happiness contexts than to the random-word contexts.
Similarly, it is important to remember that overall happiness contexts are more related to social relationships than random words; however, certain kinds of social relationships are probably harmful and decrease happiness (e.g., see how depressive symptoms spread person-to-person through social networks, Rosenquist, Fowler & Christakis, 2011). Hence, as a follow-up, future research should examine word-norms that are more detailed, such as 'supportive social relationships' versus 'destructive social relationships' in, for example, happiness versus unhappiness contexts.
Furthermore, our new results also illustrate the strengths of using LSA in conjunction with our previous method investigating the frequency of words (Garcia & Sikström, 2013a). This approach is suitable for highlighting the interaction between a specific text and a specific norm. That is, it can emulate the interplay between the one who produced the text in the first place and how different groups of readers (e.g., students, laymen, or experts) might interpret it differently. It is important to point out that the collective picture of "what makes us happy" as represented in the online news in our original study (i.e., that relationships, not money, make us happy) is not necessarily what people actually use when networking. The results using the word-norms actually suggest that people might express happiness on the Internet by associating happiness with plain material things (e.g., computers, buy, fashion, cars, boats, diamonds, etc.), positive emotion words (e.g., joy, satisfying, fun, and wonderful), and identity and basic needs (e.g., life, need, choice availability, house, the home, identity, and fellowship).

Concluding Remarks
With regard to a collective picture of "what makes us happy", there is a relatively coherent understanding among members of a society concerning what makes us happy: relationships, not money. Nevertheless, there is a more complex relationship when it comes to material things. From a methodological perspective, the different approaches used here seem to be able to quantify collective ideas in online news, blogs, and other types of Internet media. These methods can be suggested for studying how people express themselves through different social networks. Using word-norms can be a straightforward and sound way to complement keyword interpretations, mainly because the research then becomes more distanced from the interpretation processes, whilst at the same time it can reflect different interactions, such as between different texts (e.g., news articles, blogs, and Twitter) and various readers (laypersons, students, or different experts). Indeed, we believe it might be worth pointing out that this norm-based approach has the potential to constitute a sound complement in examining and interpreting keywords that differ between texts, in particular for the analysis of large datasets.
"I cannot think it unlikely that there is such a total book on some shelf in the universe. I pray to the unknown gods that some man--even a single man, tens of centuries ago--has perused and read this book. If the honor and wisdom and joy of such a reading are not to be my own, then let them be for others. Let heaven exist, though my own place may be in hell. Let me be tortured and battered and annihilated, but let there be one instant, one creature, wherein thy enormous Library may find its justification." In "The Library of Babel" by Jorge Luis Borges.

Figure 1.1. The individual's and society's ideas about happiness expressed in the media, which generates a collective theory of happiness; in turn, feeding the original ideas found at the individual and society levels.

Table 1.1. Words discriminating between articles including or not including the word happiness (Printed with permission from D. Garcia and S. Sikström)

Word classes | "Happy" dataset (articles including the word happiness) | "Neutral" dataset (articles NOT including the word happiness)
Note. The words are divided into the word classes proper nouns, pronouns, and nouns (other word classes are removed). The words are ordered by increasing p-values in the range 0.000001 < p < 0.05 and only the approximately 40 most significant words are included. All words are significant following corrections for multiple comparisons (Bonferroni). English translation of pronouns and nouns in parentheses.