The cultural environment: measuring culture with big data

Abstract

The rise of the Internet, social media, and digitized historical archives has produced a colossal amount of text-based data in recent years. While computer scientists have produced powerful new tools for automated analyses of such “big data,” they lack the theoretical direction necessary to extract meaning from them. Meanwhile, cultural sociologists have produced sophisticated theories of the social origins of meaning, but lack the methodological capacity to explore them beyond micro-levels of analysis. I propose a synthesis of these two fields that adjoins conventional qualitative methods and new techniques for automated analysis of large amounts of text in iterative fashion. First, I explain how automated text extraction methods may be used to map the contours of cultural environments. Second, I discuss the potential of automated text-classification methods to classify different types of culture such as frames, schema, or symbolic boundaries. Finally, I explain how these new tools can be combined with conventional qualitative methods to trace the evolution of such cultural elements over time. While my assessment of the integration of big data and cultural sociology is optimistic, my conclusion highlights several challenges in implementing this agenda. These include a lack of information about the social context in which texts are produced, the construction of reliable coding schemes that can be automated algorithmically, and the relatively high entry costs for cultural sociologists who wish to develop the technical expertise currently necessary to work with big data.

Notes

  1. International Data Corporation, “The 2011 Digital Universe Study: Extracting Value from Chaos,” June, 2011. See also Christopher R. Johnson, “How Big is Big Data?” Lecture at the University of Michigan’s Cyber-Infrastructure Conference, November 7th, 2012.

  2. Ibid.

  3. The US National Science Foundation invested more than $15 million in Big Data projects in 2012, and will easily surpass this amount in upcoming years due to the development of new infrastructure for funding big data projects in collaboration with Britain’s Economic & Social Research Council, the Netherlands Organization for Scientific Research, and the Canada Foundation for Innovation, among many others.

  4. Jesse Alpert and Nissan Hajaj, “We knew the web was big…” Official Google Blog, July 25th, 2008 (http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html accessed January 2012).

  5. Pew Internet & American Life Project, February 1st, 2012.

  6. “Social Networking Popular Across Globe,” Pew Research Global Attitudes Project, December 12, 2012.

  7. Moreover, the US Library of Congress recently announced plans to release a database of every single Twitter message ever made. Current estimates place the total number of tweets that might be archived at more than 170 billion.

  8. Web-scraping technologies have facilitated the collection of remarkably large datasets. Golder and Macy (2011), for example, recently conducted a study of more than 500 million Twitter messages produced in more than 84 countries over a two-year period.
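
To make the basic workflow concrete, the following is a minimal web-scraping sketch in Python using the requests and BeautifulSoup libraries. The URL is a hypothetical placeholder, and this is not the pipeline Golder and Macy used.

```python
# A minimal web-scraping sketch; the URL and page structure are hypothetical.
import requests
from bs4 import BeautifulSoup

def scrape_page(url):
    """Download a page and return its visible paragraph text."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    return [p.get_text(strip=True) for p in soup.find_all("p")]

if __name__ == "__main__":
    paragraphs = scrape_page("https://example.org/some-archive-page")
    print(f"Collected {len(paragraphs)} paragraphs of text")
```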

  9. Though access to the full text of the Google Books archive is limited by paywalls designed to protect copyright, Google has released the entire dataset in “ngram” format, which allows scholars to analyze it via the automated text analysis tools discussed in further detail below.
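
As a brief illustration, the sketch below loads a local extract of the ngram data with pandas. The file name is a placeholder, and the tab-separated column layout (ngram, year, match count, volume count) is an assumption based on the publicly documented releases.

```python
# Sketch of loading a Google Books ngram extract with pandas.
# Column names assume the documented tab-separated layout; the path is a placeholder.
import pandas as pd

ngrams = pd.read_csv(
    "googlebooks-eng-all-1gram-sample.tsv",  # hypothetical local extract
    sep="\t",
    names=["ngram", "year", "match_count", "volume_count"],
)

# Trace the yearly frequency of a single term across the archive.
term = ngrams[ngrams["ngram"] == "assimilation"]
print(term.groupby("year")["match_count"].sum().head())
```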

  10. See, for example, the Dataverse Network, the Inter-university Consortium for Political and Social Research, and the United Kingdom’s Qualidata archive.

  11. The neologism “big data” has come to refer to many different types of data. Here, I use the term to refer to the increasingly large volume of text-based data that is often, though not always, produced through digital sources. As the remainder of this manuscript describes, these data are also unique because they are “naturally occurring,” unlike survey data, which result from the intrusion of researchers into everyday life.

  12. Exceptions described in additional detail below include Franzosi (2004), Lewis et al. (2008), Bail (2012), Bail (forthcoming) and several other works in progress.

  13. “Real time” refers to the collection, presentation, or analysis of data at or very near the time it is being produced by social actors.

  14. For a technical overview of techniques designed for analysis of Big Data, see Manning and Schuetze (1999).

  15. For an overview, see Franzosi (2009).

  16. See also Ghaziani and Baldassarri (2011).

  17. See also Mark (2003).

  18. One exception is Evans and Kay’s (2008) study of field overlap.

  19. Exceptions include Mohr and Guerra-Pearson (2010) and Bail (2012).

  20. While automated data extraction methods are particularly useful for mapping the contours of discursive fields, it is important to note that such techniques do not capture the deeper preconscious cultural elements that undergird social fields as Bourdieu and others have theorized them (e.g., Bourdieu 1990; Fligstein and McAdam 2011; Martin 2003). I return to the question of whether big data techniques can be leveraged to classify such cultural elements in the following section, as well as in the discussion and conclusion.

  21. For example, one might define a discursive field by identifying all texts with a certain set of keywords or within a certain search index offered by text archives.
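
A minimal sketch of this keyword-based approach to bounding a discursive field appears below; the corpus and keyword set are hypothetical illustrations.

```python
# Sketch: define a discursive field as all documents matching a keyword set.
KEYWORDS = {"islam", "muslim", "shariah"}

def in_field(document, keywords=KEYWORDS):
    """Return True if the document mentions any field-defining keyword."""
    tokens = set(document.lower().split())
    return bool(tokens & keywords)

corpus = [
    "Press release about Muslim civic organizations in the United States",
    "An unrelated statement about agricultural subsidies",
]
field_texts = [doc for doc in corpus if in_field(doc)]
print(f"{len(field_texts)} of {len(corpus)} documents fall within the field")
```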

  22. Facebook’s API requires user-authentication to access these data. Therefore, one must either access only publicly available data or obtain an authentication token from a Facebook page’s owner. Elsewhere, I argue that app-based technologies are the most promising data collection tools to overcome such challenges. See Bail (2013b).

  23. For a recent review of this literature, see Lamont (2012).

  24. Notable exceptions discussed in further detail below include Mohr (1998), Franzosi (2004), Bearman et al. (1999), Bearman and Stovel (2000), Smith (2007), and Bail (2012).

  25. Mohr (1998) made early calls for cultural sociologists to adopt these methods to classify meaning structures, yet they were mostly ignored even as they became widely used by cognitive anthropologists (e.g., D’Andrade 1995).

  26. For an overview of this field, see Blei (2012).

  27. For a technical overview of LDA, see Blei et al. (2003).

  28. A number of scholars have proposed validity measures for LDA, most recently Blei (2012). Many of these emphasize comparisons of topic models via log-likelihoods or harmonic means, yet proponents of topic modeling generally agree that models must also be validated via qualitative inspection of individual topics within subsets of large samples.
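
The sketch below illustrates this two-step validation on a toy corpus using the gensim implementation of LDA: a quantitative perplexity check followed by qualitative inspection of the most probable words in each topic. The documents are illustrative only.

```python
# Sketch of fitting and inspecting an LDA topic model with gensim.
from gensim import corpora, models

documents = [
    ["culture", "media", "discourse", "islam"],
    ["market", "price", "firm", "discourse"],
    ["culture", "symbol", "boundary", "media"],
]
dictionary = corpora.Dictionary(documents)
bow_corpus = [dictionary.doc2bow(doc) for doc in documents]

lda = models.LdaModel(bow_corpus, num_topics=2, id2word=dictionary, passes=10)

# Quantitative check: (in-sample) log perplexity of the fitted model.
print("Log perplexity:", lda.log_perplexity(bow_corpus))

# Qualitative check: inspect the most probable words within each topic.
for topic_id, words in lda.print_topics(num_words=4):
    print(topic_id, words)
```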

  29. For example, see Blei and Lafferty (2006), Wallach (2006), Chang et al. (2009), and Hopkins and King (2010).

  30. See also Grimmer (2010) and Quinn et al. (2010).

  31. In particular, Hopkins and King (2010) argue that coding more than 500 documents produces diminishing returns in the reliability of automated text analysis.
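
For illustration, the following is a sketch of a conventional supervised text-classification workflow in scikit-learn, in which a small hand-coded training set is extrapolated to unlabeled documents. It is not Hopkins and King's proportion-estimation method, and the texts and labels are toy examples.

```python
# Sketch of supervised text classification: hand-coded labels extrapolated
# to new documents. Texts and coding scheme are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hand-coded training set (in practice, several hundred documents).
train_texts = [
    "Organ donation saves lives and families celebrate donors.",
    "The waiting list grows while policymakers debate incentives.",
    "Critics argue presumed consent violates individual autonomy.",
    "Advocates share emotional stories of transplant recipients.",
]
train_labels = ["emotional", "policy", "policy", "emotional"]

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(train_texts, train_labels)

# Extrapolate the hand-coded scheme to an unlabeled document.
print(classifier.predict(["A new bill would change how donor registries operate."]))
```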

  32. For example, a supervised topic model can be used to determine whether websites should be included in a directed web-crawl such as SnowCrawl, in order to capture sites that discuss a theme or topic without using a single keyword.
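
The sketch below shows the general logic of such a directed crawl: a relevance function, which could be replaced by any trained classifier or supervised topic model, decides whether a page's links are followed. SnowCrawl itself is not used here, and is_on_topic is a hypothetical placeholder.

```python
# Sketch of a classifier-guided ("directed") web crawl. is_on_topic() stands
# in for any supervised model; seed URLs and the relevance rule are placeholders.
from collections import deque
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def is_on_topic(text):
    """Placeholder relevance check; swap in a trained classifier or topic model."""
    return "organ donation" in text.lower()

def directed_crawl(seed_urls, max_pages=50):
    queue, seen, relevant = deque(seed_urls), set(seed_urls), []
    while queue and len(seen) < max_pages:
        url = queue.popleft()
        try:
            soup = BeautifulSoup(requests.get(url, timeout=15).text, "html.parser")
        except requests.RequestException:
            continue
        if is_on_topic(soup.get_text(" ", strip=True)):
            relevant.append(url)
            # Only follow links from pages the model deems relevant.
            for link in soup.find_all("a", href=True):
                nxt = urljoin(url, link["href"])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return relevant
```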

  33. See Baumer et al. (2013).

  34. Consider, for example, the diaries analyzed in Goffman (1963) or the newspaper clippings in Goffman (1974). Textual descriptions of face-work or other unspoken forms of bodily interaction, recorded in field notes, could also potentially be analyzed using topic models.

  35. For a discussion of the challenges of achieving high levels of inter-coder reliability in cultural analysis, see Krippendorff (2003).

  36. But see Cerulo (1998), Wagner-Pacifici (2010), and Bail (2012).

  37. Still, historical analyses with big data are limited by the availability of texts from earlier periods that are amenable to digitization. This presents a number of important limitations, including pervasive illiteracy during early historical periods as well as the tendency for only elite accounts of historical events to survive the passage of time. Yet comparative-historical sociologists face these problems regardless of whether they are working with big data. Furthermore, primary documents obtained through archival analysis can be easily digitized through photographs, scanning, and text-recognition technologies.
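
As a simple illustration of the last point, the sketch below extracts text from a scanned document with the pytesseract wrapper around the Tesseract OCR engine (which must be installed separately); the file path is a placeholder.

```python
# Sketch of digitizing a scanned archival document via optical character
# recognition. Requires the Tesseract engine; the file path is a placeholder.
from PIL import Image
import pytesseract

scan = Image.open("archival_document_scan.png")
text = pytesseract.image_to_string(scan)
print(text[:500])  # inspect the first few hundred recovered characters
```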

  38. If key actors or events are already known, simple keyword searches or Global Regular Expression Print (GREP) commands may be used to identify them. If actors or events are not known, they can be identified through keyword counts that remove common words such as “the” or “and.” Once actors or events are defined, topic models may be used to identify them as well. A number of software tools, such as the Natural Language Toolkit and the Stanford Parser, have also been developed recently to identify names within big data without such intermediary steps.
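
The sketch below illustrates both intermediary steps with the Natural Language Toolkit: a stop-word-filtered keyword count and an off-the-shelf named-entity chunker. The sample sentence is invented, and the required NLTK resources must be downloaded beforehand (e.g., via nltk.download).

```python
# Sketch: (1) keyword counts with common "stop words" removed and
# (2) named-entity extraction with NLTK. Tokenizer, stopword, tagger, and
# chunker resources must be downloaded in advance with nltk.download().
from collections import Counter

import nltk
from nltk.corpus import stopwords

text = "Mayor Bloomberg addressed protesters near Zuccotti Park in New York."

tokens = nltk.word_tokenize(text)
common = set(stopwords.words("english"))
counts = Counter(t.lower() for t in tokens if t.isalpha() and t.lower() not in common)
print(counts.most_common(5))

# Named-entity recognition: chunk part-of-speech-tagged tokens.
tree = nltk.ne_chunk(nltk.pos_tag(tokens))
names = [" ".join(word for word, tag in chunk) for chunk in tree if hasattr(chunk, "label")]
print(names)
```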

  39. See also Sewell (1996) and Wagner-Pacifici (2010).

  40. On the concept of cultural holes, see also Lizardo (in this issue).

  41. It is also worth noting that texts that cannot be collected because they are not in the public domain may ultimately have less impact upon the evolution of broader cultural domains precisely because they are hidden from public view. This underlies a broader pragmatist argument about the need to focus attention upon consequences of social action (e.g., Johnson-Hanks et al. 2011; Tavory and Timmermans 2013). An interesting analogue is the debate about the social construction of ethnicity via the enumeration of different groups by the US Census (cf. Loveman and Muniz 2007). I thank Andy Perrin for bringing this issue to my attention.

  42. For a detailed analysis of conceptual and methodological ambiguities in the measurement of frames, see Scheufele (1999).

  43. See Baumer et al. (2013).

  44. Efforts are currently underway to make the collection and analysis of big data possible for those without a computer programming background. Gary King and colleagues are producing a web-based tool named “Consilience” that will enable cluster analysis of unstructured text. Primitive forms of topic modeling and sentiment analysis are also available via a variety of web-based software programs, such as www.discovertext.com. Finally, there are a variety of high-quality tutorials available online for those who wish to develop basic programming skills for working with big data. For example, see http://nealcaren.web.unc.edu/big-data/ and http://www.chrisbail.net/p/software.html. A complete list of tutorials is available at http://www.chrisbail.net/p/big-data.html.

  45. See Steve Lohr, “The Age of Big Data,” The New York Times, February 11, 2012.

References

  • Abbott, A. (1995). Things of boundaries. Social Research, 62(4), 857–882.

  • Abbott, A. (1997). On the concept of turning point. Comparative Social Research, 16, 85–106.

  • Abbott, A. (2001). Chaos of disciplines. Chicago: University of Chicago Press.

  • Agnew, J., Gillespie, T., Gonzalez, J., & Min, B. (2008). Baghdad nights: Evaluating the US military “surge” using nighttime light signatures.

  • Alexander, J. (2006). The civil sphere. Oxford: Oxford University Press.

  • Alexander, J., & Smith, P. (2001). The strong program in cultural theory: Elements of a structural hermeneutics. In J. H. Turner (Ed.), Handbook of sociological theory (pp. 135–150). New York: Springer.

  • Armstrong, E. A. (2002). Forging gay identities: Organizing sexuality in San Francisco, 1950–1994. Chicago: University of Chicago Press.

  • Bail, C. (2012). The fringe effect: civil society organizations and the evolution of media discourse about Islam, 2001–2008. American Sociological Review, 77(7), 855–879.

  • Bail, C. (2013a). Winning minds through hearts: Organ donation advocacy, emotional feedback, and social media. Working Paper, Department of Sociology, University of North Carolina at Chapel Hill.

  • Bail, C. (2013b). Taming big data: Apps and the future of survey research. Working Paper, Department of Sociology, University of North Carolina at Chapel Hill.

  • Bail, C. A. (forthcoming). Terrified: How anti-Muslim organizations became mainstream. Princeton: Princeton University Press.

  • Barth, F. (1969). Ethnic groups and boundaries: The social organization of cultural difference. Boston: Little, Brown.

  • Bartley, T. (2007). How foundations shape social movements: the construction of an organizational field and the rise of forest certification. Social Problems, 54(3), 229–255.

  • Baumer, E. P. S., Polletta, F., Pierski, N., Celaya, C., Rosenblatt, K., & Gay, G. K. (2013). Developing computational supports for frame reflection. Retrieved from http://hdl.handle.net/2142/38374.

  • Bearman, P., & Stovel, K. (2000). Becoming a Nazi: a model for narrative networks. Poetics, 27(2), 69–90.

  • Bearman, P., Faris, R., & Moody, J. (1999). Blocking the future: new solutions for old problems in historical social science. Social Science History, 23(4), 501–533.

  • Benford, R., & Snow, D. (2003). Framing processes and social movements: An overview and assessment.

  • Biernacki, R. (2012). Reinventing evidence in social inquiry: Decoding facts and variables. New York: Palgrave Macmillan.

  • Blei, D. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77–84.

  • Blei, D., & Lafferty, J. (2006). Dynamic topic models. Proceedings of the 23rd International Conference on Machine Learning, ACM, New York, 113–120.

  • Blei, D., & Lafferty, J. (2007). A correlated topic model of science. The Annals of Applied Statistics, 1(1), 17–35.

  • Blei, D., & McAuliffe, J. (2010). Supervised topic models. arXiv:1003.0783. Retrieved from http://arxiv.org/abs/1003.0783.

  • Blei, D., Ng, A., & Jordan, M. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.

  • Bollen, J., Mao, H., & Zeng, X. (2011). Twitter mood predicts the stock market. Journal of Computational Science, 2(1), 1–8. doi:10.1016/j.jocs.2010.12.007.

  • Bourdieu, P. (1975). The specificity of the scientific field and the social conditions of the progress of reason. Social Science Information, 14(6), 1–19.

  • Bourdieu, P. (1985). The social space and the genesis of groups. Theory and Society, 14(6), 723–744. doi:10.1007/BF00174048.

  • Bourdieu, P. (1990). Homo Academicus (1st ed.). Stanford: Stanford University Press.

  • Cerulo, K. A. (1998). Deciphering violence: The cognitive structure of right and wrong (1st ed.). New York: Routledge.

  • Chang, J., Boyd-Graber, J., Gerrish, S., Wang, C., & Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models.

  • Collins, R. (2013). Solving the Mona Lisa smile, and other developments in visual micro-sociology. Working Paper, Department of Sociology, University of Pennsylvania.

  • D’Andrade, R. G. (1995). The development of cognitive anthropology. Cambridge: Cambridge University Press.

  • DiMaggio, P. (1997). Culture and cognition. Annual Review of Sociology, 23, 263–287.

  • DiMaggio, P., & Bonikowski, B. (2008). Make money surfing the web? The impact of internet use on the earnings of U.S. workers. American Sociological Review, 73(2), 227–250. doi:10.1177/000312240807300203.

  • DiMaggio, P., Hargittai, E., Neuman, W. R., & Robinson, J. (2001). Social implications of the internet. Annual Review of Sociology, 27, 307–336.

  • DiMaggio, P., Nag, M., & Blei, D. (forthcoming). Exploiting affinities between topic modeling and the sociological perspective on culture: Application to newspaper coverage of government arts funding in the U.S. Poetics.

  • Douglas, M. (1966). Purity and danger: An analysis of concepts of pollution and taboo. New York: Praeger.

  • Douglas, M. (1986). How institutions think. Syracuse: Syracuse University Press.

  • Eliasoph, N., & Lichterman, P. (2003). Culture in interaction. American Journal of Sociology, 108(4), 735–794.

  • Espeland, W. N., & Stevens, M. L. (1998). Commensuration as a social process. Annual Review of Sociology, 24, 313–343. doi:10.2307/223484.

  • Evans, R., & Kay, T. (2008). How environmentalists “Greened” trade policy: strategic action and the architecture of field overlap. American Sociological Review, 73(6), 970–991. doi:10.1177/000312240807300605.

  • Eyal, G. (2009). The space between fields. Working Paper, Center for Comparative Research, Yale University.

  • Fligstein, N., & McAdam, D. (2011). Toward a general theory of strategic action fields. Sociological Theory, 29(1), 1–26.

  • Foucault, M. (1970). The order of things: An archaeology of the human sciences (1st ed.). New York: Vintage.

  • Franzosi, R. (2004). From words to numbers: Narrative, data, and social science. Cambridge: Cambridge University Press.

  • Franzosi, R. (2009). Quantitative narrative analysis (1st ed.). Thousand Oaks: SAGE Publications.

  • Gaby, S., & Caren, N. (2012). Occupy online: how cute old men and Malcolm X recruited 400,000 U.S. users to OWS on Facebook. Social Movement Studies, 11, 367–374.

  • Geertz, C. (1973). The interpretation of cultures: Selected essays. New York: Basic Books.

  • Ghaziani, A. (2009). An “amorphous mist”? The problem of measurement in the study of culture. Theory and Society, 38(6), 581–612. doi:10.1007/s11186-009-9096-2.

  • Ghaziani, A., & Baldassarri, D. (2011). Cultural anchors and the organization of differences. American Sociological Review, 76(2), 179–206. doi:10.1177/0003122411401252.

  • Gieryn, T. F. (1999). Cultural boundaries of science: Credibility on the line (1st ed.). Chicago: University of Chicago Press.

  • Goffman, E. (1963). Stigma: Notes on the management of spoiled identity. New York: Touchstone.

  • Goffman, E. (1974). Frame analysis. Cambridge: Harvard University Press.

  • Gold, M. K. (2012). Debates in the digital humanities. Minneapolis: University of Minnesota Press.

  • Golder, S. A., & Macy, M. W. (2011). Diurnal and seasonal mood vary with work, sleep, and day length across diverse cultures. Science, 333(6051), 1878–1881. doi:10.1126/science.1202775.

  • Gong, A. (2011). An automated snowball census of the political web. SSRN eLibrary. Retrieved from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1832024.

  • Grimmer, J. (2010). A Bayesian hierarchical topic model for political texts: measuring expressed agendas in senate press releases. Political Analysis, 18(1), 1–35.

  • Grimmer, J., & King, G. (2011). General purpose computer-assisted clustering and conceptualization. Proceedings of the National Academy of Sciences, 108(7), 2643–2650. doi:10.1073/pnas.1018067108.

  • Griswold, W., & Wright, N. (2004). Wired and well read. In Society online: The internet in context. New York: Sage.

  • Hopkins, D. (2013). The exaggerated life of death panels: The limits of framing effects in the 2009–2012 health care debate. Working Paper, SSRN.

  • Hopkins, D. J., & King, G. (2010). A method of automated nonparametric content analysis for social science. American Journal of Political Science, 54(1), 229–247. doi:10.1111/j.1540-5907.2009.00428.x.

  • Ignatow, G., & Mihalcea, R. (2013). Text mining for comparative cultural analysis. Working Paper, Department of Sociology, University of North Texas.

  • Johnson-Hanks, J., Bachrach, C., Morgan, P., & Kohler, H.-P. (2011). Understanding family change and variation: toward a theory of conjunctural action. Understanding Population Trends and Processes, 5, 1–179.

  • Kaufman, J. (2004). Endogenous explanation in the sociology of culture. Annual Review of Sociology, 30, 335–357.

  • King, G. (2011). Ensuring the data-rich future of the social sciences. Science, 331, 719–721.

  • Krippendorff, K. H. (2003). Content analysis: An introduction to its methodology (2nd ed.). Thousand Oaks: Sage Publications.

  • Lamont, M. (1992). Money, morals, and manners: The culture of the French and American upper-middle class. Chicago: University of Chicago Press.

  • Lamont, M. (2000). The dignity of working men: Morality and the boundaries of race, class, and immigration. New York: Russell Sage.

  • Lamont, M. (2012). Toward a comparative sociology of valuation and evaluation. Annual Review of Sociology, 38, 201–221.

  • Lamont, M., & White, P. (2009). The evaluation of systematic qualitative research in the social sciences. Report of the U.S. National Science Foundation.

  • Lan, T., & Raptis, M. (2013). From subcategories to visual composites: A multi-level framework for object recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Latour, B. (1988). How to follow scientists and engineers through society. Cambridge: Harvard University Press.

  • Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabasi, A.-L., Brewer, D., et al. (2009). Computational social science. Science, 323(5915), 721–723. doi:10.1126/science.1167742.

  • Lewis, K., Kaufman, J., Gonzalez, M., Wimmer, A., & Christakis, N. (2008). Tastes, ties, and time: a new social network dataset using Facebook.com. Social Networks, 30(4), 330–342. doi:10.1016/j.socnet.2008.07.002.

  • Lieberson, S. (2000). A matter of taste: How names, fashions, and culture change. New Haven: Yale University Press.

  • Livne, A., Simmons, M. P., Adar, E., & Adamic, L. (2011). The party is over here: Structure and content in the 2010 election. Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, 201–209.

  • Loveman, M., & Muniz, J. (2007). How Puerto Rico became white: boundary dynamics and inter-census racial classification. American Sociological Review, 72, 915–939.

  • Manning, C. D., & Schuetze, H. (1999). Foundations of statistical natural language processing (1st ed.). Cambridge: The MIT Press.

  • Mark, N. P. (2003). Culture and competition: homophily and distancing explanations for cultural niches. American Sociological Review, 68(3), 319–345. doi:10.2307/1519727.

  • Martin, J. L. (2003). What is field theory? American Journal of Sociology, 109(1), 1–49.

  • Medvetz, T. (2012). The rise of think tanks in America: Merchants of policy and power. Chicago: University of Chicago Press.

  • Merton, R. (1949). Social theory and social structure. New York: The Free Press.

  • Mische, A. (2008). Partisan publics: Communication and contention across Brazilian youth activist networks. Princeton: Princeton University Press.

  • Mohr, J. (1998). Measuring meaning structures. Annual Review of Sociology, 24, 345–370.

  • Mohr, J., & Guerra-Pearson, F. (2010). The duality of niche and form: The differentiation of institutional space in New York City, 1888–1917. In Categories in markets: Origins and evolution (pp. 321–368). New York: Emerald Group Publishing.

  • Mohr, J., Singh, A., & Wagner-Pacifici, R. (2013). CulMINR: Cultural meanings from the interpretation of narrative and rhetoric: A dynamic network approach to hermeneutic mining of large text corpora. Working Paper, Department of Sociology, University of California, Santa Barbara.

  • Mohr, J., Wagner-Pacifici, R., Breiger, R., & Bogdanov, P. (2014). Graphing the grammar of motives in National Security Strategies: cultural interpretation, automated text analysis, and the drama of global politics. Poetics, 41(6), 670–700.

  • Moretti, F. (2013). Distant reading. London: Verso Books.

  • Pachucki, M. A., & Breiger, R. L. (2010). Cultural holes: beyond relationality in social networks and culture. Annual Review of Sociology, 36(1), 205–224. doi:10.1146/annurev.soc.012809.102615.

  • Padgett, J. F., & Powell, W. W. (2012). The emergence of organizations and markets. Princeton: Princeton University Press.

  • Paul, M. J., & Dredze, M. (2011). You are what you tweet: Analyzing Twitter for public health. Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media.

  • Quinn, K. M., Monroe, B. L., Colaresi, M., Crespin, M. H., & Radev, D. R. (2010). How to analyze political attention with minimal assumptions and costs. American Journal of Political Science, 54(1), 209–228. doi:10.1111/j.1540-5907.2009.00427.x.

  • Scheufele, D. A. (1999). Framing as a theory of media effects. The Journal of Communication, 49(1), 103–122. doi:10.1111/j.1460-2466.1999.tb02784.x.

  • Sewell, W. (1996). Historical events as transformations of structures: inventing revolution at the Bastille. Theory and Society, 25(6), 841–881. doi:10.1007/BF00159818.

  • Smith, T. (2007). Narrative boundaries and the dynamics of ethnic conflict and conciliation. Poetics, 35, 22–46.

  • Swidler, A. (1986). Culture in action: symbols and strategies. American Sociological Review, 51(2), 273–286.

  • Swidler, A. (1995). Cultural power and social movements. In Social movements and culture. London: Routledge.

  • Tangherlini, T. R., & Leonard, P. (2013). Trawling in the Sea of the Great Unread: sub-corpus topic modeling and Humanities research. Poetics. doi:10.1016/j.poetic.2013.08.002.

  • Tavory, I., & Timmermans, S. (2013). Consequences in action: A pragmatist approach to causality in ethnography. Working Paper, New School for Social Research.

  • Vaisey, S., & Lizardo, O. (2010). Can cultural worldviews influence network composition? Social Forces, 88(4), 1595–1618. doi:10.1353/sof.2010.0009.

  • Wagner-Pacifici, R. (2010). Theorizing the restlessness of events. American Journal of Sociology, 115(5), 1351–1386.

  • Wallach, H. (2006). Topic modeling: Beyond bag of words. Proceedings of the 23rd International Conference on Machine Learning.

  • Weber, K. (2005). A toolkit for analyzing corporate cultural toolkits. Poetics, 33(3–4), 227–252. doi:10.1016/j.poetic.2005.09.011.

  • Wuthnow, R. (1993). Communities of discourse: Ideology and social structure in the reformation, the enlightenment, and European socialism. Cambridge: Harvard University Press.

  • Zelizer, V. A. R. (1985). Pricing the priceless child: The changing social value of children. Princeton: Princeton University Press.

Acknowledgments

I thank Elizabeth Armstrong, Alex Hanna, Gabe Ignatow, Charles Kurzman, Brayden King, Jennifer Lena, John Mohr, Terry McDonnell, Andy Perrin, and Steve Vaisey for helpful comments on previous drafts. The Robert Wood Johnson Foundation and the Odum Institute at the University of North Carolina provided financial support for this research.

Author information

Correspondence to Christopher A. Bail.

About this article

Cite this article

Bail, C.A. The cultural environment: measuring culture with big data. Theor Soc 43, 465–482 (2014). https://doi.org/10.1007/s11186-014-9216-5
