Twitter hashtag analysis of movie premieres in February 2022 in the USA

Version 2 2024-02-07, 11:59
Version 1 2024-02-07, 11:58
dataset
posted on 2024-02-07, 11:59, authored by Víctor Yeste

Author: Víctor Yeste, Universitat Politècnica de València.

This work is an exploratory, quantitative, non-experimental study that uses inductive inference and a longitudinal follow-up. It analyzes movie data and the tweets that users published with the official Twitter hashtags of movie premieres during the week before, the week of, and the week after each release date.

The scope of the study is the set of movies released in the USA in February 2022; the object of study comprises those movies and the tweets that refer to each film in the three weeks closest to its premiere date. The collected tweets were classified by the week in which they were published, a time dimension called timepoint: the week before the release date is timepoint 1, the week of the release date is timepoint 2, and the week immediately afterward is timepoint 3. A second dimension is whether the movie is a domestic production: a movie is designated as domestic if one of its countries of origin is the United States.
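The two dimensions described above can be sketched in Python as follows. This is only an illustration, not the procedure used to build the dataset: the description does not state how week boundaries were drawn, so the sketch assumes 7-day windows anchored on the release date, and it assumes the United States appears in the countries field under the label "United States". The function names `timepoint` and `is_domestic` are hypothetical.

```python
from datetime import date, datetime
from typing import Optional


def timepoint(release_date: date, created_at: datetime) -> Optional[int]:
    """Return 1, 2, or 3 for a tweet, or None if it falls outside the window.

    ASSUMPTION: each "week" is a 7-day window counted from the release date.
    The dataset description does not define the exact boundaries.
    """
    offset_days = (created_at.date() - release_date).days
    if -7 <= offset_days < 0:
        return 1  # week before the release date
    if 0 <= offset_days < 7:
        return 2  # week of the release date
    if 7 <= offset_days < 14:
        return 3  # week immediately afterward
    return None


def is_domestic(countries: str, us_label: str = "United States") -> bool:
    """True if the semicolon-separated countries field includes the USA.

    ASSUMPTION: the label used for the USA in the data is `us_label`.
    """
    return any(c.strip() == us_label for c in countries.split(";"))
```

If the data were instead grouped by calendar weeks, only the three range checks in `timepoint` would need to change.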

The chosen variables are organized in two data tables, one for the movies and one for the collected tweets.

Variables related to the movies:

  • id: Internal id of the movie
  • name: Title of the movie
  • hashtag: Official hashtag of the movie
  • countries: List of countries of the movie, separated by a semicolon
  • mpaa: Film rating assigned under the Motion Picture Association of America system. It is a completely voluntary rating system, and ratings have no legal standing. The current ratings are G (general audiences), PG (parental guidance suggested), PG-13 (parents strongly cautioned), R (restricted, under 17 requires accompanying parent or adult guardian) and NC-17 (no one 17 and under admitted) (Film Ratings - Motion Picture Association, n.d.)
  • genres: List of genres of the movie, e.g., Action or Thriller, separated by a semicolon
  • release_date: Release date of the movie in a format YYYY-MM-DD
  • opening_grosses: Gross, in US dollars, that the movie earned during its opening (the first week after the release date)
  • opening_theaters: Number of US theaters that showed the movie during its opening (the first week after the release date)
  • rating_avg: Average rating of the movie
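As a minimal sketch of how one record with these variables might be read, the Python below converts a row of raw values into a typed object and splits the semicolon-separated fields. The `Movie` class, the helper names, and the chosen column types are assumptions made for illustration; the example row contains invented placeholder values, not data from the dataset.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Movie:
    id: int
    name: str
    hashtag: str
    countries: List[str]
    mpaa: str
    genres: List[str]
    release_date: date
    opening_grosses: float
    opening_theaters: int
    rating_avg: float


def split_list(value: str) -> List[str]:
    """Split a semicolon-separated field and trim surrounding whitespace."""
    return [item.strip() for item in value.split(";") if item.strip()]


def parse_movie(row: dict) -> Movie:
    """Convert one row of raw values (assumed to be strings) into a Movie."""
    return Movie(
        id=int(row["id"]),
        name=row["name"],
        hashtag=row["hashtag"],
        countries=split_list(row["countries"]),
        mpaa=row["mpaa"],
        genres=split_list(row["genres"]),
        release_date=date.fromisoformat(row["release_date"]),  # YYYY-MM-DD
        opening_grosses=float(row["opening_grosses"]),
        opening_theaters=int(row["opening_theaters"]),
        rating_avg=float(row["rating_avg"]),
    )


# Placeholder row: every value here is invented for illustration only.
example_row = {
    "id": "1",
    "name": "Example Movie",
    "hashtag": "#ExampleMovie",
    "countries": "United States; Canada",
    "mpaa": "PG-13",
    "genres": "Action;Thriller",
    "release_date": "2022-02-04",
    "opening_grosses": "1000000",
    "opening_theaters": "100",
    "rating_avg": "6.5",
}
movie = parse_movie(example_row)
```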

Variables related to the tweets:

  • id: Internal id of the tweet
  • status_id: Twitter id of the tweet
  • movie_id: Internal id of the movie
  • timepoint: Week, relative to the movie premiere, in which the tweet was published. “1” is the week before the movie release, “2” is the week of the movie release, and “3” is the week after the movie release.
  • author_id: Twitter id of the author of the tweet
  • created_at: Date and time of the tweet, with format “YYYY-MM-DD HH:MM:SS”
  • quote_count: Number of the tweet’s quotes
  • reply_count: Number of the tweet’s replies
  • retweet_count: Number of the tweet’s retweets
  • like_count: Number of the tweet’s likes
  • sentiment: Sentiment analysis of the tweet’s content with a range from -1 (negative) to 1 (positive)
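Because each tweet carries a movie_id and a timepoint, the tweet table can be summarized per movie and per week. The sketch below groups tweets on those two keys and computes a tweet count, the mean sentiment, and a simple interaction total. The function name `summarize` and the "interactions" sum are illustrative choices, not measures defined by the dataset's author, and the sample tweets are invented placeholders.

```python
from collections import defaultdict
from statistics import mean


def summarize(tweets):
    """Group tweets by (movie_id, timepoint) and compute simple summaries."""
    groups = defaultdict(list)
    for tweet in tweets:
        groups[(tweet["movie_id"], tweet["timepoint"])].append(tweet)

    summary = {}
    for key, items in groups.items():
        summary[key] = {
            "tweets": len(items),
            "mean_sentiment": mean(t["sentiment"] for t in items),
            # Illustrative total of the four interaction counters.
            "interactions": sum(
                t["quote_count"] + t["reply_count"]
                + t["retweet_count"] + t["like_count"]
                for t in items
            ),
        }
    return summary


# Placeholder tweets: all values are invented for illustration only.
sample_tweets = [
    {"movie_id": 1, "timepoint": 2, "sentiment": 0.5,
     "quote_count": 1, "reply_count": 2, "retweet_count": 3, "like_count": 4},
    {"movie_id": 1, "timepoint": 2, "sentiment": -0.5,
     "quote_count": 0, "reply_count": 0, "retweet_count": 1, "like_count": 1},
    {"movie_id": 1, "timepoint": 3, "sentiment": 0.25,
     "quote_count": 0, "reply_count": 1, "retweet_count": 0, "like_count": 2},
]
result = summarize(sample_tweets)
```

The movie_id key can then be matched against the id column of the movie table to attach attributes such as genres or rating_avg to each summary.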

This dataset contributed to the following book chapters:

  • Yeste, Víctor; Calduch-Losa, Ángeles (2022). Genre classification of movie releases in the USA: Exploring data with Twitter hashtags. In Narrativas emergentes para la comunicación digital (pp. 1012-1044). Dykinson, S. L.
  • Yeste, Víctor; Calduch-Losa, Ángeles (2022). Exploratory Twitter hashtag analysis of movie premieres in the USA. In Desafíos audiovisuales de la tecnología y los contenidos en la cultura digital (pp. 169-187). McGraw-Hill Interamericana de España S.L.
  • Yeste, Víctor; Calduch-Losa, Ángeles (2022). ANOVA to study movie premieres in the USA and online conversation on Twitter. The case of rating average using data from official Twitter hashtags. In El mapa y la brújula. Navegando por las metodologías de investigación en comunicación (pp. 151-168). Editorial Fragua.

