Dataset on usage and engagement patterns for Facebook Live sellers in Thailand

This article describes a Comma Separated Values (CSV) dataset consisting of 7050 Facebook posts of various types (text, deferred and live videos, images). These posts were extracted from the Facebook pages of 10 Thai fashion and cosmetics retail sellers, from March 2012 to June 2018. The dataset was collected via the Facebook API.


Specifications
Value of the data
• The dataset could serve as a basis for correlation analysis and principal component analysis of customer engagement in social commerce through a novel sales channel, video streaming.
• Researchers and practitioners in marketing
• The data can be further used to characterize exceptionally performing posts, that is, posts that are statistical outliers in terms of engagement metrics.
• Live streaming commerce is very well developed in Thailand. Indeed, the country tops the world ranking for the proportion of domestic viewers of live streaming [4]. Moreover, Thailand has the world's highest proportion of shoppers buying directly from social media [5] and is considered the most advanced market in conversational commerce, whereby people purchase items from businesses via messaging platforms [6].

Data description
Before the advent of live streaming, statistical studies of customer engagement associated with Facebook posts of different types [2, 7, 8, 9] considered datasets of status updates, links, videos, and photos, with videos being exclusively of the deferred type. The commonly observed pattern is that photos were the most frequently used medium and typically generated the most likes and comments, followed by videos. This hierarchy has been observed not only for private brand pages [2, 7, 8], but also for posts on cancer information from a public research organization [9].
In addition to traditional types of posts, the dataset described in the present paper includes live videos. For each individual post (row), the columns of the dataset record the type of post, its date, and engagement metrics comprising shares, comments, and reactions [2], within which we distinguish the traditional "like" from the more recently introduced emoji reactions ("love", "wow", "haha", "sad" and "angry"), which reflect more varied sentiments than the more neutral "like" [3]. Descriptive statistics of the engagement metrics per post, for the Facebook pages of the 10 sellers considered in the dataset, are presented in Table 1. The individual pages considered in the dataset exhibit a wide range of values for engagement metrics, but also three main forms of interaction with their content. Indeed, we can observe that for some pages the primary form of engagement is through comments (e.g. Seller 1), whereas for others it can be shares (e.g. Seller 5) or likes (e.g. Seller 3). The more recently introduced emoji reactions ("love", "wow", "haha", "sad" and "angry") exhibit lower values overall because they were unavailable for posts prior to March 2016. However, the dataset shows that live videos (introduced around the same time, in April 2016) generate a high number of "love" reactions. In fact, the proposed dataset suggests that the introduction of Facebook Live videos drastically changed the statistical distribution of most engagement metrics, for all types of posts, and had a profound effect on the way followers interact with content. This can be observed in the dataset by studying the evolution of these metrics as time series. Figs. 1-4 present graphical representations of such time series for Comments, Shares, Emoji Reactions, and Likes, respectively.
In these figures, we can observe dramatically higher averages and maxima for the first three engagement metrics after the introduction of Facebook Live than before it. However, Likes do not appear to undergo the same changes.
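The before/after comparison described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the field names `published` and `num_comments`, the cutoff date of 1 April 2016 (chosen because the text places the introduction of Facebook Live in April 2016), and the toy rows are all assumptions made for the example and are not taken from the dataset.

```python
from datetime import datetime
from statistics import mean

# Assumed cutoff: the text places the introduction of Facebook Live in April 2016.
LIVE_INTRODUCED = datetime(2016, 4, 1)


def summarize_before_after(posts, metric, cutoff=LIVE_INTRODUCED):
    """Return {"before": (mean, max), "after": (mean, max)} for one metric.

    `posts` is an iterable of dicts holding a datetime under "published"
    and an integer count under `metric` (hypothetical field names).
    """
    groups = {"before": [], "after": []}
    for post in posts:
        key = "before" if post["published"] < cutoff else "after"
        groups[key].append(post[metric])
    return {
        key: (mean(values), max(values)) if values else (None, None)
        for key, values in groups.items()
    }


# Toy rows invented for illustration only; they are not real observations.
toy_posts = [
    {"published": datetime(2015, 6, 1), "num_comments": 4},
    {"published": datetime(2015, 9, 15), "num_comments": 10},
    {"published": datetime(2017, 2, 3), "num_comments": 120},
    {"published": datetime(2018, 1, 20), "num_comments": 300},
]

summary = summarize_before_after(toy_posts, "num_comments")
print(summary)
```

Running the same function on each metric in turn (comments, shares, emoji reactions, likes) would give one pair of averages and maxima per metric, which is the kind of contrast the figures display.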
The influence of Facebook Live on engagement metrics can be qualitatively explained by marketing theory. Indeed, the theory suggests stronger feelings and bonding between sellers and viewers, which ties in with previous findings concerning vividness and interactivity. These two factors are commonly used as a basis for studying user responses to different forms of online content. Both Pletikosa Cvijikj and Michahelles [7] and Luarn et al. [8] explain the higher engagement generated by photos with these two dimensions, as classically defined by Steuer [10]. They see vividness as "the extent to which a brand post stimulates various senses", whereas interactivity is "the degree to which users can influence the form and content of the media environment". Pletikosa Cvijikj and Michahelles [7] conclude that "vividness increases, while interactivity decreases the level of engagement over moderator posts, making photos the most appealing post media type" and that "providing entertaining and informative content significantly increases the level of engagement". Luarn et al. [8] find that "using social posts is likely to elicit comments and encourage the interaction of users".
In view of these findings, live videos represent a qualitative leap in terms of vividness, with content that is as close to real life as online content delivered through a screen can be. The interactivity of live videos is, however, limited to some extent, and individual viewers do not have much influence over the content (as opposed, e.g., to links that have users click, fill in forms and follow certain paths in the sitemap of a website). The live video medium also gives sellers real-time control over the content shared. A live video can be simultaneously entertaining, social, and informative, in reaction to the feedback of the mass of viewers. Therefore, the proposed dataset can serve as a basis for comparative studies on engagement with live videos versus older forms of social media posting, and can potentially refine marketers' understanding of the impact of vividness and interactivity on customer engagement.
In addition to reproducing the results of the related research article [11], which investigated the question through Principal Component Analysis, future studies relying on this dataset could investigate the influence of Facebook Live on engagement from a time perspective. Since the posts in the dataset are time-stamped, a potentially fruitful line of research could examine how the observed increase in engagement built up over time following the introduction of Facebook Live, as well as the seasonality of engagement metrics (i.e. which hours of the day, days of the week, and months of the year see more or less engagement).
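As a hedged sketch of the seasonality question raised above, the following Python snippet averages one engagement metric per hour of the day, day of the week, or month of the year. It relies only on the standard library; the field names `published` and `num_shares`, the helper `mean_by_period`, and the toy rows are hypothetical and serve only to illustrate the grouping.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Maps a period name to a function extracting that component from a timestamp.
# Weekdays are integers, with Monday = 0 and Sunday = 6.
PERIOD_EXTRACTORS = {
    "hour": lambda ts: ts.hour,
    "weekday": lambda ts: ts.weekday(),
    "month": lambda ts: ts.month,
}


def mean_by_period(posts, metric, period):
    """Average `metric` over posts grouped by hour, weekday, or month."""
    extract = PERIOD_EXTRACTORS[period]
    buckets = defaultdict(list)
    for post in posts:
        buckets[extract(post["published"])].append(post[metric])
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Toy rows invented for illustration only; they are not real observations.
toy_posts = [
    {"published": datetime(2018, 4, 21, 6, 0), "num_shares": 10},
    {"published": datetime(2018, 4, 22, 6, 30), "num_shares": 30},
    {"published": datetime(2018, 5, 1, 20, 15), "num_shares": 5},
]

by_hour = mean_by_period(toy_posts, "num_shares", "hour")
by_month = mean_by_period(toy_posts, "num_shares", "month")
print(by_hour, by_month)
```

Because posting activity is itself uneven across hours and months, a study along these lines would probably also want to report the number of posts per bucket alongside each average.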

Experimental design, materials, and methods
Facebook pages were selected based on their number of followers and activity, using the Facebook Live Map tools. Data were collected through a Python script that queries the Facebook API using the URL pattern given at the end of this paragraph, in which "Pagename" is the name of the seller's page, "StartDate" and "EndDate" are, respectively, the date of the first ever post by the page and the current date, and "Token" is our access token to the Facebook API.
https://graph.facebook.com/v2.9/"Pagename"/posts/?fields=message,link,permalink_url,created_time,type,name,id,comments.limit(0).summary(true),shares,likes.limit(0).summary(true),reactions.limit(0).summary(true)&since="StartDate"&until="EndDate"&limit=100&access_token="Token"
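The URL pattern above could be assembled programmatically along the following lines. This is a sketch under stated assumptions, not the authors' script: the function name `build_posts_url`, the page name, the ISO date strings and the placeholder token are all hypothetical, the field names are written with the underscores used by the Graph API (`permalink_url`, `created_time`, `access_token`), and the `v2.9` version string is simply copied from the text. The snippet only builds the URL and makes no network request.

```python
from urllib.parse import urlencode

# Version string as given in the text; newer API versions may be required today.
GRAPH_ROOT = "https://graph.facebook.com/v2.9"

# Fields requested for each post; the summary(true) counters return totals only.
FIELDS = ",".join([
    "message", "link", "permalink_url", "created_time", "type", "name", "id",
    "comments.limit(0).summary(true)",
    "shares",
    "likes.limit(0).summary(true)",
    "reactions.limit(0).summary(true)",
])


def build_posts_url(page_name, start_date, end_date, token, limit=100):
    """Build the posts query URL for one page (hypothetical helper).

    `since` carries the earlier date and `until` the later one.
    """
    query = urlencode(
        {
            "fields": FIELDS,
            "since": start_date,
            "until": end_date,
            "limit": limit,
            "access_token": token,
        },
        safe=",()",  # keep the field list readable instead of percent-encoding it
    )
    return f"{GRAPH_ROOT}/{page_name}/posts/?{query}"


# All argument values below are placeholders, not real credentials or pages.
url = build_posts_url("ExamplePage", "2012-03-01", "2018-06-30", "PLACEHOLDER_TOKEN")
print(url)
```

Since each request is capped by `limit`, a full collection script would also need to follow the pagination links returned with each response until no further page is available.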