Selection and validation of emotional videos: Dataset of professional and amateur videos that elicit basic emotions



Abstract
This article describes the process of selecting a collection of professional and amateur videos that elicit five basic emotions (i.e., happiness, fear, disgust, anger, and sadness) and validating these videos in three groups of participants (i.e., Chinese from China, Chinese from Malaysia, and Bumiputera from Malaysia). In the video selection phase, professional videos, which were Western movie trailers, were selected from IMDb (Internet Movie Database), and amateur videos were selected from YouTube. The researchers selected videos that display the five basic emotions, identified the time frames with the strongest display of emotion, and rated the emotional intensity of each video on a 5-point Likert scale. After the initial stage of selection, two other researchers performed an emotion recognition task by watching the videos without audio to ensure that the emotions could be elicited without understanding the language. These data were used to refine the final selection of 20 professional videos and 20 amateur videos. In the video validation phase, 30 participants were asked to identify the emotion felt and rate its intensity. This article includes a description of the video selection method, a detailed list of the videos selected, and participants' responses and ratings of emotional intensity for the 40 videos.

Specifications Table

Subject: Experimental and Cognitive Psychology
Specific subject area: Selection and validation of emotional videos
Type of data: Table
How data were acquired: Professional videos were movie trailers from IMDb, an online database of information related to films; amateur videos were from YouTube, an online video-sharing platform. Video validation data were collected online using PsychoPy 3.0 and Pavlovia.

Value of the Data
• This dataset provides researchers with a collection of validated videos (i.e., professionally filmed movie trailers and amateur YouTube videos) and offers insight into whether cultural differences (e.g., between China and Malaysia) affect emotion recognition.
• This dataset will benefit a range of researchers who investigate emotional or social skills, including cognitive, experimental, and social psychologists.
• The validated videos can be used in emotion-related experiments to examine the factors that affect emotion perception (e.g., facial expressions or body movement, methods of filming, audio or visual information).
• Based on the findings of future research, training programs or apps can be developed to teach and improve emotional and social skills in different settings and populations (e.g., in early education or with individuals with autism).

Data Description
The professional videos were selected from IMDb by identifying scenes that displayed the common emotional triggers defined by the Paul Ekman Group [8-12]. The selection method and the inclusion and exclusion criteria for professional videos are detailed in Table 1. Table 2 provides a detailed list of the 40 professional and amateur videos presented in the video validation phase, including the movie names, video links and time frames, the common emotional triggers displayed in the videos [8-12], brief descriptions of the video content, the length of the videos, and the researchers' ratings of emotional intensity.
In the video validation phase, three groups of participants watched the videos, identified the emotion felt, and rated the emotional intensity of each video. These data are reported in Table 3; the raw data are available in the Mendeley dataset.

Experimental Design, Materials and Methods
Emotion recognition is an important social skill that allows individuals to understand others' mental states and to respond appropriately in different social situations [1-3]. Research has utilized different types of stimuli to investigate emotion perception, including computer-generated emoticons, photos of posed facial expressions, and videos that more closely represent social interactions [4-7]. To create a dataset of professional and amateur videos that can be used in emotion perception research, video selection and video validation phases were conducted to identify emotional videos that elicit five basic emotions (i.e., happiness, fear, disgust, anger, and sadness). In the video selection phase, researchers selected emotional videos that displayed the common emotional triggers defined by the Paul Ekman Group [8-12], identified the time frames with the strongest display of emotion, and rated the emotional intensity of each video on a 5-point Likert scale.
For professional videos, Western movie trailers were selected from IMDb, an online database of information related to films, by identifying scenes that displayed the common emotional triggers [8-12]. All videos were chosen from IMDb's Best Picture-winning or Top Box Office lists. Additionally, inclusion and exclusion criteria were set to ensure that the filming style and genre of all videos were similar (see the selection method and the inclusion and exclusion criteria in Table 1). For amateur videos, videos were selected from YouTube, an online video-sharing platform, by searching for keywords related to the common emotional triggers [8-14]. The selected videos were trimmed to less than 60 seconds using the Quick Editing software, and Format Factory, a multimedia conversion software, was used to convert the videos to MP4. The resolution of the videos is 720 × 480 and the frame rate is 29.97 fps.
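The trimming and conversion steps above were performed with GUI tools (Quick Editing and Format Factory). As a point of reference, an equivalent command-line step could be scripted, assuming ffmpeg is available; the function name and file names below are illustrative, not part of the authors' pipeline:

```python
def build_ffmpeg_cmd(src, start, end, dst):
    """Build an ffmpeg command that trims src to the [start, end] window
    (seconds) and re-encodes it as a 720 x 480 MP4 at 29.97 fps, matching
    the specifications reported in the article."""
    if end - start >= 60:
        raise ValueError("clips must be shorter than 60 seconds")
    return [
        "ffmpeg",
        "-ss", str(start),       # trim: start time
        "-to", str(end),         # trim: end time
        "-i", src,
        "-vf", "scale=720:480",  # target resolution
        "-r", "29.97",           # target frame rate
        dst,                     # output container inferred from .mp4
    ]

# Example: a 43-second clip cut from a raw trailer file
cmd = build_ffmpeg_cmd("trailer_raw.mov", 12.0, 55.0, "happy_trailer_01.mp4")
```

The command list can be passed directly to `subprocess.run(cmd)` without shell quoting concerns.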
After the initial stage of video selection, two researchers performed the emotion recognition task, which was built in the PsychoPy 3.0 software and uploaded to Pavlovia. The videos were played without audio to ensure that the emotions could be elicited without understanding the language. Upon completion, these data were used to refine the final selection of 20 professional videos and 20 amateur videos (see Table 2).
In the video validation phase, 30 undergraduate and postgraduate students (14 males and 16 females) enrolled in Universiti Malaysia Sabah were recruited: 10 were Chinese from China (6 males and 4 females, mean age 21.1 years, range 19-23, SD = 1.197), 10 were Chinese from Malaysia (5 males and 5 females, mean age 20.5 years, range 19-26, SD = 2.461), and 10 were Bumiputera from Malaysia (3 males and 7 females, mean age 20.9 years, range 18-23, SD = 1.792). All participants had normal or corrected-to-normal vision and were recruited through convenience and snowball sampling. All participants gave informed consent prior to their participation. The video validation phase was carried out using the PsychoPy 3.0 software, which was uploaded to Pavlovia. All participants accessed the videos via a link provided by the researcher and performed the emotion recognition task on their own laptops. The 40 videos were presented in random order, and each video was followed by three questions (i.e., What emotion did you feel after watching the video? How strongly did you feel the emotion? Have you ever watched this video before?). Participants who had watched more than 60% of the videos prior to the video validation phase were excluded from the data, as they may have anticipated the content of the videos [14].
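The task logic described above (a random presentation order per participant, three follow-up questions per video, and the 60% prior-exposure exclusion rule) can be sketched in plain Python. This is an illustrative sketch rather than the authors' PsychoPy code, and all names are hypothetical:

```python
import random

# The three questions that followed each video in the validation task.
QUESTIONS = [
    "What emotion did you feel after watching the video?",
    "How strongly did you feel the emotion?",    # 5-point Likert scale
    "Have you ever watched this video before?",  # yes / no
]

def presentation_order(video_ids, seed=None):
    """Return the videos in a random order, drawn anew for each participant."""
    order = list(video_ids)
    random.Random(seed).shuffle(order)
    return order

def exclude_participant(seen_flags, threshold=0.6):
    """True if the participant reported having already watched more than
    60% of the videos and should therefore be excluded from the data."""
    return sum(seen_flags) / len(seen_flags) > threshold
```

For example, a participant who had previously seen 25 of the 40 videos (62.5%) would be excluded, while one who had seen 24 (exactly 60%) would be retained.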
Upon completion, the count and percentage of each emotion detected and the ratings of emotional intensity for each video were recorded on Pavlovia and then transferred to an XLSX worksheet (.xlsx) to calculate the total count and percentage of detected emotions and the mean emotional intensity (see Table 3). In Table 3, the 20 videos indicated with asterisks can be used for future emotion-related research: in each category, the two videos with the highest recognition accuracy were selected, and the recognition accuracy of all selected videos was higher than 60%. If the recognition accuracy was the same for multiple videos, the video with the least confusion with another emotion and/or the highest mean emotional intensity was selected.
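The tabulation and selection rules above can be sketched as follows. This is an illustrative reconstruction, not the authors' worksheet formulas; the function names and example figures are hypothetical:

```python
from collections import Counter

def video_stats(target, detected, intensities):
    """Summarize one video: recognition accuracy for the intended (target)
    emotion, the largest share of responses naming any other emotion
    (confusion), and the mean rated intensity."""
    counts = Counter(detected)
    n = len(detected)
    return {
        "accuracy": counts[target] / n,
        "confusion": max((c / n for e, c in counts.items() if e != target),
                         default=0.0),
        "mean_intensity": sum(intensities) / len(intensities),
    }

def select_top_two(stats_by_video):
    """Rank the videos in one emotion category by highest accuracy,
    breaking ties by least confusion and then by highest mean intensity;
    keep the top two, as in Table 3."""
    ranked = sorted(stats_by_video.items(),
                    key=lambda kv: (-kv[1]["accuracy"],
                                    kv[1]["confusion"],
                                    -kv[1]["mean_intensity"]))
    return [name for name, _ in ranked[:2]]
```

For instance, a video intended to elicit happiness that 8 of 10 participants labeled "happy" and 2 labeled "sad" would score an accuracy of 0.8 with a confusion of 0.2.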

Ethics Statement
Ethical approval was granted by the ethics committee of the Faculty of Medicine and Health Sciences, Universiti Malaysia Sabah [approval code: JKEtika 1/19 (20)]. Informed consent was obtained from all participants.

Declaration of Competing Interest
The authors declare that there is no conflict of interest.