Quality assessment of videos on social media platforms related to gestational diabetes mellitus in China: A cross-section study

Purpose This study aimed to systematically evaluate the quality of content and information in videos related to gestational diabetes mellitus on Chinese social media platforms. Methods Videos on three platforms, TikTok, Bilibili, and Weibo, were searched with the keyword “gestational diabetes mellitus” in Chinese, and the first 50 videos in the comprehensive ranking on each platform were included for subsequent analysis. Video characteristics were collected, including duration, number of days online, and numbers of likes, comments, and shares. DISCERN, JAMA (The Journal of the American Medical Association) Benchmark Criteria, and GQS (Global Quality Scores) were used to assess the quality of all videos. Finally, correlation analysis was performed among video features, video sources, DISCERN scores, and JAMA scores. Results Ultimately, 135 videos were included in this study. The mean DISCERN total score was 31.84 ± 7.85, the mean JAMA score was 2.33 ± 0.72, and the mean GQS was 2.00 ± 0.40. Most of the videos (52.6%) were uploaded by independent medical professionals, and videos uploaded by professionals had the shortest duration and time online (P < 0.001). The source of the video was associated with numbers of “likes”, “comments”, and “shares” for JAMA scores (P < 0.001), but there was no correlation with DISCERN scores. Generally, videos on TikTok with the shortest duration received the highest numbers of “likes”, “comments”, and “shares”, but the overall quality of videos on Weibo was higher. Conclusion Although the majority of the videos were uploaded by independent medical professionals, the overall quality appeared to be poor. Therefore, more efforts and actions should be taken to improve the quality of videos related to gestational diabetes mellitus.

globally, and it increases the risk of various short-term and long-term issues for both maternal and fetal health, including obesity, impaired glucose metabolism, and cardiovascular disease [1][2][3][4]. Prenatal management of pregnant women with GDM, involving a combination of dietary control, regular exercise, blood glucose monitoring, and medication, is typically concentrated within a relatively short timeframe during pregnancy and is primarily overseen by healthcare professionals in the healthcare system. However, the time and place restrictions often hinder patient compliance with GDM management, thus placing an additional burden on the healthcare system [5,6]. Thus, it is essential to recognize that the management of diabetes during pregnancy should not solely rely on the healthcare system [7,8]. Improving prenatal self-education and awareness among pregnant women with GDM is crucial. The ever-increasing development of internet technologies offers patients more opportunities to access health information and engage in health communication [9,10]. It is common for pregnant women, particularly younger women, to turn to the internet, social media platforms, and smartphone apps for information during their pregnancies [11,12]. This trend is particularly prevalent among women aged 19 to 35 in China [13]. Web searches cover a wide range of pregnancy-related topics, including fetal development, maternal complications, prenatal care, medication safety, nutrition, and mental health during pregnancy [14]. Searches for health information related to the prenatal, perinatal, and postnatal periods are on the rise, and more information is becoming available online [15,16]. Despite the wealth of digital resources on social media platforms, the absence of regulations or guidance regarding video content can result in misleading and incomplete information [17]. Such inaccuracies and gaps in information can add to the burden of self-education and self-management for the general public's health. Hence, there is a need for quality standards and guidance on social media platforms. To the best of our knowledge, there is limited prior literature that focuses on GDM-related video content in China or the United States. For instance, Kong W et al. reviewed 199 TikTok videos using the coding schema proposed by Goobie et al. and DISCERN criteria. They found that the overall quality of diabetes-related information on TikTok is acceptable, but it might not fully meet the informational needs of patients with diabetes [18]. Similarly, Birch EM evaluated 115 unique YouTube videos on GDM in April 2020 using DISCERN and GQS. While some high-quality videos were found, the overall reliability, accuracy, and comprehensiveness were low, and higher quality did not correlate with increased viewer interaction [19]. These studies have produced inconsistent conclusions regarding video quality, and there is still a lack of knowledge regarding the content and quality of information about GDM on social media platforms in China.
In China, a multitude of social media platforms are available, but TikTok, Bilibili, and Weibo have emerged as the most popular and essential platforms for Chinese people to access and share health-related information [20][21][22]. These platforms enable individuals to conveniently access real-time information through computers, smartphones, and tablets [23,24]. While there is a wealth of health-related videos on these platforms, there remains a notable absence of quantitative assessments of the quality of content pertaining to health topics such as GDM, miscarriage, and preterm delivery. This study aimed to systematically assess the quality of content and information within GDM-related videos on these social platforms to provide evidence-based advice and recommendations to enhance the quality of information available.

Search strategy and data collection
The Chinese keyword for "gestational diabetes mellitus" was used to carry out a comprehensive search for popular science videos related to GDM on three Chinese social media platforms: Bilibili, TikTok, and Weibo. Our search encompassed all relevant content available up to October 31, 2022. To ensure impartiality in our search results and quality assessments, we employed newly created accounts on these platforms with no prior search history. The top 50 videos from each platform were selected based on the platform's comprehensive ranking, resulting in a total of 150 videos downloaded for our study. This sample size was determined to be adequate based on prior research [24,25]. Upon closer evaluation, we included videos directly related to GDM and excluded non-Mandarin videos, commercial content, videos lacking audio, duplicates, reposted videos, and videos containing disinformation. A detailed flowchart outlining the inclusion and exclusion criteria is shown in Fig. 1.
Information on video characteristics was gathered, including video duration, the number of days the video had been online, the numbers of likes, comments, and shares, video type, and the region of the author. Additionally, the average daily numbers of likes, comments, and shares were automatically calculated. Based on prior research pertaining to the authors, we categorized video sources into six distinct types: independent medical professionals, patients or their guardians, medical institutions, news/media outlets, health educators, and independent non-medical users (non-medical professionals). Furthermore, video content was divided into four main categories: general information, education, daily sharing, and case discussion.
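The average daily engagement described above can be sketched as a simple ratio of a cumulative count to the number of days online. This is a minimal illustration only: the function name `average_daily` and the rule of counting a same-day video as one day are assumptions, not details reported in the study.

```python
def average_daily(count: int, days_online: int) -> float:
    """Return a cumulative engagement count divided by days online.

    Assumption (not from the study): a video uploaded on the day of data
    collection is treated as having been online for one day, which avoids
    division by zero.
    """
    days = max(days_online, 1)
    return count / days


# Hypothetical video: 1200 likes accumulated over 60 days online.
likes_per_day = average_daily(1200, 60)
print(f"average daily likes: {likes_per_day:.1f}")
```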

Assessment of quality
DISCERN, JAMA Benchmark Criteria, and GQS were used to conduct a quality assessment of the videos [26][27][28]. These tools have been widely employed in evaluating the quality of health-related videos on platforms such as YouTube, Facebook, and TikTok [29,30]. Additionally, for specialized medical aspects related to GDM, we employed the IADPSG/WHO 2010 criteria [31].
DISCERN is a comprehensive tool for assessing medical information and is among the most commonly used questionnaires. It comprises 16 questions, each scored on a scale ranging from 1 (poor) to 5 (good) points (Supplementary Table 1). The 16 questions are divided into three sections. The first section, related to reliability, includes questions 1 to 8, addressing clarity of goals, relevance, and balance. The second section, questions 9 to 15, pertains to treatment, assessing how well the video describes how each treatment works and illustrates its benefits, risks, and impact on quality of life. The final question, question 16, forms the third section, in which users rate the overall quality of the publication as a source of information about treatment choices based on their responses to the previous questions. All videos were divided into five levels based on their total DISCERN scores: very poor (<27), poor (27-38), fair (39-50), good (51-62), and excellent (63-75) [32].
The JAMA benchmark evaluates online information quality using four distinct criteria: authorship, attribution, disclosure, and currency [27]. Authorship requires that videos include details about authors, contributors, and contact information, while attribution requires that references and sources be listed properly. Disclosure requires that conflicts of interest, financing, sponsorship, advertising, support, and video ownership be disclosed, and currency requires that the dates on which the video was published and updated be indicated. For each criterion, videos were rated as 0 if the criterion was not met and 1 if it was met. The total score for these four criteria ranged from 0 to 4 (Supplementary Table 2).
The GQS assesses educational value using 5 criteria (Supplementary Table 3). GQS scores range from 1 to 5, with a maximum score of 5 indicating high quality.
Two obstetricians, X L and J T, independently evaluated all videos. Any differences or disputes between the two reviewers were resolved through discussion with the third author, Q C.
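The scoring rules above can be expressed as a short sketch. The cut-offs and the four JAMA criteria are taken from the text; the function names `classify_discern` and `jama_total` are hypothetical, and treating any total of 63 or more as "excellent" is an assumption made for simplicity (the text states the band as 63-75).

```python
def classify_discern(total: int) -> str:
    """Map a DISCERN total score to the five quality levels used in the text."""
    if total < 27:
        return "very poor"
    if total <= 38:
        return "poor"
    if total <= 50:
        return "fair"
    if total <= 62:
        return "good"
    # Assumption: every score of 63 or above is labeled "excellent".
    return "excellent"


def jama_total(authorship: bool, attribution: bool,
               disclosure: bool, currency: bool) -> int:
    """Sum the four JAMA benchmark criteria, each rated 0 (not met) or 1 (met)."""
    return sum(int(bool(c)) for c in (authorship, attribution, disclosure, currency))


# Hypothetical video: DISCERN total of 31, with authorship, disclosure,
# and currency met but no attribution.
print(classify_discern(31))
print(jama_total(True, False, True, True))
```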

Statistical analyses
In the subsequent statistical analysis, descriptive analysis of the data was conducted to provide an overview of both categorical and continuous variables. None of the continuous variables followed a normal distribution. Nevertheless, we presented the continuous variables as both mean ± standard deviation (SD) and median with the interquartile range (IQR) of 25-75%, which offered a more detailed representation of the data. Additionally, categorical variables, such as the number of videos and video sources, were described in terms of frequencies and percentages. The nonparametric Kruskal-Wallis H test with the Bonferroni adjustment was used to analyze the differences between groups, including video sources, DISCERN classification, and social media platforms. Correlations between the JAMA score, DISCERN score, and video features were calculated with Spearman's correlation coefficient. A P-value <0.05 was considered significant. There were no missing values in the variables used in the statistical analysis, so no imputation of missing data was required. All analyses were conducted using Statistical Package for the Social Sciences 26.0 (SPSS 26.0, IBM Corporation, Chicago, IL, USA) software.
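As an illustration of the group comparison described above, the following is a minimal pure-Python sketch of the Kruskal-Wallis H statistic (with the standard tie correction) and a Bonferroni adjustment. It is not the SPSS procedure used in the study; the function names and the sample data are hypothetical, and converting H to a P-value would additionally require the chi-square distribution with (number of groups − 1) degrees of freedom.

```python
from itertools import chain


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H for a list of non-empty groups."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


def bonferroni(p_values):
    """Multiply each P-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical scores for three platforms (not data from the study).
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"H = {h_stat:.2f}")
print(bonferroni([0.01, 0.04, 0.5]))
```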

Ethics Statement
This study concentrated on assessing the quality of videos contributed to and viewed by the public on social media platforms. It is important to note that no clinical data, human specimens, or laboratory animals were utilized in this research. All information used in this study was extracted from publicly available videos on Bilibili, TikTok, and Weibo, and no personal privacy data was involved in the process. Additionally, the study abstained from engaging in interactions with users, thereby eliminating the need for ethics committee approval.

Video characteristics and quality assessment with DISCERN scores, JAMA, and GQS
After screening the 150 videos, 10 duplicate and 5 non-Mandarin videos were excluded, resulting in 135 videos being included (Fig. 1). The DISCERN total score averaged 31.84 ± 7.85 (median (IQR): 30 (27, 34)), while the mean JAMA score was 2.33 ± 0.72 (median (IQR): 2 (2, 3); Table 1). However, GQS proved unable to provide a discriminating or accurate assessment of video quality. The majority of the videos (135 out of 145, or 93%) received a rating of 2, with the remainder being rated as 1 (3 out of 145, or 2%), 3 (4 out of 145, or 3%), or 4 (3 out of 145, or 2%). Consequently, GQS was excluded from the subsequent video quality and correlation analysis. The analysis of the regional distribution of authors indicated that the highest number of videos were uploaded from Henan, followed by Beijing and Guangdong (Fig. 2).
Further analysis showed that videos uploaded by independent medical professionals had significant correlations with the DISCERN reliability scores (P = 0.006) and JAMA scores. Data revealed that videos uploaded by medical institutes (median (IQR): 20 (16.5, 25.5)), health educators (median (IQR): 19 (18, 21)), and news/media (median (IQR): 19 (16, 20)) had higher DISCERN reliability scores than those by independent medical professionals (median (IQR): 19 (17, 21)). Similarly, the JAMA score for both medical institutes (median (IQR): 3 (3, 3)) and health educators (median (IQR): 3 (2, 3)) was higher than that for independent medical professionals (median (IQR): 2 (2, 3)). However, there were no significant differences in DISCERN treatment scores, quality scores, or total scores among the various video sources (Fig. 3). Unsupervised hierarchical clustering revealed that very few videos achieved high scores, especially in questions related to risk, benefits, treatment mechanisms, source, information date, and reference to areas of uncertainty. Detailed results of the JAMA score after unsupervised hierarchical clustering, grouped by video source, are shown in Fig. 4. The criterion of currency was met in the vast majority of videos (98%). However, only a few videos satisfied the criterion of attribution (8%). It is noteworthy that videos uploaded by patients or guardians often lacked authorship, and over 70% of videos by independent medical professionals failed to disclose potential conflicts of interest.

Distribution of DISCERN classification
The data concerning the distribution of DISCERN classifications revealed that 19.3% of the videos fell into the "very poor" category, while 67.4% were categorized as "poor." Additionally, 8.9% were classified as "fair," 3.7% were rated as "good," and only 0.7% achieved the distinction of "excellent" (as indicated in Table 3). Furthermore, it is noteworthy that the duration of videos in the "good" and "excellent" categories was longer than those in the other classifications (P < 0.001). However, there were no significant differences between the groups in terms of any other video characteristics.
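The reported percentages can be cross-checked with a little arithmetic. The sketch below back-calculates approximate video counts from the percentages and the total of 135 included videos; these counts are an inference from rounding, not figures reported in the study, and the variable names are hypothetical. The pooled shares correspond to the 86.7% and 4.4% quoted later in the discussion.

```python
N_VIDEOS = 135  # total number of included videos, from the text

# Reported share of videos in each DISCERN category, in percent.
shares = {
    "very poor": 19.3,
    "poor": 67.4,
    "fair": 8.9,
    "good": 3.7,
    "excellent": 0.7,
}

# Back-calculated counts (an inference, not data reported in the study).
counts = {level: round(pct / 100 * N_VIDEOS) for level, pct in shares.items()}

# Pooled shares for the lowest two and highest two categories.
low_quality_pct = round(shares["very poor"] + shares["poor"], 1)
high_quality_pct = round(shares["good"] + shares["excellent"], 1)

print(counts)
print(sum(counts.values()), low_quality_pct, high_quality_pct)
```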

Video characteristics on different social media platforms
Ultimately, this study included 45 videos each from TikTok, Bilibili, and Weibo, allowing us to conduct a more in-depth analysis. It was observed that videos uploaded on Bilibili had the longest duration (median (IQR): 199 (94, 407)) and were online for the greatest number of days (median (IQR): 525 (122.5, 934.5)). In contrast, TikTok videos had the shortest duration of being online (median (IQR): 65 (48.5, 91.5)). Interestingly, videos on Bilibili seemed to garner more "likes" (median (IQR): 1658 (1207, 6185)), "comments" (median (IQR): 823 (316.5, 1537.5)), and "shares" (median (IQR): 664 (277.5, 1405)) compared to TikTok and Weibo (P < 0.001; Table 4). Furthermore, Bilibili videos obtained a higher DISCERN reliability score compared to Weibo and TikTok (Fig. 5A), while there were no differences in DISCERN treatment and quality (Fig. 5B and C). The median DISCERN total score for videos on Bilibili was also higher than that of Weibo and TikTok (Fig. 5D). The median DISCERN treatment score and quality exhibited no significant differences between the three social media platforms. For a more detailed breakdown of DISCERN classification and JAMA scores, please refer to Fig. 5E and F.

Correlation of factors influencing DISCERN score and JAMA score
To minimize the variability in our assessment results, we employed both the DISCERN and JAMA instruments for evaluating video quality. Spearman's test uncovered correlations between factors influencing DISCERN scores and JAMA scores. The analysis revealed a significant positive correlation between the DISCERN total score and the video duration (r = 0.417, P < 0.001). Furthermore, the JAMA score also exhibited a significant positive correlation with the video duration (r = 0.223, P = 0.009). Notably, the DISCERN total score was significantly correlated with the JAMA score (r = 0.558, P < 0.001; Table 5).
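For readers unfamiliar with Spearman's coefficient, the sketch below computes it as the Pearson correlation of average ranks, which handles ties. It is a minimal illustration rather than the SPSS routine used in the study; the function names and the example data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Return the Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Return Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical durations (seconds) and DISCERN totals for five videos.
durations = [60, 95, 130, 240, 400]
discern_totals = [25, 31, 29, 40, 36]
print(f"rho = {spearman_rho(durations, discern_totals):.2f}")
```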

Table 4
Video characteristics of included videos in different social media platforms.

Motivation and implications of this study
With the development of the internet and the surging popularity of social media, these platforms have become a major source of entertainment and information for people, garnering substantial attention. As vast repositories of information, various social media platforms have emerged as pivotal channels for disseminating health-related information and providing the general population with access to medical and health knowledge. Notably, platforms like TikTok, Bilibili, and Weibo offer a more direct and convenient means of access to such information for the people of China. However, prior research has highlighted a concerning trend: approximately two-thirds of medical videos are deemed unsatisfactory, with one-third containing inaccuracies [18,33]. This seriously affects the correctness and effectiveness of health information dissemination to the public. The motivation behind this study stems from personal experience: the author was confronted by questions from both relatives and patients who pointed out inconsistencies between information they found on platforms like Bilibili and TikTok and the information the author provided. These interactions prompted the author's reflection and served as a source of inspiration. The information gleaned from videos on social media platforms can be likened to a double-edged sword, posing both opportunities and concerns.
Previous studies have shown that information targeted at pregnant women on social platforms is readily accessible and garners significant views, but it is often characterized by low quality and reliability [34]. GDM, as one of the most prevalent pregnancy complications, is associated with adverse pregnancy outcomes [35,36]. High-quality care and management are therefore crucial for women with GDM [37]. Consequently, it becomes essential to conduct a quantitative assessment of internet videos in China specifically pertaining to GDM, a task that has not been previously undertaken.

Major findings
Upon conducting a thorough assessment using DISCERN and JAMA, it became evident that the quality of all types of videos was consistently low. The mean scores of DISCERN and JAMA were 31.84/75 and 2.33/4, respectively. In terms of DISCERN classification, 86.7% of the videos fell into the "poor" or "very poor" quality categories, with merely 4.4% being categorized as "good" or "excellent." While most videos had clear objectives and successfully achieved their goals, over two-thirds of the videos lacked critical information such as video source attribution, current upload dates, and detailed treatment information. Furthermore, more than half of the videos failed to provide comprehensive treatment descriptions, including explanations of how treatments work and their associated benefits. These findings underscore the low quality and lack of reliability in these videos. JAMA scores showed that more than 90% of videos met the currency criterion, but only four videos provided attribution. Concern about video quality on social platforms was first raised when Keelan et al. found that 38% of analyzed YouTube videos opposed vaccination, yet their average star ratings and views were higher than those of videos supporting vaccination [38]. This observation has led to a growing body of research focused on the subpar quality of videos on social media platforms [19].
The majority of videos were uploaded by authors from Henan, Beijing, and Guangdong, located in central, northern, and southern China, respectively. Whether the number of videos is associated with regional distribution or regional development remains an open question that warrants further discussion. Information on video sources showed that more than half of the videos were uploaded by independent medical professionals, but their overall quality was poor. Although the videos uploaded by independent medical professionals were shorter, they were likely to receive more public attention, such as likes, comments, and shares, which was consistent with the results of previous similar studies [39,40]. This may be because most professionals preferred to divide videos into multiple sections and upload them as a series, or to focus on a single related topic. Moreover, such results are a cause for concern: the quality of health information provided by experts does not meet the expectations of the public and needs to be improved.
Many viewers tend to initially select videos with higher popularity, anticipating reliable and pertinent information from specialists. However, Xue et al. revealed a disconcerting fact: among their sample of 61 analyzed videos, over 33% contained unequivocal misinformation [23]. To make matters worse, there was a positive correlation between the presence of misinformation and the number of views. Previous studies pointed out that popularity did not necessarily equate to higher quality, and higher-quality videos were not consistently among the most popular ones [24]. Consequently, the number of likes, comments, and shares a video received from the public did not consistently correlate with its quality or source. It was observed that videos with longer durations and more detailed information tended to receive a higher DISCERN classification. In contrast, TikTok videos had the shortest duration but were the most popular and had a lower DISCERN classification, which suggested that the numbers of likes, comments, and shares were not related to the DISCERN classification. It highlights the fact that the public possesses a certain degree of discernment and does not always unquestioningly embrace high-quality health information on social media platforms. These findings indicate the importance of the public's ability to identify reliable information and of health self-education. Finally, Spearman's tests showed that the DISCERN scores had a significant correlation with JAMA scores, emphasizing the pivotal role and scientific validity of the DISCERN and JAMA instruments in the evaluation of these videos. Furthermore, this suggests that the authors applied the DISCERN and JAMA instruments correctly when evaluating the videos.

Expectations
GDM is among the obstetrical complications that have the most adverse impact, particularly when coupled with other complications [41,42]. Social media platforms are increasingly playing a critical role in public health engagement. However, the quality of videos on these platforms often falls short [43]. Therefore, it is essential for video uploaders, especially medical professionals, to offer a more comprehensive and accurate portrayal of information while avoiding any misleading or incomplete content. Based on the findings of our study, we recommend that video uploaders proactively include vital details such as the video's source, the date of its upload, and comprehensive treatment information. Additionally, social media platforms themselves bear a regulatory responsibility for ensuring the quality of videos available to the public. It is incumbent upon us to provide constructive suggestions for the use of social media platforms and encourage government involvement in the regulation of health-related information on these platforms, to a reasonable extent.

Limitations
Although our study is the first to investigate the content and quality of GDM-related videos on social media platforms in China, it does come with a few limitations. Firstly, we only considered videos from China, potentially overlooking valuable content in other languages. Although our primary focus was on Chinese videos, it is worth acknowledging that videos in other languages may offer more high-quality information and a more comprehensive perspective. Secondly, despite using new accounts for our searches, it is important to note that search algorithms can yield different results based on factors like location, user behavior, and other unknown variables in a dynamic system. Finally, this study focused on the most frequently used and most popular social media platforms, but the contribution of other platforms to public health information remains to be investigated.

Conclusion
In conclusion, this study evaluated the information quality of 135 GDM-related videos on Chinese social media platforms. The findings revealed that the majority of these videos were uploaded by independent medical professionals, but regrettably, their overall quality was lacking. Of particular concern is the fact that these videos often offered incomplete and potentially misleading information, particularly in the context of treatment. It is imperative to strengthen collaboration between professionals and social media platforms and to improve the quality of GDM-related videos. The public should exercise caution when seeking GDM-related information on these platforms and actively pursue self-education.

Fig. 1. The flow chart of this study.

Fig. 2. The distribution of video authors in China.

Fig. 3. DISCERN score of the videos. Unsupervised hierarchical clustering was conducted based on DISCERN score items (rows) and individual videos (columns, n = 135). The categorical item scoring ranges from 1 (not addressed/fulfilled) to 5 (fully addressed/fulfilled), and the video group is indicated in the top row of the heatmap.

Fig. 4. JAMA score for videos. Unsupervised hierarchical clustering was performed based on JAMA score items (rows) and single videos (columns, n = 135). The categorical item scoring is either 1 (addressed/fulfilled) or 0 (not addressed/fulfilled), and the video group is represented in the top row of the heatmap.

Fig. 5. Video characteristics on different social media platforms. (A-D) DISCERN score comparison of videos on different social media platforms. (E) Distributions of DISCERN classification. (F) Distributions of JAMA scores.

Table 1
Video characteristics of included videos.

Table 2
Video characteristics and quality assessment according to video sources.

Table 3
Distribution of DISCERN classification according to video characteristics and source.