Robotic colorectal surgery: quality assessment of patient information available on the internet using webscraping

Abstract The primary goal of this study is to assess the patient information currently available on the internet concerning robotic colorectal surgery. Acquiring this information will aid patients' understanding of robotic colorectal surgery. Data were acquired through a web-scraping algorithm built on two Python packages, Beautiful Soup and Selenium. The long-tailed keywords entered into the Google, Bing and Yahoo search engines were ‘Da Vinci Colon-Rectal Surgery’, ‘Colorectal Robotic Surgery’ and ‘Robotic Bowel Surgery’. The search yielded 207 eligible websites, which were sorted and evaluated according to the ensuring quality information for patients (EQIP) score. Of the 207 websites visited, 49 belonged to the subgroup of hospital websites (23.6%), 46 to medical centers (22.2%), 45 to practitioners (21.7%), 42 to health care systems (20.2%), 11 to news services (5.3%), 7 to web portals (3.3%), 5 to industry (2.4%), and 2 to patient groups (0.9%). Only 52 of the 207 websites received a high rating. The quality of available information on the internet concerning robotic colorectal surgery is low. The majority of information was inaccurate. Medical facilities involved in robotic colorectal surgery, robotic bowel surgery and related robotic procedures should develop websites with credible information to guide patient decisions.


Introduction
In the previous twenty years, the internet has become a tool widely used by patients to gather medical information. The influx of patients gathering information online has in turn triggered a rise in the number of websites providing healthcare information. However, the quality of data found on different websites varies: some articles contain high-quality information guided by research and evidence, while others contain low-quality information steered by assumptions or written for advertising purposes. Online health information can transform how patients acquire medical services by challenging the traditional methods that prevailed before the age of the internet. Robotic colorectal surgery involves the utilization of a robotic platform to perform surgical procedures. This approach gained momentum in 2001 after the da Vinci robot made its first appearance in the medical sphere [1]. The introduction of da Vinci robotic-assisted surgery helps to overcome challenges encountered with laparoscopic and even traditional open surgical approaches. However, robotic surgery could increase cost and lead to longer operative times compared to laparoscopic approaches in colorectal surgery, at least in the early learning phase [2]. Nonetheless, robotic utilization in colorectal surgery has continued to increase and has become a standard surgical approach. Robotic technologies allow three-dimensional visualization, increased magnification, improved range of motion of surgical instruments, and tremor suppression, which permits more precise dissections [3,4]. The advantages of this approach, as with laparoscopic methods, are smaller incisions, earlier return of bowel function, lower complication rates, and decreased length of stay [5]. Even though robotic colorectal surgery can slightly increase operative times, it offers the same advantages as traditional laparoscopy and lowers conversion rates to open surgery [6]. With these advantages, robotics has provided additional benefits to users and surgeons treating various pathologies such as rectal cancer [7].
Despite these benefits, a major disadvantage of this technology is the high purchase cost of robotic systems, which makes acquisition unprofitable for hospitals with a low number of cases per year.
Furthermore, robotics provides an enhanced camera platform, articulated instruments and tremor reduction that eliminates many forms of human error [2]. As a result, patients benefit from decreased operative blood loss [8]. Robotic colorectal surgery can reduce infection rates, partially secondary to less tissue manipulation than traditional approaches [9]. Physical surgeon contact only occurs during port insertion, specimen extraction, and closure. The minimal tissue manipulation provided by this method improves the safety of surgical procedures and minimizes risks and complications compared to traditional approaches.
Compared to minimally invasive surgery, the use of surgical robots lowers the rate of conversion to open surgery [10]. It also gives the surgeon a wider range of motion (up to 360 degrees), even in confined spaces.
Hospitals must invest significantly to provide crucial education on robotic platforms to guarantee maximum performance. The advancement of technology is inevitable, which implies that current equipment will be replaced in the future with more advanced approaches, hence incurring more costs. Despite the presence of many articles discussing robotic colorectal surgery, there is minimal information detailing the quality of patient information concerning this topic. The healthcare information available on various websites is crucial in influencing patients' decisions concerning their condition and the treatment they adopt. Nevertheless, website information is often uncontrolled and undependable. Thus, the availability of high-quality information concerning robotic colorectal surgery may trigger an increase in awareness of the disease and the deployment of the most appropriate methods of surgery for enhanced outcomes. In addition, high-quality information advances informed consent while simultaneously encouraging patient-centered care. Most of the articles concentrate on the utilization of the robotic approach during colorectal surgery. The development of robotic colorectal surgery arose primarily from the challenges presented by the traditional laparoscopic approach. Laparoscopy does not offer three-dimensional viewing and has limited instrument movement and articulation, with additional limitations in surgeon-dependent camera maneuverability and tissue retraction [11]. Therefore, the introduction of the robotic system aimed to address these problems. For instance, Antoniou et al. described the system as having three parts: the computer console, the robot tower, and the video card, allowing surgeons to work more flexibly and comfortably [11,12].
A case study spearheaded by Morelli et al. revealed that using robotic platforms in colorectal surgeries reduced surgical complications and lowered conversion rates to open surgery [13].
This shows that surgeons recognize the advantages of robotic systems, such as triangulation in narrow surgical fields. In a similar perspective, Ngu, Tsang, and Koh highlighted the capabilities of the Da Vinci Xi robotic system [14,15]. The system comprises rotating arms and has an improved and simplified docking process. The high acceptance of robotic systems signifies that many hospitals consider using this approach when performing colorectal surgery. Comparing the Xi system to older models like the Da Vinci Si reveals that the former is more efficient and reduces the time of an operation [16]. Also, the Xi robot has been shown to reduce bleeding and postoperative complications [17,18]. As a result, surgeons have adopted this approach to deal with complicated conditions like rectal cancer, where working space is limited.
The preference for minimally invasive procedures by both surgeons and patients has encouraged the development of robotic systems. Among recent developments is the robotic single-site method, which permits surgeons to better handle the technical limitations of laparoscopy [19]. A review by Bae et al. determined that the main advantage of the single-site platform is that it encourages regular triangulation, thereby making it easier for surgeons to perform colorectal surgeries [20]. Therefore, it remains evident that the robotic system provides benefits to both surgeons and patients undergoing colorectal surgery [21]. The availability of this information on the internet implies that patients, particularly those from low-income households, can acquire sufficient and accurate data on the subject. However, individuals must confirm this information with a physician to guarantee that the technology is available in their location.
Failure to do this leads to patient misinformation. Consulting a medical professional will permit a detailed discussion of the information acquired from the internet and allow surgeons to help patients differentiate facts from myths.
The primary goal of this review is to assess patient information on the internet concerning robotic colorectal surgery. This review will collect current information on this concept to understand what data patients currently have access to. Little is known about the quality of information available to patients on the internet regarding robotic colorectal surgery. As the internet is one of the most important and convenient first sources of information for patients facing robotic colorectal surgery, we aim to systematically examine the information provided by online websites and to evaluate the quality and consequences of this information. The search term 'robotic colorectal surgery' returns approximately 3,000,000 results on the Google search engine and 577,000 results on Bing and Yahoo, suggesting significant patient interest and multiple sources of information for this procedure. Because information provided online can significantly change patients' opinions, we aimed to evaluate the quality and impact of this information.

Data evaluation
Data were collected in November 2021 using a web-scraping algorithm with the Python packages Beautiful Soup (Version 4.10) and Selenium (Version 4.0). Institutional Review Board approval was not necessary for the study. Beautiful Soup is a free program library for screen scraping. The software, written in Python, can be used to parse XML and HTML documents [22]. Selenium is a powerful tool for controlling web browsers through programs and performing browser automation [23]. Data collection was conducted using the three most popular search engines: Google, Bing, and Yahoo [24]. The search terms 'Da Vinci Colon-Rectal Surgery', 'Colorectal Robotic Surgery' and 'Robotic Bowel Surgery' were selected to find relevant websites. The program went through the initialized keywords and extracted the Uniform Resource Locators (URLs) found. The searches were done on a desktop computer. The first 150 unique results were retrieved from Google, followed by 150 unique websites each from Yahoo and Bing. The sample size of 150 websites was chosen arbitrarily; we argue that individuals will restrict their search to fewer than 150 hits per search engine. All 450 web pages were scraped in English, and 163 websites that were duplicated between the search engines were excluded. To do so, the Python script compared the three lists of collected websites and removed the duplicates. Then, two English-speaking reviewers checked whether the websites were in English. From the 287 remaining websites, we excluded pages that were irrelevant or used the long-tailed keywords in a different context; 39 websites were excluded on this basis after two investigators manually checked the websites again. Additionally, 41 were excluded as they were academic or scientific pages such as professional sites, scientific articles, or journals, based on the assumption that these are not used by patients for gathering information about a surgical approach [25]. Two investigators reviewed the remaining 207 eligible websites to verify the results. Finally, websites were categorized into the following groups: hospitals, medical centers, practitioners, health care systems, news services, patient groups, web portals (none of the other categories), and industry, which was defined as a group of people who are using robotic approaches in surgery.
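The duplicate-removal step described above can be illustrated with a minimal sketch. The study's actual script is not shown in the text, so the normalization rule (ignoring the scheme, a leading "www." and a trailing slash), the function names and the example URLs below are assumptions made purely for illustration.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Build a comparison key: lowercase host without 'www.', plus path without trailing slash.

    This normalization rule is an assumption; the original script may compare URLs differently.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_unique(*result_lists):
    """Merge result lists in order, keeping the first occurrence of each normalized URL."""
    seen = set()
    unique = []
    duplicates = 0
    for results in result_lists:
        for url in results:
            key = normalize(url)
            if key in seen:
                duplicates += 1
            else:
                seen.add(key)
                unique.append(url)
    return unique, duplicates


# Hypothetical result lists standing in for the three search engines.
google = ["https://www.example-hospital.org/robotic-surgery/", "https://clinic.example.com/colorectal"]
bing = ["http://example-hospital.org/robotic-surgery", "https://news.example.net/da-vinci"]
yahoo = ["https://clinic.example.com/colorectal/", "https://portal.example.org/bowel"]

unique, duplicates = merge_unique(google, bing, yahoo)
print(len(unique), duplicates)
```

Applied to the figures reported in the text, the same bookkeeping gives 450 scraped pages minus 163 duplicates, leaving 287 websites for manual screening.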

Web scraping algorithm
Web scraping describes the process of extracting information automatically from the internet. Web scraping consists of creating a self-running script that sends requests to a web server, queries data, and then analyzes the data to retrieve the desired information [26]. In this process, the web scraping algorithm reads out the first 150 websites and their URLs from each of the three search engines for the defined keywords. Next, the scraper goes through all pages of the respective URLs and extracts data such as type, origin, and year of publication of the websites, using regular expressions to search the text. Finally, the scraped data can be downloaded in a specific format. As mentioned, the web scraping algorithm developed in this work uses the Python packages Beautiful Soup and Selenium. Beautiful Soup was created specifically for web scraping and can parse XML and HTML documents [22]. The Selenium package was used for the interaction between Python and the web browser. The supported browsers are Firefox, Google Chrome, and Internet Explorer.
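The two extraction steps described above (collecting result URLs, then searching page text with regular expressions) can be sketched as follows. To keep the example runnable without third-party packages, it swaps in Python's standard-library `html.parser` in place of Beautiful Soup and works on a hard-coded HTML string instead of a Selenium-driven browser session. The class name, the year pattern and the sample pages are hypothetical and are not taken from the study's script.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect absolute http(s) links from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith("http"):
                self.links.append(href)


# Assumed pattern: a four-digit year shortly after a copyright or update marker.
YEAR_PATTERN = re.compile(r"(?:©|Copyright|Updated)\D{0,20}((?:19|20)\d{2})", re.IGNORECASE)


def extract_links(html, limit=150):
    """Return up to `limit` unique links in the order they appear."""
    parser = LinkCollector()
    parser.feed(html)
    return list(dict.fromkeys(parser.links))[:limit]


def extract_year(text):
    """Return the first year matched by YEAR_PATTERN, or None if nothing matches."""
    match = YEAR_PATTERN.search(text)
    return int(match.group(1)) if match else None


# Hypothetical search result page and page text.
sample_results_page = """
<html><body>
<a href="https://hospital.example.org/robotic">Robotic surgery</a>
<a href="/internal/help">Help</a>
<a href="https://hospital.example.org/robotic">Robotic surgery (again)</a>
<a href="https://clinic.example.com/bowel">Bowel surgery</a>
</body></html>
"""
sample_page_text = "Robotic bowel surgery at our clinic. Last updated: March 2021. All rights reserved."

links = extract_links(sample_results_page)
year = extract_year(sample_page_text)
print(links, year)
```

In a real run, the HTML would come from the browser session controlled through Selenium, and a dedicated parser such as Beautiful Soup would typically handle malformed markup more robustly than this stand-in.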

Patient information assessment tool
For rating webpages, the modified ensuring quality information for patients (EQIP) instrument was used, which contains 36 items for evaluating the content and structure of an information source for patients [27]. In the original work, the EQIP instrument consisted of a four-point rating scale with the options 'yes', 'partially yes', 'no', and 'NA' [27]. However, like previous studies, we modified this approach and used only the binary options 'yes' and 'no'. The criteria of the tool relate to the content (items 1-18), identification (items 19-24), and structure of data (items 25-36). Unlike other assessment tools such as the Health Educator Center tool, the DISCERN tool, and the JAMA benchmark score, the EQIP accounts for the quality of the content, the readability, and the design of the written text, which makes the tool more reliable for assessing the outcomes [28,29]. For this reason, the EQIP has been applied in several studies [30-32]. The evaluation of the websites using the EQIP tool was carried out by five experts. Any disagreements were resolved by consensus. For inter-rater reliability, a κ-statistic of 0.676 and an intraclass correlation coefficient of 0.667 were obtained, indicating substantial agreement among the experts. The search items were arbitrarily chosen.
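The text does not state which variant of the κ-statistic was computed. With five raters and binary ratings, Fleiss' kappa is one common choice, so the sketch below assumes that variant; the function name and the toy ratings are illustrative only and are not the study's data.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of rows, one row per rated item, one label per rater.

    Every row must contain the same number of ratings.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for row in ratings for label in row})
    # counts[i][j]: number of raters who assigned category j to item i
    counts = [[row.count(c) for c in categories] for row in ratings]

    # Observed agreement per item, averaged over items.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items

    # Agreement expected by chance from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_expected = sum(p * p for p in proportions)
    if p_expected == 1:
        raise ValueError("kappa is undefined when all ratings fall into one category")
    return (p_observed - p_expected) / (1 - p_expected)


# Toy data: four item ratings, each judged by five raters on the binary yes/no scale.
toy_ratings = [
    ["yes", "yes", "yes", "yes", "yes"],
    ["no", "no", "no", "no", "no"],
    ["yes", "yes", "yes", "yes", "no"],
    ["yes", "no", "no", "no", "no"],
]
kappa = fleiss_kappa(toy_ratings)
print(round(kappa, 3))  # 0.6
```

Under the commonly used Landis and Koch benchmarks, values between 0.61 and 0.80 are described as substantial agreement, which matches the interpretation given for the reported value of 0.676.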

Statistical methods
With the binary scale, the total EQIP score of a website is the number of items that the experts rated as present on that website (Table 1). Thus, an EQIP score of 0 to 36 can be assigned to each website. The items themselves are equally weighted. For statistical analysis, Python (Version 3.9) was used (Pandas Version 1.3, NumPy Version 1.21, Matplotlib). Categorical variables were compared with Fisher's exact test, and Student's t-test was used for comparing continuous variables. A p-value < 0.05 was considered statistically significant, and all tests were two-sided. We also identified high-scoring websites by arbitrarily using the third quartile as a threshold to distinguish between high-scoring and low-scoring websites. In this process, websites that achieved an EQIP total score of more than 21.5 were classified as high-scoring websites, based on the third quartile of the collected data. Correspondingly, low-scoring websites had an EQIP total score of 21.5 or less.
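The scoring and thresholding rules above can be sketched in a few lines. The item ratings and website scores below are invented for illustration, and the quartile convention (the inclusive method of `statistics.quantiles`) is an assumption, since the text does not say how the third quartile was computed.

```python
import statistics

N_ITEMS = 36  # number of EQIP items, all weighted equally


def eqip_total(item_ratings):
    """Total EQIP score: one point per item rated 'yes' (True)."""
    if len(item_ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item ratings, got {len(item_ratings)}")
    return sum(1 for rated_yes in item_ratings if rated_yes)


def split_by_third_quartile(scores):
    """Classify scores strictly above the third quartile as high, all others as low."""
    _, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    high = [s for s in scores if s > q3]
    low = [s for s in scores if s <= q3]
    return q3, high, low


# Hypothetical website: 20 items met, 16 not met.
example_total = eqip_total([True] * 20 + [False] * 16)

# Hypothetical total scores for nine websites.
toy_scores = [10, 14, 15, 18, 20, 20, 21, 22, 24]
q3, high, low = split_by_third_quartile(toy_scores)
print(example_total, q3, len(high), len(low))
```

Because the threshold is a quantile of the scraped sample itself, roughly a quarter of the websites will be labelled high-scoring by construction, whatever their absolute quality.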

Website research
287 websites in English containing the long-tailed keywords mentioned above sourced from Google, Bing and Yahoo searches were retrieved using a web scraping algorithm.After excluding 39 irrelevant websites and 41 scientific articles intended for scientists, 207 eligible websites underwent statistical analysis (see Figure 1).

Overall quality of robotic colorectal surgery
The overall median EQIP score of the websites was 20 (IQR 15-21.5); websites with a score above 21.5 (75th percentile) were considered to have a high rating. Out of the 207 retrieved websites, 52 (25%) had a high score, whereas the remaining 155 (75%) had a low rating.
To visualize the source of information, a boxplot is shown in Figure 2. The boxplot presents all the websites and their scores based on their field of information, where the upper line of each box represents the 75th percentile and the lower line represents the 25th percentile. Of the 207 websites identified, 49 belonged to the subgroup of hospital websites (23.6%), 46 to medical centers (22.2%), and 45 to practitioners (21.7%). Furthermore, 42 websites were classified as health care systems (20.2%), 11 as news services (5.3%), 7 as web portals (3.3%), 5 as industry (2.4%), and 2 as patient groups (0.9%).
Having calculated the scores for all sources of information, a few categories were able to outperform the median, namely the industry (the highest-scoring information source in this listing), followed by news services and hospitals. Web portal sites obtained the lowest scores.

Content data
According to Charvet-Berard and colleagues, items 1-18 describe the content of data (see Table 1) [14]. The top items in this category, met by at least 85% of websites, were item 1 (Initial definition of which subjects, 94%), item 2 (Coverage of the previously defined, 89%), and item 3 (Description of the medical problem, 85%). Infrequent items (<15%) were item 8 (description of the quantitative benefits to the recipient, 13%) and item 10 (description of the quantitative risks and side effects, 13%).

Identification data
Identification of the data is categorized into items 19-24 (Table 1). Overall, fewer points were awarded for these items compared to the content data category. The top items in the data identification category were item 20 (Logo of the issuing body, 64%) and item 21 (Name of the persons or entities that produced the document, 53%). Item 23 (short bibliography of the evidence-based data used in the document) was found only once on a website. Other infrequently rated items were item 22 (names of the persons or entities that financed the document, 26%) and item 24 (statement about whether and how patients were involved/consulted in the document's production, 31%).

Structure data
The structure of the data is summarized in items 25-36 (Table 1). Top-rated items, met by more than 75% of all websites, were item 29 (Respectful tone, 84%), item 30 (Clear information, 82%), item 32 (Presentation of information in a logical order, 84%), and item 33 (Satisfactory design and layout, 80%). However, item 34 (Clear and relevant figures or graphs, 10%) and item 35 (Inclusion of a named space for the reader's notes or questions, 15%) were not frequently rated. Moreover, item 26 (use of generic names for all medications or products) was found on only four websites.

Comparison of high-score and low-score websites
As explained, high-scoring websites were compared with low-scoring websites using the third quartile as the threshold. Overall, 16 out of 36 items were rated significantly more frequently on high-scoring websites than on low-scoring websites, as shown by the p-values (Table 1). For example, item 5 was rated as met on 94% of all high-scoring websites but on only 48% of all low-scoring websites.
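A per-item comparison of this kind can be illustrated with a small, dependency-free version of the two-sided Fisher's exact test. The counts used for item 5 are reconstructed from the rounded percentages in the text (about 49 of 52 high-scoring and 74 of 155 low-scoring websites) and are therefore an assumption; the study itself used Python's scientific libraries, and the function below is only a sketch of the underlying calculation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Assumed counts for item 5: rows are high- and low-scoring websites,
# columns are "item met" and "item not met".
p_item5 = fisher_exact_two_sided(49, 52 - 49, 74, 155 - 74)
print(p_item5 < 0.05)
```

With 36 items tested separately, some significant results would be expected by chance alone, so a correction for multiple comparisons may be worth considering when interpreting such per-item p-values.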

Top-ranked websites
Top-rated websites, those with more than 23 EQIP points (arbitrarily chosen 95th percentile), were further inspected. Here, the top-rated website received 27 points and was published by a hospital. In fact, hospitals were the source of information on six of the ten top-rated websites. In two cases, the top-ranked websites were from practitioners, and in one case, from a medical center.

Country of origin
The country of origin of the published websites is examined in this subsection. Figure 3 shows that 187 (90.3%) of all websites were published in the United States (U.S.). Within the U.S., most websites originated from the states of Arizona (61, 32.6%), Florida (29, 15.5%), California (12, 6.4%), and New York (12, 6.4%). The remainder of the U.S. websites were distributed among the other U.S. states. In addition, websites were found in other native English-speaking countries: 5 in Great Britain (2.4%), 3 in Australia (1.4%), and 3 in Canada (1.4%). From India, 2 (0.96%) websites related to robotic colorectal surgery were posted. Finally, one website each (0.48%) was identified from Argentina, China, Italy, Germany, Mexico, New Zealand, and Singapore.

Year of website publication
The EQIP score achieved was plotted against the website's year of publication using a smoothed regression and a scatter plot (Figure 4). The horizontal line illustrates the EQIP threshold of 21.5 (75th percentile) across all websites surveyed. In general, there is a slight increase in the EQIP score from 2019 to 2022, indicating that the quality of websites is improving over time.
The overall results of the modified EQIP tool for the three categories of content, identification, and structure data revealed the following. First, within the content data category, the top items were the Initial definition of which subjects (94%), Coverage of the previously defined (89%), and Description of the medical problem (85%). It can be confirmed that most websites met the criteria of 'the initial definition of which subjects,' 'the coverage of the previously defined,' and 'description of the medical problem.' In the identification data category, the top items were item 20 (Logo of the issuing body, 64%) and item 21 (Names of the persons or entities that produced the document, 64%). Thus, roughly two out of three websites met the criteria of 'logo of the issuing body' and 'names of the persons or entities that produced the document.' Finally, in the structural data category, the top-rated items were item 29 (Respectful tone, 84%), item 30 (Clear information, 82%), item 32 (Presentation of information in a logical order, 84%), and item 33 (Satisfactory design and layout, 80%). This implies that about four out of five websites met the criteria of 'respectful tone,' 'clear information (no ambiguities or contradictions),' 'presentation of information in a logical order,' and 'satisfactory design and layout (excluding figures or graphs).'

Discussion
There is limited data on the quality of websites dedicated to robotic colorectal surgery. In this study, five experts analyzed 207 websites with content on robotic colorectal surgery using a modified EQIP tool. The analysis showed that the five experts achieved a high level of agreement (with respect to the intraclass correlation coefficient and the κ-statistic). In total, 155 websites received a low EQIP score, indicating low information quality. Conversely, 52 websites were given a high score (arbitrarily using the third quartile as a threshold). The median EQIP score of all websites was 20 (IQR 15-21.5), supporting the hypothesis that the information quality of robotic colorectal surgery websites is low. However, information quality differed significantly among the lowest-scoring data sources, such as web portals. On the other hand, hospitals, news services, and the industry conveyed high-quality content. This can be confirmed when looking at the highest-rated websites, where six of the ten highest-ranked websites were created by hospitals.
An important hypothesis in this work was that EQIP items should be scored significantly more often on high-scoring websites than on low-scoring websites. Overall, we found that 16 of the 36 items were rated significantly more often on high-scoring websites compared to low-scoring websites. In addition, this study examined the origin and the year of publication. It was found that most websites were published in the United States (187), followed by Great Britain (5), Australia (3), and Canada (3). The rest of the websites were published outside of native English-speaking countries. Looking at the year of publication, most websites were updated in 2022. Moreover, a slight improvement in the EQIP score (and therefore the website quality) can be seen over time according to the applied smoothed regression.
Furthermore, it is interesting to see that none of the best websites that were scraped and rated highly by the experts (see Table 2) appear among the very first results returned by the search engines. This indicates that the order in which web pages appear in the various search engines is not designed to provide the best and most accurate answer to the searcher, but is based on other criteria such as advertising, use of buzzwords, etc. This phenomenon can be explained by the search engine optimization (SEO) strategies followed by the website authors. There are two main trends, 'white hat SEO' and 'black hat SEO' [33]. The former aims to rank high by using ethically accepted practices, whereas the latter abuses the scoring criteria of the search engine to rank high.
This research is a comprehensive evaluation of the information quality on the internet relating to robotic colorectal surgery. A survey by Wasserman et al. used the DISCERN tool to evaluate online data quality [34]. The paper's results indicated that the present information on colorectal cancer was variable, incomplete, and failed to convey data that can guide patients toward a well-informed conclusion about the treatment options available for managing colorectal cancer. These findings are consistent with the results of this study, as most websites included in the current study were unreliable. A related issue that should be addressed is the lack of education among many patients, which makes it hard for them to distinguish reliable data from 'click bait' sites.
In today's society, many patients utilize the internet for information on various medical topics. A cross-sectional study by Bianco et al. (2013) revealed that 83 percent of internet users obtained health-associated data online for themselves, family, or acquaintances [25]. This implies that the internet is an extensive and unregulated platform that permits anyone to obtain healthcare information. Since authors of website articles are not required to comply with a formal quality control procedure, they often post data that does not have sufficient accompanying evidence, or simply does not rely on comprehensive research. For example, people may post information gathered from a small study population, implying that the findings may not represent the wider population, thus confusing consumers. Moreover, such a situation exposes patients to the consequences of acquiring inaccurate data, which adversely affects their ability to make appropriate decisions concerning their condition and/or treatment. Therefore, it remains crucial for various medical institutions to collaborate and educate patients on the type of information they should trust and how to assess the validity and quality of the data. Likewise, patients should ask for the assistance of medical professionals to process information found on the internet since some blogs publish inaccurate data with malicious intentions.
In the current study, we evaluated the quality of websites using the EQIP tool. Other tools available for use include DISCERN and the Health Educator Center Tool. Despite the prevalence of multiple quality analysis tools, very few of them have undergone testing to guarantee their reliability. After evaluating and comparing the tools against one another, we elected to use the EQIP tool due to its ability to analyze both the format of the presentation and the content of the publications. The EQIP tool was better suited to the applied research design than DISCERN because EQIP can distinguish between information of poor and high quality and correlate these findings with other measures of information quality [35]. The range of objectives that can be evaluated with the DISCERN tool is smaller than with the EQIP tool [35]. This was evidenced in a study by McCool et al., in which the investigators identified EQIP as a validated approach for analyzing the comprehensibility, design, and excellence of written data [36]. Since its development, EQIP has been shown to be reliable and reproducible [37].

Limitations
There are limitations of the study worth discussing. First, as technology progresses and website quality changes, the relevance of the study may decrease over time. Therefore, the outcomes of this research project represent the data available on the internet concerning robotic colorectal surgery in 2021. Second, one could argue that not all items of the EQIP score should contribute equally to the final score. Therefore, it may be beneficial to modify the weighting based on the concrete problem that needs to be analyzed. Third, the cutoff for high- vs low-scoring websites was arbitrarily chosen (third quartile of the scraped websites). It may be that the actual threshold is higher or lower than this level. However, no data to guide this choice are currently available. A suitable approach would be to establish the contribution of each score in accordance with predefined guidelines. In that way, it would be possible to evaluate a website based on its absolute values instead of its relative performance compared to other websites.

Conclusions
The use of the internet as a primary information source has increased over the years. The information available on the internet can influence patient decisions concerning robotic colorectal surgery and thus encourage them to consider alternative options, as opposed to what surgeons might suggest. The current quality of healthcare data about robotic colorectal surgery is low, with a high degree of inaccuracy, since only 25% of the websites included were of high quality. This research reveals the lack of quality information on the internet about the research topic. Thus, the medical community should develop high-quality websites to guarantee patient-centered approaches that provide relevant and accurate information regarding robotic colorectal surgery. The absence of extensive data on the topic implies that patients will have minimal information sources to refer to when conducting individual research on the internet on the subject. Future researchers should address this literature gap by conducting more research on the topic to ensure that other researchers, surgeons, and patients can utilize a greater volume of useful data for reference purposes.
Hence, all experts who develop informational websites on robotic colorectal surgery should do so with reliable scientific and data-driven information that patients can trust and use. Medical practitioners must evaluate the information available on websites concerning robotic colorectal surgery and advise patients accordingly. The availability of such data will help patients gain an accurate understanding of robotic surgery, with the goal of improving long-term treatment results. Besides the aforementioned key points, additional effort should be invested in implementing tools and procedures to standardize the development of high-quality information websites, especially if patients are the target audience.

Institutional review board statement
Not applicable.

Informed consent statement
Not applicable.

Figure 1. Flow chart showing how relevant websites were identified, screened, and included in our study.

Figure 2. Boxplot of the number of the websites and their EQIP scores.

Figure 3. Countries of the websites published.

Table 1. Overall results of the included websites according to the modified Ensuring Quality Information for Patients (EQIP) tool.

Table 2. Top-rated websites.