
Authors

Manasvi Srivastava
Vikas Yadav

Abstract

The Internet is the largest source of information created by humanity. It contains a wide variety of material available in different formats, such as text, audio, and video. Web scraping is one way to obtain this information: a set of techniques for extracting data from websites automatically rather than copying it manually. Many web-based data extraction methods are designed to solve specific problems and operate on ad hoc domains. Various tools and technologies have been developed to facilitate web scraping. Unfortunately, the appropriateness and ethics of using these web scraping tools are often overlooked. Hundreds of web scraping programs are available today, most of them written for Java, Python, or Ruby, and both open-source and commercial options exist. Web-based tools such as Yahoo! Pipes, Google Web Scrapers, and the OutWit extension for Firefox are among the best choices for beginners in web scraping. Web extraction essentially replaces the manual extraction and editing process, providing an easier and better way to collect data from a web page, convert it into the desired format, and save it to a local directory or archive. In this study, among the various kinds of scraping, we focus on techniques that extract the content of a web page. In particular, we apply scraping techniques to gather information on a variety of diseases together with their symptoms and precautions.
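
To make the content-extraction process described above concrete, the following is a minimal sketch of such a scraping step in Python, using the requests and BeautifulSoup libraries. It is an illustration only, not the implementation used in the study; the URL, CSS selector, and output file name are hypothetical placeholders.

# Minimal illustrative sketch: fetch a page, extract paragraph text,
# and save it locally. URL and selector are hypothetical placeholders.
import csv
import requests
from bs4 import BeautifulSoup

URL = "https://example.org/diseases/influenza"  # hypothetical disease page

response = requests.get(URL, timeout=10)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")

# Collect the text of every paragraph inside the (assumed) main content area.
paragraphs = [p.get_text(strip=True) for p in soup.select("div.content p")]

# Convert the extracted text into a simple tabular format and save it locally,
# as the abstract describes (collect, convert, store).
with open("disease_info.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["url", "text"])
    for text in paragraphs:
        writer.writerow([URL, text])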


Article Details

Section: Methods