A web crawler in Python

Project Name: Spider-the-crawler

Table of Contents:

1. Description

2. Installations

3. Usage

Description:

A web crawler (also known as a web spider or web robot) is a program or automated script which browses the World Wide Web in a methodical, automated manner.

Installations:

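- requests

- BeautifulSoup (published on PyPI as beautifulsoup4)

- urllib (part of the Python standard library, so no separate installation is needed)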
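
As a quick sanity check before running the spider, the short sketch below reports which of the modules listed above cannot be imported in the current environment. The helper name `missing_modules` is hypothetical and not part of this project; `bs4` is the import name that the BeautifulSoup package provides.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found.

    Hypothetical helper for illustration; it is not part of spider.py.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "bs4" is the import name of BeautifulSoup.
    missing = missing_modules(["requests", "bs4", "urllib"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```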

Usage:

The program fetches all the links from a website, stores them in a file named crawled.txt, and provides the list of links present on the site. With slight modification, it can also be used to scrape other content from websites. Built-in multi-threading lets you create as many spiders as your needs and desired speed call for.
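
To illustrate the link-gathering step described above, here is a minimal sketch, not the code in spider.py. It assumes the page HTML has already been downloaded, and it swaps in the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra packages. The names `LinkCollector`, `extract_links`, and `save_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page.
                self.links.add(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the set of absolute links found in an HTML string."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def save_links(links, path="crawled.txt"):
    """Write one link per line, sorted, to the given file."""
    with open(path, "w", encoding="utf-8") as f:
        for link in sorted(links):
            f.write(link + "\n")
```

For example, `extract_links(page_html, "https://example.com/")` would return the links on that page as a set, which `save_links` can then write to crawled.txt. Scraping other content would mean handling different tags in `handle_starttag`.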

Demo:

python spider.py

Use this command to run the spider, providing the URL of the site to crawl and the number of threads to create. The crawler needs slight modification if you want to scrape other content.
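
The way a thread count can drive several spiders at once might look roughly like the sketch below. This is an illustration under stated assumptions, not the actual spider.py: the function `crawl` and its `fetch_links` parameter are hypothetical, and `fetch_links` stands in for whatever code downloads a page and returns the links on it.

```python
import threading
from queue import Queue


def crawl(start_url, fetch_links, num_threads=4):
    """Visit every URL reachable from start_url using worker threads.

    fetch_links is a hypothetical callable that takes a URL and returns
    an iterable of the URLs found on that page.
    """
    queue = Queue()
    seen = {start_url}
    lock = threading.Lock()

    def worker():
        while True:
            url = queue.get()
            try:
                if url is None:
                    # Sentinel value: no more work, stop this thread.
                    return
                for link in fetch_links(url):
                    with lock:
                        if link in seen:
                            continue
                        seen.add(link)
                    queue.put(link)
            except Exception:
                # Skip pages that fail to load or parse.
                pass
            finally:
                queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()

    queue.put(start_url)
    queue.join()  # Wait until every queued URL has been processed.

    for _ in threads:
        queue.put(None)  # One sentinel per worker.
    for thread in threads:
        thread.join()
    return seen
```

The shared `seen` set is guarded by a lock so that two threads never queue the same URL twice. Note that this sketch follows every link it finds; a real crawler would usually restrict itself to the starting site.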