/Python-LCV-Search-Engine

An updated version of a Python distributed crawler and search engine. It uses the Google Chrome web browser as its principal user interface.


Data Search Engine

Project description:
• Provides a tag that renders a custom search engine input field, and a view that displays the search results.
• Crawls real data from well-known websites, written in Python.
• Uses a Scrapy-Redis distributed crawler together with a database.
• Loads the processed data into Elasticsearch and uses the Django framework to build the web application.
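
To make the Scrapy-Redis bullet more concrete, here is a minimal sketch of a Scrapy settings fragment that shares the request queue and the duplicate filter through Redis. The scheduler and dupefilter class paths are the ones scrapy-redis ships with. The Redis URL and the pipeline name `ArticleSpider.pipelines.ElasticsearchPipeline` are assumptions for illustration and are not taken from this repository.

```python
# settings.py fragment (sketch): route scheduling and de-duplication through Redis.

# Use the scrapy-redis scheduler so every crawler process shares one request queue.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"

# Keep the seen-request fingerprints in Redis so each URL is fetched only once.
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Keep the queue and fingerprints between runs instead of clearing them on close.
SCHEDULER_PERSIST = True

# Assumption: Redis runs locally on its default port.
REDIS_URL = "redis://127.0.0.1:6379"

# Assumption: a hypothetical project pipeline that indexes items into Elasticsearch.
ITEM_PIPELINES = {
    "ArticleSpider.pipelines.ElasticsearchPipeline": 300,
}
```

With settings like these, several crawler processes started on different machines pull from the same queue, which is what makes the crawl distributed.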

# Description: This project uses Elasticsearch to search for a given prefix and then displays all the
information found. For example, if you type "java", it displays all the results that contain this
keyword.
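
As an illustration of the search described above, the sketch below builds two Elasticsearch request bodies: a completion-suggester body for type-ahead suggestions and a `multi_match` body for the result page. It only constructs dictionaries and does not contact a server. The field names `suggest`, `title` and `content`, and the suggester name `title_suggest`, are assumptions, since the README does not show the index mapping.

```python
import json
from typing import Any, Dict, Sequence


def build_suggest_body(prefix: str, field: str = "suggest", size: int = 10) -> Dict[str, Any]:
    """Body for a completion-suggester request (type-ahead suggestions)."""
    return {
        "suggest": {
            # "title_suggest" is an arbitrary label chosen for this sketch.
            "title_suggest": {
                "prefix": prefix,
                "completion": {"field": field, "size": size},
            }
        }
    }


def build_search_body(keywords: str, fields: Sequence[str],
                      page: int = 1, page_size: int = 10) -> Dict[str, Any]:
    """Body for a full-text search over several fields, with paging and highlighting."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {
        "query": {"multi_match": {"query": keywords, "fields": list(fields)}},
        # "from" is the zero-based offset of the first hit on the requested page.
        "from": (page - 1) * page_size,
        "size": page_size,
        "highlight": {
            "pre_tags": ["<em>"],
            "post_tags": ["</em>"],
            "fields": {name: {} for name in fields},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_suggest_body("jav"), indent=2))
    print(json.dumps(build_search_body("java", ["title", "content"], page=2), indent=2))
```

Either body could be sent to the `_search` endpoint of an index whose mapping defines a completion field. The mapping itself is outside the scope of this sketch.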

# ArticleSpider contains all the source code for getting data from the websites.
It can pass the "I'm not a robot" verification test of each website.

Screenshots

Suggest view

Home view with hot search and history

Search view

Demo view

Pagination view

The search engine has:

Hot search list (the most frequent searches)
Search history
The number of results, the number of pages, and the time taken
The total number of records in the database

The data is stored in Elasticsearch and can be browsed with elasticsearch-head or queried using Kibana syntax.
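
The features listed above can be sketched in a few lines of plain Python. This is an in-memory stand-in written for illustration, not the project's code: the real application may well keep these counters in Redis instead. The class name `SearchStats` and the function `page_count` are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


class SearchStats:
    """Tracks the hot search list and the recent search history in memory."""

    def __init__(self, history_size: int = 5) -> None:
        self._counts = Counter()   # keyword -> number of times it was searched
        self._history = []         # most recent keyword first, no duplicates
        self._history_size = history_size

    def record(self, keyword: str) -> None:
        keyword = keyword.strip()
        if not keyword:
            return
        self._counts[keyword] += 1
        # Move the keyword to the front of the history and trim the tail.
        if keyword in self._history:
            self._history.remove(keyword)
        self._history.insert(0, keyword)
        del self._history[self._history_size:]

    def hot(self, n: int = 5) -> List[Tuple[str, int]]:
        """The n most frequently searched keywords with their counts."""
        return self._counts.most_common(n)

    def history(self) -> List[str]:
        return list(self._history)


def page_count(total_hits: int, page_size: int = 10) -> int:
    """Number of result pages needed to show total_hits results."""
    return math.ceil(total_hits / page_size)


if __name__ == "__main__":
    stats = SearchStats()
    for word in ["java", "python", "java"]:
        stats.record(word)
    print(stats.hot(), stats.history(), page_count(95))
```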

Code description:

/ArticleSpider # The Python project that extracts all the data from the websites
/LcvSearch # The Django website that searches the data according to what you type
/demo # Images / demo of the project
/database # The database (Navicat) in .sql format

Websites crawled:

https://www.zhihu.com
http://www.jobbole.com/
https://www.lagou.com/

GUI / database administration tool for MySQL, MariaDB, SQL Server:

Navicat

Requirements

Python: 3.50
Elasticsearch: 5.11
Kibana: kibana-5.1.2-windows-x86
Redis: 4.0
elasticsearch-head: https://github.com/mobz/elasticsearch-head
IDE: PyCharm
Django: 1.11.3 (as reported by django-admin --version)

# Usage

run elasticsearch.bat

run kibana.bat

run C:\Program Files\Redis\redis-server.exe

go to /elasticsearchhead, then run

npm run start

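Then open PyCharm and run lcvsearch. Once everything is running, open:
http://127.0.0.1:9200/ (Elasticsearch)
http://127.0.0.1:5601/ (Kibana, on its default port)
http://127.0.0.1:9100/ (elasticsearch-head)
The Django development server starts at http://127.0.0.1:8000/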
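
Because the steps above start several separate processes, a quick check that each one is listening can save some debugging. The following is a small sketch using only the standard library. Ports 9200, 9100 and 8000 come from this README. The Redis port 6379 is the Redis default and is an assumption here, since the README does not state it.

```python
import socket

# Service name -> local TCP port.
SERVICES = {
    "Elasticsearch": 9200,
    "elasticsearch-head": 9100,
    "Django dev server": 8000,
    "Redis": 6379,  # assumption: Redis default port, not stated in the README
}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if is_listening("127.0.0.1", port) else "down"
        print("{:20s} 127.0.0.1:{:<5d} {}".format(name, port, state))
```

A port being open only shows that something is listening there. It does not confirm that the service behind it is healthy.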