Company Name : Vedic Span
Team allotted : Puneeth, Sebastian, Dhanush
This project was carried out as part of an internship.
Web scraping : The process of extracting and parsing data from a specific link or site and storing the content users need (such as e-mail addresses and phone numbers) in one or more CSV files, or in files of another format.
- Browsed several sites, picked candidates to scrape, and went through some project ideas.
- Identified the information to scrape from a particular site and decided on the output file format (CSV).
- Summarised the project ideas and worked out the strategy in PyCharm.
- In the first week we started by discussing the project and planning the various software engineering phases.
- Design UI.
- Design prototype UI.
- Build Flask boilerplate.
- In the second week we started by creating the phone regex.
- Creating the email regex.
- Designing the algorithm for web scraping.
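
The phone and email regexes from the second week are not reproduced in this report, so the following is only a minimal sketch of what such patterns could look like. EMAIL_RE, PHONE_RE and extract_contacts are hypothetical names, and the phone pattern assumes 10-digit numbers with an optional "+" country code.

```python
import re

# Hypothetical patterns for illustration; not the project's actual regexes.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumption: a phone number is 10 digits, optionally preceded by "+",
# a 1-3 digit country code and a single space or hyphen.
PHONE_RE = re.compile(r"(?<![\d+])(?:\+\d{1,3}[\s-]?)?\d{10}(?!\d)")


def extract_contacts(text):
    """Return (emails, phones) found in text, de-duplicated in first-seen order."""
    emails = list(dict.fromkeys(m.lower() for m in EMAIL_RE.findall(text)))
    phones = list(dict.fromkeys(PHONE_RE.findall(text)))
    return emails, phones
```

The lookarounds in the phone pattern keep it from matching a 10-digit slice inside a longer run of digits, such as an order number.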
- In the third week we started by scraping data from a pre-defined website.
- Building the HTML skeleton.
- Scraping a website supplied through form input.
- Updating the email regex.
- Updating the phone regex.
- Adding CSV download.
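
For the CSV download step, one possible approach is to build the CSV text in memory and hand it to the web layer. The sketch below uses only the standard library; contacts_to_csv and the two column names are assumptions, not the project's actual code.

```python
import csv
import io
from itertools import zip_longest


def contacts_to_csv(emails, phones):
    """Build CSV text with an 'email' column and a 'phone' column."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["email", "phone"])  # assumed header row
    # The two lists may differ in length; pad the shorter one with blanks.
    for email, phone in zip_longest(emails, phones, fillvalue=""):
        writer.writerow([email, phone])
    return buffer.getvalue()
```

In a Flask view, a string like this could be returned as a response with the text/csv content type and a Content-Disposition: attachment header so that the browser offers it as a file download.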
- In the fourth week we started by fixing the CSV download.
- Updating the HTML with the UI.
- Testing.
- Documentation.
- Create a user-friendly interface.
- Creating an algorithm for scraping the data.
- Storing the data in a CSV file.
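
The scraping algorithm itself is not spelled out in this report. As a rough sketch under that caveat, a page can be fetched and reduced to its visible text (plus any mailto: and tel: link targets) before pattern matching is applied. TextAndLinkParser, html_to_text and fetch_text are hypothetical names, and only the Python standard library is used here.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextAndLinkParser(HTMLParser):
    """Collect visible text plus mailto:/tel: link targets from HTML."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith(("mailto:", "tel:")):
                # Keep only the address or number after the scheme.
                self.chunks.append(href.split(":", 1)[1])

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the page's visible text and contact-link targets, one per line."""
    parser = TextAndLinkParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch_text(url, timeout=10):
    """Download a page and return its text; needs network access."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return html_to_text(response.read().decode(charset, errors="replace"))
```

Skipping script and style content avoids picking up strings that never appear on the rendered page.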
- Python
- Flask
- Regular expression (regex)
- UI design
- Testing and documentation
- Successfully developed a fully functional application that extracts data such as email addresses and phone numbers from different websites.