/Scraper

A web scraper application that fetches a specified set of data from one of my favorite JavaScript tutorial websites (https://javascript.info). Built with Ruby and the Nokogiri gem.

Web-Scraper - Ruby Capstone Project

This project aims to develop a web scraper using Ruby. It is a capstone project that marks the completion of the Ruby module in the Microverse technical curriculum. The full project specification can be found here.

screenshot

Built with

  • Ruby
  • Nokogiri gem
  • Linter (Stickler, Rubocop)
  • HTTParty gem
  • RSpec gem

Watch Video Demo Here

Getting Started

To get a local copy up and running, follow these simple steps.

  • On GitHub, navigate to the main page of the repository (this page).
  • Under the repository name, locate and click on a green button named 'Clone or download'.
  • Copy the project URL as displayed.
  • If you're running Windows, open the Command Prompt. On Linux, open your terminal.
  • Change the current working directory to the location where you want the cloned directory to be created. Leave it as it is if the current location is where you want the project to be.
  • Type git clone, and then paste the URL you copied in Step 3.
    $ git clone https://github.com/teekaytech/Scraper.git
  • Press Enter. Your local copy will be created.

Please note that you must have Git installed on your PC; this can be done here.

Prerequisites

Since all the code is written in Ruby, a Ruby runtime >= 1.9 is required to run it. If you don't have a Ruby runtime set up on your computer, follow the instructions for your operating system in the official installation guide. An internet connection is also required for successful scraping.

Usage

After cloning the project to your computer, open its folder in your terminal/command prompt window and run:

bin/main.rb

Web Scraping :: Introduction

Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites. Web scraping software may access the World Wide Web directly using the Hypertext Transfer Protocol, or through a web browser. Read more.

Why I chose this scraper

I chose to scrape this website because it was an important resource for my personal development before my Microverse days. I love the way the tutorials are structured, and they really helped me during the coding challenges in the application process. I still visit the website, and I believe this scraper will play a major role in my development when I get to the JavaScript module.

How to use the Scraper

  • The app scrapes a webpage containing the complete Modern JavaScript Tutorial, its categories, and their respective links.
  • It displays the results in the command prompt / terminal.
  • Users can choose any of the three scrape options available.
  • When the app is started (bin/main.rb), it prompts the user to choose from the displayed options.
    screenshot
  • Depending on the option selected, the user gets the result of the scraping process. For option 1:
    screenshot
  • For option 2:
    screenshot
  • For option 3:
    screenshot
  • Entering any value other than these three options will terminate the app.
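The flow above can be sketched as a small menu loop. This is an illustration under stated assumptions, not the code in bin/main.rb: the option labels, the `handle_choice` and `run_menu` names, and the choice to show the menu again after each result are all hypothetical. The demo feeds input from a `StringIO` so it runs without a keyboard.

```ruby
require 'stringio'

# Hypothetical option labels; the real menu text may differ.
OPTIONS = {
  '1' => 'Scrape option 1',
  '2' => 'Scrape option 2',
  '3' => 'Scrape option 3'
}.freeze

# Returns the label for a valid choice, or nil when the app should exit.
def handle_choice(input)
  OPTIONS[input.to_s.strip]
end

# Shows the menu, reads a choice, and stops on anything but 1, 2 or 3.
def run_menu(input = $stdin, output = $stdout)
  loop do
    OPTIONS.each { |key, label| output.puts "#{key}. #{label}" }
    output.print 'Choose an option: '
    selected = handle_choice(input.gets)
    break if selected.nil?

    output.puts "Running: #{selected}"
  end
  output.puts 'Goodbye!'
end

# Demo: choose option 2, then enter an invalid value to terminate.
demo_output = StringIO.new
run_menu(StringIO.new("2\nq\n"), demo_output)
puts demo_output.string
```

Keeping the lookup in `handle_choice` separate from the input/output loop makes the selection logic easy to cover with RSpec without simulating a terminal.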

Note: The app requires an internet connection for successful scraping!

Running the Automated Test

The RSpec gem was used for automated testing. To run the test cases:

  • Open your terminal/command prompt in the project folder.
  • Type rspec and press Enter.

Author

👤 Taofeek Olalere

🤝 Contributing

Contributions, issues and feature requests are welcome!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request.
    Feel free to check the issues page.

Show your support

Give a ⭐️ if you like this project.

📝 License

This project is MIT licensed.

Acknowledgements