This is a scraper built in Ruby. I worked on this project as a requirement to finish the Ruby section in the Microverse Main Technical Curriculum.
To test the scraper, I used the H1B Salary Database website.
This site contains a database of Labor Condition Application records from the United States Department of Labor.
- Ruby 2.6.5
- Rubygems 3.0.3
- Nokogiri 1.10.9
- Rest-client 2.1.0
- Byebug 11.1.3
- Rspec 3.9.0
- Paint 2.2.0
- Visual Studio Code 1.44.2
According to Wikipedia:
"...web harvesting, or web data extraction is data scraping used for extracting data from websites. Web scraping software may access the World Wide Web directly using the Hypertext Transfer Protocol, or through a web browser. While web scraping can be done manually by a software user, the term typically refers to automated processes implemented using a bot or web crawler."
To get a local copy up and running, follow these steps.
To install the project:
- Clone the repository.
- In your terminal, navigate to the repository's folder and run
bundle install
This command installs the gems required to run the program:
byebug
nokogiri
paint
rest-client
rspec
- To run the program, execute
ruby main.rb
from the bin folder.
- Follow the instructions.
As a testing tool, I used RSpec.
If you are interested in learning how it works, here is a link to Introduction to RSpec from The Odin Project.
- In the root folder, run
rspec
or
rspec --format documentation
to execute the tests.
- Has a user interface with instructions to follow.
- The user can choose which value to provide to filter the search.
- It has an option to do a custom search.
- Displays the input provided by the user.
- If the search returns zero results, it asks whether the user wants to do another search.
- The user can save the results in a text file.
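The last feature, saving results to a text file, could be sketched roughly as follows. This is a minimal illustration that uses only Ruby's standard library, not the project's actual code: the method names `format_results` and `save_results`, the hash keys, and the sample rows are all hypothetical.

```ruby
require 'tmpdir'

# Turns each result (a hash) into one line of text, with values joined by " | ".
def format_results(results)
  results.map { |row| row.values.join(' | ') }
end

# Writes the formatted results to `path`.
# Returns false without creating the file when there is nothing to save.
def save_results(results, path)
  return false if results.empty?

  File.write(path, "#{format_results(results).join("\n")}\n")
  true
end

# Hypothetical sample rows standing in for scraped data.
SAMPLE_RESULTS = [
  { employer: 'Example Corp', job_title: 'Software Engineer', salary: '95000' },
  { employer: 'Sample LLC', job_title: 'Data Analyst', salary: '70000' }
].freeze

# Save into a temporary folder and print the file back to check the output.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'results.txt')
  puts File.read(path) if save_results(SAMPLE_RESULTS, path)
end
```

Returning false for an empty result set lets the caller decide what to do next, for example asking whether the user wants to run another search.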
Rossiel Carranza
- Github: @RossielCS
- Linkedin: Rossiel Carranza
Contributions, issues, and feature requests are welcome!
Feel free to check the issues page.
Give a ⭐️ if you like this project!
This project is MIT licensed.