MakeScraper

A simple web scraper built with Golang


🕷 makescraper


Create your very own web scraper and crawler using Go and Colly!

📚 Table of Contents

  1. Project Structure
  2. Getting Started
  3. Deliverables
  4. Resources

Project Structure

📂 makescraper
├── README.md
└── scrape.go

Getting Started

  1. Visit github.com/new and create a new repository named makescraper.

  2. Run the following commands one at a time in your terminal to set up the project, substituting your GitHub username for HexSeal:

    $ git clone git@github.com:Make-School-Labs/makescraper.git
    $ cd makescraper
    $ git remote rm origin
    $ git remote add origin git@github.com:HexSeal/makescraper.git
    $ go mod download

  3. Open README.md in your editor and replace all instances of HexSeal with your GitHub username to enable the Go Report Card badge.

Deliverables

Complete the tasks in the order they appear. Use GitHub Task List syntax to update the task list.

Requirements

Scraping

  • IMPORTANT: Complete the Web Scraper Workflow worksheet distributed in class.
  • Create a struct to store your data.
  • Refactor the c.OnHTML callback on line 16 of scrape.go to use the selector(s) you tested while completing the worksheet.
  • Print the data you scraped to stdout.

Stretch Challenges

  • Add more fields to your struct. Extract multiple data points from the website. Print them to stdout in a readable format.
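The struct and printing tasks above can be sketched as follows. This is a minimal illustration, not the project's required design: the `Article` type, its fields, and the `formatArticle` helper are hypothetical names, and the values are hard-coded so the sketch runs without network access. In scrape.go the fields would instead be filled inside the c.OnHTML callback using the selectors from the worksheet.

```go
package main

import (
	"fmt"
	"strings"
)

// Article is a hypothetical record for one scraped item. Rename the
// fields to match the data points chosen on the worksheet.
type Article struct {
	Title  string
	Author string
	URL    string
}

// formatArticle renders an Article as a readable multi-line block.
func formatArticle(a Article) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Title:  %s\n", a.Title)
	fmt.Fprintf(&b, "Author: %s\n", a.Author)
	fmt.Fprintf(&b, "URL:    %s", a.URL)
	return b.String()
}

func main() {
	// Assumption: in scrape.go these values would come from the
	// c.OnHTML callback, for example via e.ChildText("h2") or
	// e.ChildAttr("a", "href"). The selectors are placeholders.
	a := Article{
		Title:  "Example headline",
		Author: "Jane Doe",
		URL:    "https://example.com/post",
	}
	fmt.Println(formatArticle(a))
}
```

Keeping the formatting in its own function means adding a field later only requires touching the struct and one helper.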

Serializing & Saving

  • Serialize the struct you created to JSON. Print the JSON to stdout to validate it.
  • Write scraped data to a file named output.json.
  • Add, commit, and push to GitHub.

Stretch Challenges

  • TBA 02/10!

Resources

Lesson Plans

Example Code

Scraping

Serializing & Saving

  • JSON to Struct: Paste any JSON data and convert it into a Go structure that will support storing that data.
  • GoByExample - JSON: Covers Go's built-in support for JSON encoding and decoding to and from built-in and custom data types (structs).
  • GoByExample - Writing Files: Covers creating new files and writing to them.
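As a rough illustration of what the JSON-to-struct resources above describe, the sketch below decodes a small JSON document into a Go struct. The `Article` type, the `parse` helper, and the sample data are hypothetical. Reading output.json back this way is one possible check that the saved file is valid.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// Article mirrors the shape of the sample JSON below. The json tags
// map the lowercase keys onto the exported Go fields.
type Article struct {
	Title string `json:"title"`
	URL   string `json:"url"`
}

// parse decodes a JSON array of objects into a slice of Article.
func parse(data []byte) ([]Article, error) {
	var items []Article
	if err := json.Unmarshal(data, &items); err != nil {
		return nil, err
	}
	return items, nil
}

func main() {
	// Sample input invented for this example.
	raw := []byte(`[{"title": "Example headline", "url": "https://example.com/post"}]`)

	items, err := parse(raw)
	if err != nil {
		log.Fatal(err)
	}
	for _, it := range items {
		fmt.Printf("%s (%s)\n", it.Title, it.URL)
	}
}
```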