Welcome to the e-Commerce Price Tracker repository! This project monitors the actions of platforms such as Amazon and provides you with the latest price of the products you are looking for. And it gives users precisely when it's the optimal moment to make a purchase.
First, run the development server:
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev
❯ npx create-next-app@latest ./
✔ Would you like to use TypeScript? … No / Yes
✔ Would you like to use ESLint? … No / Yes
✔ Would you like to use Tailwind CSS? … No / Yes
✔ Would you like to use `src/` directory? … No / Yes
✔ Would you like to use App Router? (recommended) … No / Yes
✔ Would you like to customize the default import alias (@/*)? … No / Yes
✔ What import alias would you like configured? … @/*
Creating a new Next.js app in /home/tovar/Documents/yt-tutos/js_mastery/pricewise/main.
Using npm.
Initializing project with template: app-tw
Installing dependencies:
- react
- react-dom
- next
Installing devDependencies:
- typescript
- @types/node
- @types/react
- @types/react-dom
- autoprefixer
- postcss
- tailwindcss
added 119 packages, and audited 120 packages in 14s
21 packages are looking for funding
run `npm fund` for details
found 0 vulnerabilities
Success! Created main at /home/tovar/Documents/yt-tutos/js_mastery/pricewise/main
On GitHub:
❯ Settings ❯ Branches: (https://github.com/arepaFlipper/price_scraper/settings/branch_protection_rules/new )
❯ Branch name pattern: ^(main|master)$ ❯ Protect matching branches ❯ Require a pull request before merging.
❯ Require status checks to pass before merging.
❯ wget https://drive.google.com/file/d/1v6h993BgYX6axBoIXFbZ9HQAgqbR4PSH/view?pli=1
❯ unzip public.zip -d public
It is mandatory to add it at one folder only, either public/ or app/ but not in both.
In this project I learned:
- What web scraping really is.
- Build a tool to leverage it.
- Handling
CAPTCHA
s. - Anti-scraping measures.
- IP rotation.
Aspect | Web Crawlers | Web Scraper |
---|---|---|
Main function | Navigates the web, indexes content | Extracts specific data from web pages |
Navigation | Follows links to discover new pages | Targets individual pages for data extraction. |
Usage | SEO analysis | - Any piece of data - Dataset generation for ML training |
Is what Google Chrome or other browsers use to index websites, they too navigate through out whole website first find links and rank them accordingly.
It targets specific types of websites on their page, it's more about focusing on what truly matters.
- Send HTTP request (automatic).
- Server Replies to our scrapers request with the website content (HTML, JavaScript, CSS).
- Parse the Response.
- Filter the desired data.
- Store the extracted data in a database.
- Use it for training ML models or building new apps.
Multi-Site Support: Track prices across popular eCommerce platforms.
User-Friendly Interface: Intuitive design for easy navigation.
Price History: View historical price data to make informed purchasing decisions.
Email Notifications: Receive alerts when prices drop to your specified threshold.
Customizable Settings: Tailor the tracker to suit your preferences and needs.
BRIGHT_DATA_USERNAME=
BRIGHT_DATA_PASSWORD=
BRIGHT_DATA_PORT=
MONGODB_URI=
EMAIL_USERNAME=
EMAIL_PASSWORD=
In MongoDB
you get the URI by selecting connect ❯ MongoDB Driver
, and the URI is something like:
mongodb+srv://<username>:<password>@pricescrapper.lw8rqio.mongodb.net/?retryWrites=true&w=majority
Axios is a library for making HTTP requests in Node.js and React.
❯ npm install axios
Cheerio is a library for parsing HTML and extracting data from it.
❯ npm install cheerio
The following is an example of how cheerio is used to scrape the amazon product page:
<h1 id="title" class="a-size-large a-spacing-none">
<span id="productTitle" class="a-size-large product-title-word-break">YMDK 120 Keys ABS ANSI ISO 1.5mm Blank Milk Fog OEM Profile Shine Through Keycap for MX Mechanical Keyboard RGB GK61 96 84 68 108 87</span>
</h1>
To extract the text content of the element, we can use the following code:
const $ = cheerio.load(response.data);
const title = $("#productTitle").text().trim();
It search for the element with the id of "productTitle" in the HTML document and returns the text content of that element.
Be aware that sometimes Amazon prompts the previous prices to show the discounted price, this could confuse the data extraction.
Unhandled Runtime Error
Error: Maximum call stack size exceeded
It prompts an error, because mongoDB has its own special way of creating its documents in the DB, so the fetch data is not a typical object, It contains some more complex properties which is causing this error.
I use an alternative to the tailwindcss
BUG when declaring a dynamic string for the className
parameters.
similarProducts?.length > 0
Typescript does not like this:
Diagnostics:
1. 'similarProducts.length' is possibly 'undefined'. [18048]
Fix this by adding an extra statement like this
similarProducts && similarProducts?.length > 0
This library is actually easier to work with hotmail
but
in case of gmail
we may read the documentation.
documentation.
To make TypeScript
happy including nodemailer
, add "server client" at the top of the file, and
add the types:
npm i -D @types/nodemailer
We welcome contributions from the community! If you have any ideas for improvements, bug reports, or new features, please open an issue or submit a pull request.
This project is licensed under the MIT License - see the LICENSE file for details. Acknowledgments
Thanks to JavaScript Mastery for the helpful tutorial.
Feel free to reach out if you have any questions or need assistance. Happy shopping and tracking! 🛍️✨