This project automates updating a business's recent media information from its Instagram posts, for use on a web page.
- Node.js and NPM (supported versions: 10.x.x)
- MySQL
If you install this project on AWS EC2 (for example), you need to follow these steps:
sudo apt-get install git-all
- from the project root, go to ./node_modules/puppeteer
$ cd ./node_modules/puppeteer
- install all of its dependencies
$ npm run install
- if necessary, install all the dependencies required on Debian to run the browser (Chromium)
$ sudo apt-get install gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 libgbm-dev libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget
apt update && apt install sudo curl && curl -sL https://raw.githubusercontent.com/Unitech/pm2/master/packager/setup.deb.sh | sudo -E bash -
npm install pm2 -g
pm2 completion install
npm install pm2 -g && pm2 update
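Once PM2 is installed, the app can be described in a PM2 ecosystem file so it restarts automatically on the server. The following is only a sketch: the file name `ecosystem.config.js` follows PM2's convention, but the app name, the script path, and the memory limit are assumptions and not values taken from this project.

```javascript
// ecosystem.config.js
// Sketch of a PM2 process definition. The name, script path and memory
// limit below are assumptions; adjust them to the real project layout.
const ecosystem = {
  apps: [
    {
      name: 'instagram-scraper',    // assumed process name
      script: './src/index.js',     // assumed entry point of the app
      autorestart: true,            // restart the process if it exits
      max_memory_restart: '300M',   // assumed limit for a small EC2 instance
      env: {
        NODE_ENV: 'production',
      },
    },
  ],
};

module.exports = ecosystem;
```

With a file like this in the project root, the process could be started with `pm2 start ecosystem.config.js`.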
- the Instagram profile needs to be public
- create the database to use and add its settings to the configuration in .env
- add the credentials of the Instagram account used to log in and scrape post content to .env, following .env.example
- if you deploy on Heroku, add this buildpack as needed:
https://github.com/jontewks/puppeteer-heroku-buildpack
- install all dependencies with
npm i
- run
$ npm run dev
- if you want to run this project on Heroku, watch out for this problem: if you use a free dyno, you get wrong results. A Heroku free dyno hibernates and stops your clock process, so the Instagram scraping routine does not run.
- if the build does not run on AWS EC2, execute these commands:
$ sudo /bin/dd if=/dev/zero of=/var/swap.1 bs=1M count=1024
$ sudo /sbin/mkswap /var/swap.1
$ sudo /sbin/swapon /var/swap.1
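The setup notes above ask for database settings and Instagram account credentials in .env. A minimal sketch of checking those settings at startup could look like the following. The variable names are hypothetical; the project's real keys are the ones listed in .env.example.

```javascript
// Hypothetical key names: replace them with the keys from .env.example.
const REQUIRED_KEYS = [
  'DB_HOST',
  'DB_USER',
  'DB_PASSWORD',
  'DB_NAME',
  'INSTAGRAM_USER',
  'INSTAGRAM_PASSWORD',
];

// Build a config object from an environment-like object and fail early,
// with a readable message, when a required setting is missing or blank.
function loadConfig(env) {
  const missing = REQUIRED_KEYS.filter(
    (key) => !env[key] || String(env[key]).trim() === ''
  );
  if (missing.length > 0) {
    throw new Error(`Missing required settings in .env: ${missing.join(', ')}`);
  }
  return {
    database: {
      host: env.DB_HOST,
      user: env.DB_USER,
      password: env.DB_PASSWORD,
      name: env.DB_NAME,
    },
    instagram: {
      user: env.INSTAGRAM_USER,
      password: env.INSTAGRAM_PASSWORD,
    },
  };
}
```

After the values from .env have been loaded into `process.env` (for example with a package such as dotenv), calling `loadConfig(process.env)` would either return the settings or stop the app before any scraping starts.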
- Automated tests
- Scrape the Instagram page of a public profile and get the image source and page link of the 9 most recent posts
- Login on pseudoInstagram
- Get content of each post
- Convert and save this in the database
- Save the image link and post content in the database
- Create a route to get all information about the 9 most recent posts
- Configure Heroku Clock
- Schedule scraping of Instagram every ten minutes
- Add Swagger docs
- Create a cron routine to scrape Instagram information without Heroku Clock
- Deploy on Heroku
- Deploy on Amazon EC2
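Two items from the list above, keeping the 9 most recent posts and running the scraping every ten minutes, can be sketched in plain Node.js without any scheduler library. This is an illustration under assumptions and not the project's actual code: the shape of the scraped post objects and the function names `toPostRecords` and `scheduleScraping` are hypothetical.

```javascript
const TEN_MINUTES_MS = 10 * 60 * 1000;
const RECENT_POST_LIMIT = 9;

// Keep only the first `limit` posts and the fields that get stored.
// The input shape (imageSource, pageRef, content) is an assumption.
function toPostRecords(scrapedPosts, limit = RECENT_POST_LIMIT) {
  return scrapedPosts.slice(0, limit).map((post) => ({
    imageSource: post.imageSource,
    pageRef: post.pageRef,
    content: post.content || '',
  }));
}

// Call `task` every `intervalMs`. A tick is skipped while the previous
// run is still in progress, so slow scraping runs never overlap.
function scheduleScraping(task, intervalMs = TEN_MINUTES_MS) {
  let running = false;

  const tick = async () => {
    if (running) return false; // previous run not finished yet
    running = true;
    try {
      await task();
    } catch (err) {
      // A failed run is logged and the schedule keeps going.
      console.error('Scraping run failed:', err.message);
    } finally {
      running = false;
    }
    return true;
  };

  const timer = setInterval(tick, intervalMs);
  return { tick, stop: () => clearInterval(timer) };
}
```

The returned `tick` can be called once at startup to scrape immediately, and `stop` cancels the schedule. Because this relies on a long-lived process, it only works where the process is not put to sleep.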