A source for time series reports on the top and new torrent files.
Every 24 hours our site begins scraping the top torrents from a series of torrent sites. We identify unique torrents and store the count of seeders and leechers for every torrent.
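As a rough illustration of the idea above, here is a minimal sketch of how scraped listings might be grouped into unique torrents, with one seeder/leecher snapshot recorded per scrape. The names `normalizeKey` and `recordScrape`, the in-memory `Map`, and matching torrents by normalized name are all assumptions made for this example. The project itself stores its data in Postgres and may identify torrents differently.

```javascript
// Hypothetical helper: build a comparison key from a listing's name.
// Matching on a normalized name is an assumption for this sketch.
function normalizeKey(listing) {
  return listing.name.trim().toLowerCase().replace(/\s+/g, ' ');
}

// Hypothetical helper: add one snapshot per listing to an in-memory store.
// Listings that normalize to the same key are treated as the same torrent.
function recordScrape(store, listings, scrapedAt) {
  for (const listing of listings) {
    const key = normalizeKey(listing);
    if (!store.has(key)) {
      store.set(key, { name: listing.name.trim(), snapshots: [] });
    }
    store.get(key).snapshots.push({
      scrapedAt,
      seeders: listing.seeders,
      leechers: listing.leechers,
    });
  }
  return store;
}

// Example data (made up): the same torrent seen twice on day one, once on day two.
const store = new Map();
recordScrape(store, [
  { name: 'Ubuntu 18.04 ISO', seeders: 120, leechers: 14 },
  { name: 'ubuntu  18.04 iso ', seeders: 80, leechers: 9 },
], '2018-05-01');
recordScrape(store, [
  { name: 'Ubuntu 18.04 ISO', seeders: 150, leechers: 11 },
], '2018-05-02');
```

Keeping every snapshot, rather than overwriting the latest counts, is what makes a time series report possible later on.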
**This project is an active work in progress.**
- React 16
- Redux
- React router
- Web Workers
- Axios
- Styled-Components (new addition; formerly SASS with PostCSS)
The server uses Node.js with Express, Sequelize, Postgres, and Puppeteer (the headless Chrome utility) for scraping, so the same application both serves database queries and gathers the data.
- Server: Express, body-parser
- Persistence: Sequelize, Postgres (GraphQL planned)
- Authentication: Passport, Passport-Google
- Testing: Mocha, Chai
The central figure in this show is the reporterAgent (./server/reporterAgent), a custom-built tool that uses Puppeteer to scrape data from a wide variety of torrent websites. It runs as a scheduled service that fetches each torrent site's content independently and retries any sites that are down.
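The "independent fetch with retries" behaviour described above can be sketched roughly as follows. This is not the reporterAgent's actual code: `scrapeWithRetry`, `scrapeAll`, the `scrapeSite` callback, and the retry count and delay are hypothetical names and values chosen for illustration. In the real service the callback would presumably drive Puppeteer.

```javascript
// Hypothetical helper: try one site up to maxAttempts times before giving up.
async function scrapeWithRetry(site, scrapeSite, maxAttempts = 3, delayMs = 1000) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await scrapeSite(site);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Wait before the next attempt in case the site is briefly down.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Hypothetical helper: scrape every site independently, so one failing
// site does not prevent the others from being collected.
async function scrapeAll(sites, scrapeSite, options = {}) {
  const { maxAttempts = 3, delayMs = 1000 } = options;
  const results = await Promise.allSettled(
    sites.map((site) => scrapeWithRetry(site, scrapeSite, maxAttempts, delayMs))
  );
  return results.map((result, i) => ({
    site: sites[i],
    ok: result.status === 'fulfilled',
    listings: result.status === 'fulfilled' ? result.value : [],
    error: result.status === 'rejected' ? String(result.reason) : null,
  }));
}
```

`Promise.allSettled` is used instead of `Promise.all` so that a site that stays down is reported as a failed entry rather than rejecting the whole run.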
Built using React 16 with Redux and redux-thunk, and without any front-end CSS frameworks (custom flexbox layout). The application has been optimized for two screen sizes: anything less than 800 pixels wide is considered a condensed "mobile" view. Styling was originally written in SASS (mixins and variables) and is moving to Styled-Components.
The application currently uses a RESTful API, but I am rewriting it to use GraphQL. This will allow me to use Relay with React and greatly simplify data fetching.
- Copy or clone this git repository.
- Run `npm install` or `yarn`.
- Install PostgreSQL and Redis. Launch them both or the application will crash.
- Fill in the environment variables below and save them as `.env` in the root project directory:
REDIS_URL=redis://localhost:6379
DATABASE_URL=postgres://localhost:5432/TorrentReport
EMAIL_HOST=
EMAIL_PASS=
EMAIL_PORT=
EMAIL_SEC=true
EMAIL_USER=
ADMIN_EMAIL=
GOOGLE_CALLBACK=/auth/google/callback
GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=
NODE_ENV=development
SESSION_SECRET=
- `EMAIL_HOST` is the SMTP domain of your provider.
- `EMAIL_PASS` is the password used with the email provider.
- `EMAIL_USER` is the email address used with the email provider.
- `ADMIN_EMAIL` is the address that is notified of errors and from which site emails are sent.
- The scrape service runs on a schedule, but will not run more than once every 12 hours on its own. If you are just installing this project and want some data to play with, run `npm run start-scrape` or `yarn start-scrape`.
- To launch the project, run `npm run start-dev` or `yarn start-dev` and navigate to http://localhost:8080. NOTE: You can set the environment variable `PORT=` to any port number to use that port. I have omitted it because it defaults to port 8080.
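For readers wiring up their own `.env`, the sketch below shows one way the settings above could be read at startup, including the fallback to port 8080. The function name `loadConfig` and the choice of which variables to treat as required are assumptions for this example, not the project's actual startup code.

```javascript
// Variables this sketch treats as required. The split between required
// and optional is an assumption, not taken from the project.
const REQUIRED = ['DATABASE_URL', 'REDIS_URL', 'SESSION_SECRET'];

// Hypothetical helper: validate an env object and apply the default port.
function loadConfig(env) {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    databaseUrl: env.DATABASE_URL,
    redisUrl: env.REDIS_URL,
    sessionSecret: env.SESSION_SECRET,
    // Fall back to 8080 when PORT is unset, empty, or not a number.
    port: Number(env.PORT) || 8080,
  };
}

// Example with made-up values; a real application would pass process.env.
const exampleConfig = loadConfig({
  DATABASE_URL: 'postgres://localhost:5432/TorrentReport',
  REDIS_URL: 'redis://localhost:6379',
  SESSION_SECRET: 'example-secret',
});
```

Failing fast on a missing variable gives a clearer error than a crash later, when the database or Redis connection is first used.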