- This app scrapes truck listings and their details from a given URL. On startup it runs against the assigned task URL and shows the result: https://www.otomoto.pl/ciezarowe/uzytkowe/mercedes-benz/od-+2014/q-actros?search%5Bfilter_enum_damaged%5D=0&search%5Border%5D=created_at+%3Adesc
$ git clone https://github.com/RahatSaqib/scrapping-node.git
$ cd scrapping-node
$ npm install
$ npm run start
The JSON request body that the API needs in order to scrape data. Example:
You can pass the first-page URL of any other item list on the otomoto website to scrape all of its existing pages.
Endpoint for scraping all pages of the initial URL or a URL of your choice:
- scrapTruckItem: url: http://localhost:8443/scrape-truck-item, method: POST
Endpoints for calling the individual scraping functions separately:
- getNextPageUrl: url: http://localhost:8443/next-url, method: POST
- addItems: url: http://localhost:8443/add-items, method: POST
- getTotalAdsCount: url: http://localhost:8443/total-ads, method: POST
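As a rough illustration of calling the endpoints listed above from Node.js, here is a minimal sketch. The base URL and paths come from the list; everything else is an assumption. In particular, `buildRequest` and `callScraper` are hypothetical helper names, the `url` field in the JSON body is a guess at the request shape, and the response is assumed to be JSON. Check the server code for the actual field names.

```javascript
// Base URL and endpoint paths are taken from the endpoint list above.
const BASE_URL = 'http://localhost:8443';

// ASSUMPTION: the request body is a JSON object with a single `url` field.
// The real field name may differ; this helper is hypothetical.
function buildRequest(path, pageUrl) {
  return {
    endpoint: `${BASE_URL}${path}`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ url: pageUrl }),
    },
  };
}

// Sends the request with the global fetch available in Node.js 18+.
// ASSUMPTION: the server answers with a JSON payload.
async function callScraper(path, pageUrl) {
  const { endpoint, options } = buildRequest(path, pageUrl);
  const response = await fetch(endpoint, options);
  if (!response.ok) {
    throw new Error(`Request to ${endpoint} failed with status ${response.status}`);
  }
  return response.json();
}

// Example usage (requires the app to be running locally):
// callScraper('/total-ads', 'https://www.otomoto.pl/...').then(console.log);
```

Keeping the request construction in its own function makes it possible to inspect the payload without a running server.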
$ npm run test
Ans: For error handling, I wrap each request in a Promise so that a failed request rejects and can be caught, which helps identify the exact problem or error. If a request rejects, a 404 NOT FOUND property is stored on the URL object. I also added a retry strategy that tries to scrape a URL up to 3 times.
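The retry idea described above could look roughly like the sketch below. It is not the project's actual code: `withRetry`, `scrapeWithRetry`, the `fetchPage` callback, and the shape of the returned object are all hypothetical names chosen for illustration, and the count of 3 is treated here as the total number of attempts.

```javascript
// Hypothetical helper: run an async task up to `attempts` times,
// optionally waiting `delayMs` milliseconds between attempts.
async function withRetry(task, attempts = 3, delayMs = 0) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err; // remember the failure and try again
      if (attempt < attempts && delayMs > 0) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError; // all attempts failed
}

// Hypothetical wrapper: `fetchPage` is any function returning a Promise
// for the page content. If every attempt rejects, the result object is
// marked with a 404 NOT FOUND property instead of throwing.
async function scrapeWithRetry(url, fetchPage) {
  try {
    const html = await withRetry(() => fetchPage(url), 3);
    return { url, html };
  } catch (err) {
    return { url, error: '404 NOT FOUND', reason: String(err.message || err) };
  }
}
```

Returning a marked object rather than throwing lets a multi-page scrape continue when a single page fails, while still recording which URL had the problem.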
Ans: No, I can't access more ads, because the given URL has only 8 or 9 pages to scrape. Apart from the given URL, I also tried the Mercedes car section, which has 59 pages of items. It can be scraped, but the response takes too long.
Ans: Yes, I used CI/CD tools in this project. Whenever I push an update to Git, the APIs are tested against scripted test cases. Continuous deployment does not happen, because I did not write a deployment script; deployment is done locally.
Ans: It was a good ride for me. The task pushed me to learn more because I found it very interesting. I love learning new things and technologies.