- Scrape the first 500 items (title, image URL) from sreality.cz (flats, sell; you can switch the site to English) and save them in a PostgreSQL database.
- Implement a simple HTTP server (or use Nginx) and show these 500 items on a nice page (with pagination) that uses your own design.
- Put everything behind a single docker-compose command so that I can just run "docker-compose up" in the GitHub repository and see the scraped ads on the http://127.0.0.1:8080 page.
Use TypeScript for the implementation.
git clone https://github.com/JuroUhlar/scraping-postgresql-docker-exercise.git
cd scraping-postgresql-docker-exercise
sudo docker-compose up
- Scrape flats again:
yarn scrape-flats
- Connect to db with psql:
psql -h localhost -p 5432 -U postgres
(pass: postgres)
- Reset db:
sudo rm -rf pgdata
- I used Cypress for scraping. Something more lightweight would likely suffice here, but I went with Cypress in case we needed to interact with the page (not just parse markup). It's also the tool I am familiar with.
- I saved the data into a single flats table, serializing the array of image URLs into a single string for simplicity. In the real world, you would probably want a separate table for that.
- For the web server, I wrote a simple Express.js app. I couldn't get ejs to play nice with TypeScript, so I wrote the HTML by hand. Obviously, for anything larger you would want a proper templating engine or a full-stack framework like Next.js, which I considered, but the assignment calls for keeping things simple.
- Most of the difficulties came from setting up Postgres with docker-compose, which I didn't have any authoring experience with. I considered sidestepping the problem by using a hosted Postgres instance (the path of least resistance), but I thought it might be against the spirit of the exercise (struggling with novel problems). I made it work in the end, though it could probably be more elegant. I didn't have a second machine to test it on, so I hope it works 🤞. If not, let me know and I will try to fix it or host the application somewhere.
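To illustrate the storage and rendering notes above, here is a minimal sketch, not the code from this repository. It assumes the image URLs are serialized as JSON (the actual format may differ), and the `FlatRow` shape, the `image_urls` and `id` column names, and all function names are hypothetical.

```typescript
// Hypothetical row shape; the real column names in the repo may differ.
interface FlatRow {
  title: string;
  image_urls: string; // all image URLs serialized into one string
}

// Assumption: JSON is used as the serialization format.
function serializeImageUrls(urls: string[]): string {
  return JSON.stringify(urls);
}

function parseImageUrls(serialized: string): string[] {
  const parsed: unknown = JSON.parse(serialized);
  if (!Array.isArray(parsed) || !parsed.every((u) => typeof u === "string")) {
    throw new Error("image_urls is not a JSON array of strings");
  }
  return parsed;
}

// Translate a 1-based page number into LIMIT/OFFSET values for a query such as
//   SELECT title, image_urls FROM flats ORDER BY id LIMIT $1 OFFSET $2
// (the id column is an assumption).
function pageToLimitOffset(page: number, pageSize: number): { limit: number; offset: number } {
  const safePage = Number.isInteger(page) && page > 0 ? page : 1;
  return { limit: pageSize, offset: (safePage - 1) * pageSize };
}

// Escape scraped text before placing it into hand-written HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Render one flat as an HTML fragment, showing only its first image.
function renderFlat(row: FlatRow): string {
  const [firstImage] = parseImageUrls(row.image_urls);
  const title = escapeHtml(row.title);
  const img = firstImage ? `<img src="${escapeHtml(firstImage)}" alt="${title}">` : "";
  return `<article><h2>${title}</h2>${img}</article>`;
}
```

Escaping matters when writing HTML by hand, since scraped titles are untrusted input. A separate images table would remove the need for the parse step entirely.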
- Carousel to see all available images.
- Location links to Google Maps.
- Basic sorting/filtering/search.
- Each flat has an individual view with more information.
- Make navigating between pages smoother (switch to client-side navigation after initial load).
- Make layout less jumpy as images come in.
- Improve accessibility.
- Pagination should show current page.
- Add a form to add apartments into the database.
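As one possible take on the "pagination should show current page" item, the sketch below renders page links and marks the active one with `aria-current="page"`, which also helps screen readers. The `renderPagination` name and the `/?page=N` URL scheme are assumptions, not taken from the existing app.

```typescript
// Render pagination links, marking the current page with aria-current="page".
// Out-of-range page numbers are clamped to the nearest valid page.
function renderPagination(currentPage: number, totalItems: number, pageSize: number): string {
  const totalPages = Math.max(1, Math.ceil(totalItems / pageSize));
  const current = Math.min(Math.max(1, currentPage), totalPages);
  const links: string[] = [];
  for (let page = 1; page <= totalPages; page++) {
    links.push(
      page === current
        ? `<a href="/?page=${page}" aria-current="page">${page}</a>`
        : `<a href="/?page=${page}">${page}</a>`
    );
  }
  return `<nav aria-label="Pagination">${links.join(" ")}</nav>`;
}
```

The current link can then be styled in CSS with the `a[aria-current="page"]` selector, so no extra class is needed.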