I created this web app because I wanted an easy tool for comparing different car options.
The project was inspired by my own search for a car, during which I found that sites like Kelley Blue Book and Edmunds could offer better functionality for people who want to compare cars. Notably, 5-year cost of ownership was missing for many cars, and I would also like to easily view cost of ownership over 6, 7, 8, or any other number of years and compare specific expenses.
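As a rough illustration of the kind of comparison this project is aiming for, the sketch below totals ownership cost over an arbitrary number of years. Everything in it is an assumption made for illustration: the `OwnershipCosts` fields, the flat yearly costs, and the sample figures are hypothetical and do not come from this project's code or from any of its data sources.

```python
from dataclasses import dataclass


@dataclass
class OwnershipCosts:
    """Hypothetical cost inputs for one vehicle (illustrative only)."""
    purchase_price: float
    yearly_fuel: float
    yearly_insurance: float
    yearly_maintenance: float


def cost_of_ownership(costs: OwnershipCosts, years: int, resale_value: float = 0.0) -> float:
    """Total cost of owning a vehicle for `years` years.

    Simplified model: purchase price plus flat yearly running costs,
    minus the expected resale value at the end of the period.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    yearly = costs.yearly_fuel + costs.yearly_insurance + costs.yearly_maintenance
    return costs.purchase_price + years * yearly - resale_value


# Made-up numbers, only to show that any ownership period can be compared.
example = OwnershipCosts(
    purchase_price=30000.0,
    yearly_fuel=1500.0,
    yearly_insurance=1200.0,
    yearly_maintenance=800.0,
)
five_year = cost_of_ownership(example, years=5, resale_value=12000.0)
seven_year = cost_of_ownership(example, years=7, resale_value=9000.0)
```

A real calculation would likely also need depreciation curves, financing, taxes, and fees, none of which are modeled here.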
To start, I am working only on the backend and trying to connect as many openly available datasets as possible into one database, so that a better app can be built around them. The data sources used so far are:
- https://vpic.nhtsa.dot.gov/api/
- https://www.fueleconomy.gov/feg/ws/
- https://en.wikipedia.org/w/api.php/
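As a minimal sketch of how one of these sources can be queried, the example below builds a request URL for the NHTSA vPIC `GetModelsForMakeYear` endpoint and extracts model names from its JSON response. The helper names (`vpic_models_url`, `parse_model_names`, `fetch_model_names`) are hypothetical and not taken from this project, and the response fields (`Results`, `Model_Name`) reflect the vPIC API as I understand it, so check them against the API documentation linked above.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

VPIC_BASE = "https://vpic.nhtsa.dot.gov/api/vehicles"


def vpic_models_url(make: str, year: int) -> str:
    """Build the vPIC URL that lists models for one make and model year."""
    return f"{VPIC_BASE}/GetModelsForMakeYear/make/{quote(make)}/modelyear/{year}?format=json"


def parse_model_names(payload: dict) -> list[str]:
    """Return the sorted, de-duplicated model names from a vPIC response."""
    names = {
        row["Model_Name"].strip()
        for row in payload.get("Results", [])
        if row.get("Model_Name")  # skip rows with a missing or empty name
    }
    return sorted(names)


def fetch_model_names(make: str, year: int, timeout: float = 10.0) -> list[str]:
    """Query vPIC over the network and return model names (needs internet access)."""
    with urlopen(vpic_models_url(make, year), timeout=timeout) as response:
        return parse_model_names(json.load(response))
```

Keeping URL building and response parsing separate from the network call makes the parsing logic easy to test without hitting the API.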
First, install Docker and Docker Compose on your system. To verify the installation, run:
docker --version
docker compose version
To start the project, run:
docker compose up --build
After the project finishes building, create and apply the database migrations:
docker compose exec web python3 manage.py makemigrations
docker compose exec web python3 manage.py migrate
This command scrapes all manufacturers for the years 2023 to 2024 from fueleconomy.gov and populates the database with the results:
docker compose exec web python3 manage.py scrape_fueleconomy --scrape-type manufacturers --start-year 2023 --end-year 2024
This command scrapes manufacturer information from Wikipedia:
docker compose exec web python3 manage.py scrape_wikipedia --scrape-type manufacturers
This command scrapes all vehicle models for the various manufacturers for the years 2023 to 2024 from NHTSA and populates the database with the results:
docker compose exec web python3 manage.py scrape_nhtsa --scrape-type models --start-year 2023 --end-year 2024
This command populates vehicle types using NHTSA data:
docker compose exec web python3 manage.py scrape_nhtsa --scrape-type vehicle_types --start-year 2023 --end-year 2024
This command populates vehicle variations using fueleconomy.gov data:
docker compose exec web python3 manage.py scrape_fueleconomy --scrape-type variations --start-year 2023 --end-year 2024
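The sketch below runs the scrape commands above in the order they are listed, using Python's `subprocess` module. It only wraps the exact commands shown above; the script itself and its function names (`build_scrape_commands`, `run_all`) are hypothetical and not part of this project.

```python
import subprocess

# Common prefix of every management command shown above.
MANAGE = ["docker", "compose", "exec", "web", "python3", "manage.py"]


def build_scrape_commands(start_year: int, end_year: int) -> list[list[str]]:
    """Return the scrape commands in the order they appear in this README."""
    years = ["--start-year", str(start_year), "--end-year", str(end_year)]
    return [
        MANAGE + ["scrape_fueleconomy", "--scrape-type", "manufacturers"] + years,
        MANAGE + ["scrape_wikipedia", "--scrape-type", "manufacturers"],
        MANAGE + ["scrape_nhtsa", "--scrape-type", "models"] + years,
        MANAGE + ["scrape_nhtsa", "--scrape-type", "vehicle_types"] + years,
        MANAGE + ["scrape_fueleconomy", "--scrape-type", "variations"] + years,
    ]


def run_all(start_year: int, end_year: int) -> None:
    """Run every scrape step in sequence, stopping at the first failure."""
    for command in build_scrape_commands(start_year, end_year):
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError if a step exits with a non-zero status.
        subprocess.run(command, check=True)
```

With the containers already running, calling `run_all(2023, 2024)` would execute the same five commands listed above.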
The biggest limitation of this project appears to be access to public data. Most data sits behind paid APIs, and most free data sources are incomplete or out of date. I also still need a source of car sales data, or at least current car price data, to add to the site.
Planned work:
- Look into adding more manufacturer information using public Wikipedia data
- Determine where vehicle sales information can be obtained legally (ideally for free)
- Find a better data source for vehicle variations (the current public data source is missing a lot of 2024 car data)
- Add automation scripts to the Docker image so that publicly available API data is periodically refreshed
- Create API views for the data so that a front end can be built around it
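For the periodic refresh item in the list above, one simple approach would be a small loop that re-runs a scrape whenever the data is older than a chosen interval. The sketch below is only one possibility under stated assumptions: the function names, the weekly interval, and the idea of tracking a last-refresh timestamp in memory are all hypothetical, and a cron job or task queue could be used instead.

```python
import time
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

REFRESH_INTERVAL = timedelta(days=7)  # assumed interval, chosen only for illustration


def is_stale(
    last_refresh: Optional[datetime],
    now: datetime,
    interval: timedelta = REFRESH_INTERVAL,
) -> bool:
    """Return True if the data was never refreshed or is at least `interval` old."""
    if last_refresh is None:
        return True
    return now - last_refresh >= interval


def refresh_loop(
    refresh: Callable[[], None],
    poll_seconds: float = 3600.0,
    max_cycles: Optional[int] = None,
) -> int:
    """Call `refresh` whenever the data is stale; return how many refreshes ran.

    `max_cycles=None` loops forever; a finite value is handy for testing.
    """
    last_refresh: Optional[datetime] = None
    refreshed = 0
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        now = datetime.now(timezone.utc)
        if is_stale(last_refresh, now):
            refresh()
            last_refresh = now
            refreshed += 1
        cycle += 1
        # Sleep between polls, but not after the final cycle.
        if max_cycles is None or cycle < max_cycles:
            time.sleep(poll_seconds)
    return refreshed
```

Here `refresh` would be whatever function triggers the scrape commands. Because the timestamp lives only in memory, a restart of the process would trigger an immediate refresh; persisting it in the database would avoid that.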