- Fetches the latest youtube videos for a predefined search query (cricket and football in this case; other search terms can also be added seamlessly) and stores the data in PostgreSQL and ElasticSearch. Videos are fetched every minute. All of this happens asynchronous.
- Rolling of API keys is supported. If the quota for one key is exhausted, then the application will use the next available key for which the quota has been not exhausted.
- An endpoint can be used for filtering and sorting. All of the responses are paginated.
- filtering: Videos can be searched using their title and description. Partial matching is supported, thanks to Elastic Search
- sorting: based on
published_at
attribute of the video.
DB: PostgreSQL, Elastic Search
API Server: Django, Django Rest Framework
-
Navigation to the root of the project where docker-compose.yml file is present.
-
Add data in the .env file
-
run
docker-compose up -d
Start up the project (see instructions tab)
Use postman or curl to hit the following endpoint. By default, the results are sorted in descending order of published_at
field in Video
model. By default, limit is set to 10 and offset to 0 (for pagination)
curl localhost:8000/feed/search/
Output
{
"count": 57,
"next": "http://localhost:8000/feed/search/?limit=10&offset=10",
"previous": null,
"results": [
{
"title": "VASAI LIVE FINAL CRICKET",
"unique_video_id": "LjVDN0beLnE",
"description": "",
"published_at": "2021-01-24T09:08:43Z",
"thumbnail_url": "https://i.ytimg.com/vi/LjVDN0beLnE/default.jpg",
"created_at": "2021-01-24T09:09:00.733238Z",
"updated_at": "2021-01-24T09:09:00.733260Z"
},
{
"title": "chiefofwaco's Live PS4 Broadcast",
"unique_video_id": "rPjurIuVSgM",
"description": "",
"published_at": "2021-01-24T09:08:04Z",
"thumbnail_url": "https://i.ytimg.com/vi/rPjurIuVSgM/default.jpg",
"created_at": "2021-01-24T09:09:00.635049Z",
"updated_at": "2021-01-24T09:09:00.635072Z"
},
....
....
....
]
}
You can change the sort order by specifying it in the url
To sort in the ascending order of the published_at field
http://localhost:8000/feed/search/?sort=published_at
To sort in the descending order of the published_at field (if you don't specify sort
param, this is the default)
http://localhost:8000/feed/search/?sort=-published_at
You can change the limit and offset (for pagination)
http://localhost:8000/feed/search/?limit=10&offset=10
Videos can be searched based on their title and description.
To search for the query term cricket
localhost:8000/feed/search/?q=cricket
Now you can combine all the following options to make complex queries
http://localhost:8000/feed/search/?q=cricket football&limit=10&offset=0&sort=-published_at
The following query
- searches for videos that matches the term
cricket football
- sorts the results in the descending order of the published_at field and
- returns the first 10 items.
You can easily add support for fetching videos for more search queries. Say, you want to also search youtube for 'basketball', then open docker-compose.yml
replace
cd src && python manage.py fetchvideos cricket football
with
cd src && python manage.py fetchvideos cricket football basketball
in backgroundworker
service.
All of the videos are fetched asynchronously. The non-asyncio code is run in a threadpool using aiocrontab library (Shameless flex: I wrote this library while learning asyncio)
- All the api keys might get exhaused. Solution is to add a new api key from a diff account.
web
service in docker-compose might restart multiple time initially. This is because by the time the container starts, ES is not ready to accept connections. This will gets fixed automatically.