Datamole test assignment: implementation of an API that monitors GitHub repositories
- http://127.0.0.1:8000/docs - Swagger documentation
- http://127.0.0.1:8000/health - Use this endpoint to check the health of the application
- http://127.0.0.1:8000/statistics - Provides basic statistics for the repositories passed in the JSON body under the repositories key as a list of strings.
Have a look at the Swagger documentation for more details on the endpoints.
To report statistics, you specify in the request body which repositories to monitor.
To get an idea of what this looks like, you can run the following in the terminal:
curl -X 'POST' \
'http://127.0.0.1:8000/statistics' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"repositories": [
"https://github.com/psf/requests",
"https://github.com/tiangolo/fastapi",
"https://github.com/astral-sh/ruff",
"https://github.com/pre-commit/pre-commit"
]
}'
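The same request can also be built from Python. The following is a minimal sketch using only the standard library; the helper names build_statistics_request and fetch_statistics are hypothetical and not part of the project, and the URL assumes the application is running locally on port 8000 as in the curl example.

```python
import json
import urllib.request

# Assumed local address, taken from the curl example above.
STATISTICS_URL = "http://127.0.0.1:8000/statistics"


def build_statistics_request(repositories, url=STATISTICS_URL):
    """Build a POST request mirroring the curl example (hypothetical helper)."""
    body = json.dumps({"repositories": list(repositories)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"accept": "application/json", "Content-Type": "application/json"},
    )


def fetch_statistics(repositories, url=STATISTICS_URL):
    """Send the request and decode the JSON response.

    This only works while the application is running at the given URL.
    """
    request = build_statistics_request(repositories, url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling fetch_statistics(["https://github.com/psf/requests"]) would then return the decoded response body; its exact shape is described in the Swagger documentation.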
To specify the repositories, please use only links to public GitHub repositories in their common web form (not the API form), such as:
https://github.com/michtesar/repo-monitor
Both the URL itself and the number of repositories are validated.
The maximum number of repositories can be changed in config.py.
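The two checks described above could look roughly like the following sketch. It is not the project's actual validation code: the function name validate_repositories and the constant MAX_REPOSITORIES are hypothetical, and the limit of five is an assumption based on the assignment text (the real value lives in config.py).

```python
from urllib.parse import urlparse

# Assumed limit; the real value is configured in config.py.
MAX_REPOSITORIES = 5


def validate_repositories(urls, max_repositories=MAX_REPOSITORIES):
    """Return (owner, name) pairs, or raise ValueError for invalid input."""
    if len(urls) > max_repositories:
        raise ValueError(f"At most {max_repositories} repositories are allowed")

    repositories = []
    for url in urls:
        parsed = urlparse(url)
        parts = [part for part in parsed.path.split("/") if part]
        # Accept only the common web form: https://github.com/<owner>/<name>
        if parsed.scheme != "https" or parsed.netloc != "github.com" or len(parts) != 2:
            raise ValueError(f"Not a GitHub repository URL: {url}")
        repositories.append((parts[0], parts[1]))
    return repositories
```

With this sketch, an API-style link such as https://api.github.com/repos/psf/requests is rejected because its host is not github.com.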
FastAPI is used as the backend for the application. If this were a production
application, we would consider implementing routers to enable v1/api or v2/api style routes.
So far, the application opens on the root route (/), which does not exist. Later we would want to implement a frontend or a router dict for the specified API versions.
To query the GitHub API, the requests-cache library is used to cache requests
for 5 minutes in an SQLite database. Again, in production we would want a /settings endpoint
with authentication that would enable admins or users to change this. Also, it would be better to use
a server-based database such as PostgreSQL, if the caching library offers a backend for it.
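To illustrate the idea of a five-minute cache that survives restarts, here is a stdlib-only sketch built on sqlite3. It is deliberately not the API of the caching library used by the project; the names SqliteTtlCache and cached_fetch are hypothetical, and the fetch callable stands in for the real HTTP call.

```python
import sqlite3
import time

CACHE_TTL_SECONDS = 5 * 60  # five minutes, as described above


class SqliteTtlCache:
    """Tiny key/value cache with expiry (illustrative only)."""

    def __init__(self, path=":memory:", ttl=CACHE_TTL_SECONDS, clock=time.time):
        # Pass a file path instead of ":memory:" to keep data across restarts.
        self._db = sqlite3.connect(path)
        self._ttl = ttl
        self._clock = clock
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, body TEXT NOT NULL, stored_at REAL NOT NULL)"
        )

    def get(self, key):
        """Return the cached body, or None when it is missing or expired."""
        row = self._db.execute(
            "SELECT body, stored_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None or self._clock() - row[1] >= self._ttl:
            return None
        return row[0]

    def set(self, key, body):
        """Store or overwrite the body for this key with the current time."""
        self._db.execute(
            "INSERT OR REPLACE INTO cache (key, body, stored_at) VALUES (?, ?, ?)",
            (key, body, self._clock()),
        )
        self._db.commit()


def cached_fetch(cache, url, fetch):
    """Return the cached body for url, calling fetch(url) only on a miss."""
    body = cache.get(url)
    if body is None:
        body = fetch(url)
        cache.set(url, body)
    return body
```

The injectable clock is a design choice made for this sketch so that expiry can be exercised without waiting five minutes.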
For a development run, please create a Python virtual environment with Python 3.12:
python3.12 -m venv venv
source venv/bin/activate
Then, install all production dependencies with:
pip install -r requirements.txt
To run the application, please use the Makefile targets. To see all
the options for development and running, run: make help. If you just run make,
everything is set up for you automatically:
- Clean the repository
- Install all packages
- Set up the formatter, then format, lint, and check the code
- Run all tests
- Run the application
If you only want to run the application, you can either use the Makefile target
make run
which runs the application on localhost on port 8000, the default port
for FastAPI applications,
or you can use uvicorn
to run the application the way you want (e.g., to set the port):
uvicorn repo_monitor.main:app --port 5000
Then, navigate to:
In production, a Docker container is highly recommended.
Please use the Dockerfile included in the repository, or use the Makefile targets:
- make docker/build - Build a Docker image named repo-monitor
- make docker/run - Run the image in a container named repo-monitor
With Docker, port 5000
is used and exposed, so please navigate to:
The following table, generated with pip-licenses, lists all the libraries used,
including their dependencies, along with their versions and licenses:
Name | Version | License |
---|---|---|
annotated-types | 0.6.0 | MIT License |
anyio | 4.3.0 | MIT License |
attrs | 23.2.0 | MIT License |
cattrs | 23.2.3 | MIT License |
certifi | 2024.2.2 | Mozilla Public License 2.0 (MPL 2.0) |
charset-normalizer | 3.3.2 | MIT License |
click | 8.1.7 | BSD License |
fastapi | 0.110.3 | MIT License |
h11 | 0.14.0 | MIT License |
idna | 3.7 | BSD License |
platformdirs | 4.2.1 | MIT License |
pydantic | 2.7.1 | MIT License |
pydantic-settings | 2.2.1 | MIT License |
pydantic_core | 2.18.2 | MIT License |
python-dotenv | 1.0.1 | BSD License |
requests | 2.31.0 | Apache Software License |
requests-cache | 1.2.0 | BSD License |
six | 1.16.0 | MIT License |
sniffio | 1.3.1 | Apache Software License; MIT License |
starlette | 0.37.2 | BSD License |
typing_extensions | 4.11.0 | Python Software Foundation License |
url-normalize | 1.4.3 | MIT License |
urllib3 | 2.2.1 | MIT License |
uvicorn | 0.29.0 | BSD License |
As can be seen, only free and open-source libraries were used.
bandit reported no security issues for the distributed packages. Note that the report
below shows 0 lines of code scanned, so the scan target should be double-checked:
Run started:2024-05-02 13:07:53.668476
Test results:
No issues identified.
Code scanned:
Total lines of code: 0
Total lines skipped (#nosec): 0
Run metrics:
Total issues (by severity):
Undefined: 0
Low: 0
Medium: 0
High: 0
Total issues (by confidence):
Undefined: 0
Low: 0
Medium: 0
High: 0
Files skipped (0):
The objective of this assignment is to track activities on GitHub. To achieve this, utilize the GitHub Events API.
The application can monitor up to five configurable repositories. It generates statistics based on a rolling window of either 7 days or 500 events, whichever is less. These statistics are made available to end-users via a REST API. Specifically, the API will show the average time between consecutive events, separately for each combination of event type and repository name.
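The statistic described above can be sketched as follows. This is an illustration under stated assumptions, not the submitted implementation: the function name average_event_gaps and the (repo, event_type, created_at) tuple layout are hypothetical, and the 500-event limit is assumed to apply per repository.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=7)
MAX_EVENTS = 500  # assumed to apply per repository


def average_event_gaps(events, now):
    """Return {(repo, event_type): mean seconds between consecutive events}.

    `events` is an iterable of (repo, event_type, created_at) tuples.
    """
    by_repo = defaultdict(list)
    for repo, event_type, created_at in events:
        by_repo[repo].append((created_at, event_type))

    averages = {}
    for repo, repo_events in by_repo.items():
        # Newest first; keep at most MAX_EVENTS that are no older than WINDOW.
        repo_events.sort(key=lambda item: item[0], reverse=True)
        recent = [
            (created_at, event_type)
            for created_at, event_type in repo_events[:MAX_EVENTS]
            if now - created_at <= WINDOW
        ]

        by_type = defaultdict(list)
        for created_at, event_type in recent:
            by_type[event_type].append(created_at)

        for event_type, times in by_type.items():
            if len(times) < 2:
                continue  # a gap needs at least two events
            # times is newest-first; mean gap = total span / number of gaps.
            span = (times[0] - times[-1]).total_seconds()
            averages[(repo, event_type)] = span / (len(times) - 1)
    return averages
```

Dividing the total span by the number of gaps gives the same result as averaging the individual gaps, because consecutive differences sum to the span.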
The application should minimize requests to the GitHub API and retain data through application restarts.
Please include a README file with your solution detailing the steps to run the application and a brief outline of your assumptions. Also, include reasonable documentation for fellow engineers and API users.
The assignment should be completed in Python.
It should take no more than 8 hours of cumulative work to finish. Please submit the best solution you can deliver within this time frame. If you do not manage to finish the solution fully, please report which parts are missing and sketch possible future work.
Please hand in your solution within 14 days as a ZIP file with a Git repository.