Jarbas — a tool for Serenata de Amor
Jarbas is part of Serenata de Amor — we fight corruption with data science.
Jarbas is in charge of making data from CEAP more accessible. In the near future Jarbas will show what Rosie thinks of each reimbursement made for our congresspeople.
Each Reimbursement object is a reimbursement claim made by a congressperson. Each reimbursement is identified by a unique combination of year, applicant_id and document_id.
Details of a specific reimbursement. If receipt_url hasn't been fetched yet, the server won't try to fetch it.
URL of the digitalized version of the receipt of this specific reimbursement.
If receipt_url hasn't been fetched yet, the server will try to fetch it.
If you append the parameter force (i.e. GET /api/reimbursement/<year>/<applicant_id>/<document_id>/receipt/?force) the server will re-fetch the receipt URL.
Not all receipts are available, so this URL can be null.
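As a minimal sketch of how a client might build the receipt URL described above, the helper below formats the path from the three identifying fields and optionally appends force. The function name receipt_path is hypothetical and not part of Jarbas.

```python
def receipt_path(year, applicant_id, document_id, force=False):
    """Build the receipt endpoint path for one reimbursement.

    Hypothetical client-side helper; only the URL layout comes from the README.
    """
    path = "/api/reimbursement/{}/{}/{}/receipt/".format(
        year, applicant_id, document_id
    )
    # Appending ?force asks the server to re-fetch the receipt URL.
    return path + "?force" if force else path
```

Since not all receipts are available, a caller should be prepared for the returned URL field to be null.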
Lists all reimbursements.
Lists all reimbursements from a specific year.
Lists all reimbursements from a specific year and applicant_id.
All these endpoints accept any combination of the following parameters:
- applicant_id
- cnpj_cpf
- document_id
- issue_date_start (inclusive)
- issue_date_end (exclusive)
- month
- subquota_id
- year
- order_by: issue_date (default) or probability (both descending)
For example:
GET /api/reimbursement/2016/?cnpj_cpf=11111111111111&subquota_id=42&order_by=probability
This request will list:
- all 2016 reimbursements
- made to the supplier with the CNPJ 11.111.111/1111-11
- made under the subquota with the ID 42
- sorted by probability, highest first
You can also pass more than one value per field (e.g. document_id=111111,222222).
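The filtering rules above can be sketched as a small URL builder. This is an illustrative client-side helper (reimbursement_list_path is a hypothetical name), assuming that multiple values for one field are joined with commas as in the example.

```python
from urllib.parse import urlencode


def reimbursement_list_path(year, **filters):
    """Build a list URL for one year with optional filters.

    Hypothetical helper: lists and tuples become comma-separated values.
    """
    params = []
    for name, value in sorted(filters.items()):
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        params.append((name, value))

    path = "/api/reimbursement/{}/".format(year)
    if params:
        # Keep commas unescaped so multi-value fields stay readable.
        path += "?" + urlencode(params, safe=",")
    return path
```

For instance, reimbursement_list_path(2016, document_id=[111111, 222222]) yields a path ending in ?document_id=111111,222222. Parameters are emitted in alphabetical order, which does not change the meaning of the request.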
Subquotas are categories of expenses for which congresspeople can claim reimbursement.
Lists all subquota names and IDs.
Accepts a case-insensitive LIKE filter as the q URL parameter (e.g. GET /api/subquota/?q=meal lists all subquotas that have meal in their names).
An applicant is the person (congressperson or the leadership of a party or government) who claimed the reimbursement.
Lists all names of applicants together with their IDs.
Accepts a case-insensitive LIKE filter as the q URL parameter (e.g. GET /api/applicant/?q=lideranca lists all applicants that have lideranca in their names).
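The q filter works the same way for subquotas and applicants, so a single helper can cover both. The sketch below is a hypothetical client-side function (search_path is not part of Jarbas) that URL-encodes the search term.

```python
from urllib.parse import urlencode


def search_path(resource, q=None):
    """Build a subquota or applicant list URL with an optional q filter.

    Hypothetical helper; resource must be "subquota" or "applicant".
    """
    if resource not in ("subquota", "applicant"):
        raise ValueError("unsupported resource: {!r}".format(resource))
    path = "/api/{}/".format(resource)
    if q:
        # urlencode escapes spaces and non-ASCII characters in the term.
        path += "?" + urlencode({"q": q})
    return path
```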
A company is a Brazilian company where congresspeople have made expenses for which they claimed reimbursement.
This endpoint gets the info we have for a specific company. The endpoint expects a cnpj (i.e. the CNPJ of a Company object, digits only). It returns 404 if the company is not found.
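Because the endpoint expects digits only, a formatted CNPJ such as 11.111.111/1111-11 needs to be normalized first. A minimal sketch, assuming a hypothetical helper named cnpj_digits that only strips punctuation and checks the length (a CNPJ has 14 digits), without validating check digits:

```python
import re


def cnpj_digits(cnpj):
    """Return the 14 digits of a CNPJ, dropping dots, slashes and dashes.

    Hypothetical helper; it does not verify the CNPJ check digits.
    """
    digits = re.sub(r"\D", "", cnpj)
    if len(digits) != 14:
        raise ValueError("a CNPJ must have 14 digits, got {}".format(len(digits)))
    return digits
```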
There is also a tapioca-wrapper for the API. tapioca-jarbas can be installed with pip install tapioca-jarbas and can be used to access the API in any Python script.
If you run into issues with settings, the settings section of this document may help.
The best way to get started is by copying contrib/.env.sample as .env:
cp contrib/.env.sample .env
If you have Docker (with Docker Compose) and make, just run:
make run.devel
or
docker-compose up -d --build
docker-compose run --rm jarbas python manage.py migrate
docker-compose run --rm jarbas python manage.py ceapdatasets
docker-compose run --rm jarbas python manage.py collectstatic --no-input
You can access it at localhost:8000. Your database starts empty, but you can load sample data for development using this command:
make seed.sample
or
docker-compose run --rm jarbas python manage.py reimbursements contrib/sample-data/reimbursements_sample.xz
docker-compose run --rm jarbas python manage.py companies contrib/sample-data/companies_sample.xz
docker-compose run --rm jarbas python manage.py irregularities contrib/sample-data/irregularities_sample.xz
You can get the datasets by running Rosie or directly with the toolbox.
Jarbas requires Python 3.5, Yarn, and PostgreSQL 9.4+.
Once you have pip and yarn available, install the dependencies:
yarn install
python -m pip install -r requirements.txt
In some Linux distros lzma is not installed by default. You can check whether you have it with $ python -m lzma. In Debian-based systems you can fix that with $ apt-get install liblzma-dev, or in macOS with $ brew install xz, but you might have to re-compile your Python afterwards.
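The same check can be done from inside Python. A minimal sketch, using a hypothetical helper named has_lzma that simply tries to import the standard library module:

```python
import importlib


def has_lzma():
    """Return True if this Python build can import the lzma module."""
    try:
        importlib.import_module("lzma")
    except ImportError:
        # Python was built without liblzma headers available.
        return False
    return True


if __name__ == "__main__":
    print("lzma available" if has_lzma() else "lzma missing")
```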
Copy contrib/.env.sample as .env in the project's root folder and adjust your settings. These are the main variables:
- DEBUG (bool) enable or disable Django debug mode
- SECRET_KEY (str) Django's secret key
- ALLOWED_HOSTS (str) Django's allowed hosts
- USE_X_FORWARDED_HOST (bool) Whether to use the X-Forwarded-Host header
- CACHE_BACKEND (str) Cache backend (e.g. django.core.cache.backends.memcached.MemcachedCache)
- CACHE_LOCATION (str) Cache location (e.g. localhost:11211)
- SECURE_PROXY_SSL_HEADER (str) Django secure proxy SSL header (e.g. HTTP_X_FORWARDED_PROTO,https is transformed into the tuple ('HTTP_X_FORWARDED_PROTO', 'https'))
- DATABASE_URL (str) Database URL, must be PostgreSQL since Jarbas uses JSONField
- AMAZON_S3_BUCKET (str) Name of the Amazon S3 bucket to look for datasets (e.g. serenata-de-amor-data)
- AMAZON_S3_REGION (str) Region of the Amazon S3 (e.g. s3-sa-east-1)
- AMAZON_S3_CEAPTRANSLATION_DATE (str) File name prefix for dataset guide (e.g. 2016-08-08 for 2016-08-08-ceap-datasets.md)
- GOOGLE_ANALYTICS (str) Google Analytics tracking code (e.g. UA-123456-7)
- GOOGLE_STREET_VIEW_API_KEY (str) Google Street View Image API key
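To illustrate the SECURE_PROXY_SSL_HEADER conversion mentioned above, here is a minimal sketch of turning the comma-separated string into the tuple Django expects. The function parse_proxy_ssl_header is hypothetical; the project's settings module may do this differently.

```python
def parse_proxy_ssl_header(value):
    """Split 'HEADER,scheme' into the (header, scheme) tuple used by Django.

    Hypothetical helper, not taken from the Jarbas code base.
    """
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 2 or not all(parts):
        raise ValueError("expected 'HEADER,scheme', got {!r}".format(value))
    return tuple(parts)
```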
Once you're done with requirements, dependencies and settings, create the basic database structure:
$ python manage.py migrate
Now you can load the data from our datasets and get some other data as static files:
$ python manage.py reimbursements <path to reimbursements.xz>
$ python manage.py irregularities <path to irregularities.xz>
$ python manage.py companies <path to companies.xz>
$ python manage.py ceapdatasets
You can get the datasets by running Rosie or directly with the toolbox.
We generate assets through NodeJS, so build them before Django collects the static files:
$ yarn assets
$ python manage.py collectstatic
Not sure? Test it!
$ python manage.py check
$ python manage.py test
$ yarn test
Run the server with $ python manage.py runserver and load localhost:8000 in your favorite browser.