A tool to manage identities.
Sorting Hat maintains an SQL database of unique identities of communities members across (potentially) many different sources. Identities corresponding to the same real person can be merged in the same individual
, with a unique uuid. For each individual, a profile can be defined, with the name and other data shown for the corresponding person by default.
In addition, each individual can be related to one or more affiliations, for different time periods. This will usually correspond to different organizations in which the person was employed during those time periods.
Sorting Hat is a part of the GrimoireLab toolset, which provides Python modules and scripts to analyze data sources with information about software development, and allows the production of interactive dashboards to visualize that information.
In the context of GrimoireLab, Sorting Hat is usually run after data is retrieved with Perceval, to store the identities obtained into its database, and later merge them into individuals (and maybe affiliate them).
- Python >= 3.8
- Poetry >= 1.1.0
- MySQL >= 8.1 or MariaDB >= 10.4
- Django = 4.2
- Graphene-Django >= 2.0
- uWSGI >= 2.0
You will also need some other libraries for running the tool, you can find the whole list of dependencies in pyproject.toml file.
To install from the source code you will need to clone the repository first:
$ git clone https://github.com/chaoss/grimoirelab-sortinghat
$ cd grimoirelab-sortinghat
We use Poetry for managing the project. You can install it following these steps.
Before you install SortingHat tool you might need to install mysql_config
command. If you are using a Debian based distribution, this command can be
found either in libmysqlclient-dev
or libmariadbclient-dev
packages
(depending on if you are using MySQL or MariaDB database server). You can
install these packages in your system with the next commands:
- MySQL
$ apt install libmysqlclient-dev
- MariaDB
$ apt install libmariadbclient-dev-compat
Note: these examples use sortinghat.config.settings
configuration file.
In order to use that configuration you need to define the environment variable
SORTINGHAT_SECRET_KEY
with a secret. More info
here.
Install the required dependencies (this will also create a virtual environment).
$ poetry install
Activate the virtual environment:
$ poetry shell
Database creation, apply migrations and fixtures, deploy static files, and create a superuser:
(.venv)$ sortinghat-admin --config sortinghat.config.settings setup
Run SortingHat backend Django app:
(.venv)$ ./manage.py runserver --settings=sortinghat.config.settings
To compile and run the frontend you will need to install yarn
first.
The latest versions of yarn
can only be installed with npm
- which
is distributed with NodeJS.
When you have npm
installed, then run the next command to install yarn
on the system:
npm install -g yarn
Check the official documentation for more information.
Install the required dependencies
$ cd ui/
$ yarn install
Run SortingHat backend Django app:
(.venv)$ ./manage.py runserver --settings=config.settings.devel
Build the frontend and watch for changes:
$ yarn watch --api_url=http://localhost:8000/api/ --publicpath="/static/" --mode development
Starting at version 0.8, SortingHat is released with a server app. The server has two
modes, production
and development
.
When production
mode is active, a WSGI app is served. The idea is to use a reverse
proxy like NGINX or similar, that will be connected with the WSGI app to provide
an interface HTTP.
When development
mode is active, an HTTP server is launched, so you can interact
directly with SortingHat using HTTP requests. Take into account this mode is not
suitable nor safe for production.
You will need a django configuration file to run the service. The file must be accessible
via PYTHONPATH
env variable. You can use the one delivered within the SortingHat
package (stored in sortinghat/config
folder) and modify it with your parameters.
Following examples will make use of that file.
In order to run the service for the first time, you need to execute the next commands:
Build the UI interface:
$ cd ui
$ yarn install
$ yarn build --mode development
If you want to run the UI at /identities
run (you need to use the server
behind a proxy server):
$ yarn build
Set a secret key:
$ export SORTINGHAT_SECRET_KEY="my-secret-key"
Set up the service creating a database, deploying static files, and adding a superuser to access the app:
$ sortinghat-admin --config sortinghat.config.settings setup
Run the server (use --dev
flag for development
mode):
$ sortinghatd --config sortinghat.config.settings
By default, this runs a WSGI server in 127.0.0.1:9314
. The --dev
flag runs
a server in 127.0.0.1:8000
.
You will also need to run some workers to execute tasks like recommendations or affiliation. To start a worker run the command:
$ sortinghatw --config sortinghat.config.settings
To start a worker that processes jobs from a set of tenants when
dedicated_queue
is active (see below)
use the next command:
$ sortinghatw --config sortinghat.config.settings tenant_A tenant_B
To create new accounts for SortingHat use the following command:
(.venv)$ sortinghat-admin create-user
Usage: sortinghat-admin create-user [OPTIONS]
Create a new user given a username and password
Options:
--username TEXT Specifies the login for the user.
--is-admin Specifies if the user is superuser.
--no-interactive Run the command in no interactive mode.
SortingHat 0.7.x is no longer supported. Any database using this version will not work.
SortingHat databases 0.7.x are no longer compatible. The uidentities
table was renamed
to individuals
. The database schema changed in all tables to add the fields created_at
and last_modified
. Also in domains
, enrollments
, identities
, profiles
tables,
there are some specific changes to the column names:
domains
organization_id
toorganization
enrollments
organization_id
toorganization
uuid
toindividual
identities
uuid
toindividual
profiles
country_code
tocountry
uuid
toindividual
Please update your database running the following command:
$ sortinghat-admin --config sortinghat.config.settings migrate-old-database
SortingHat allows hosting multiple instances with a single service having each instance's data isolated in different databases.
To enable this feature follow these guidelines:
-
Set
MULTI_TENANT
settings toTrue
. -
Define a list of tenants using the configuration file
sortinghat/config/tenants.json
. You can use a different json file using the environment variableSORTINGHAT_MULTI_TENANT_LIST_PATH
. The file should have the next schema:{ "tenants": [ {"name": "tenant A", "dedicated_queue": true}, {"name": "tenant B", "dedicated_queue": false} ] }
Where
name
is the name of each tenant anddedicated_queue
is a boolean value to set whether jobs will be run on a specific queue with the same tenant name. -
Assign users to tenants with the following command:
sortinghat-admin set-user-tenant username header tenant
-
The selected tenant should be included in the request using the
sortinghat-tenant
header.
There are some limitations:
default
database is only used to store users information and relations between users and databases, it won't store anything else related with SortingHat models.- Usernames are shared across all instances, which means that it is not possible to have the same username with two different passwords in different instances.
- Tenants with
dedicated_queue
set as active will add their jobs to the queue of the same name. Queues will be created by SortingHat but, you will have to run a worker that processes that query.
SortingHat comes with a comprehensive list of unit tests for both frontend and backend.
(.venv)$ ./manage.py test --settings=config.settings.config_testing
(.venv)$ ./manage.py test --settings=config.settings.config_testing_tenant
$ cd ui/
$ yarn test:unit
Licensed under GNU General Public License (GPL), version 3 or later.