The selected base Word2Vec gensim model is taken from Materials Science Word Embeddings.
It is a Word2Vec model trained on 640k+ materials science journal articles.
The model files are placed in models/base.
The data is preprocessed using the Jupyter notebook Notebooks/0. Preprocess data.ipynb.
The dataset generated by the notebook is stored in data/preprocessed_dataset.pickle.
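The exact structure of the pickled dataset is not documented here; a list of tokenized sentences (the input format gensim's Word2Vec expects) is assumed for this illustrative round-trip sketch, which writes to a temporary path rather than the real data/preprocessed_dataset.pickle:

```python
import os
import pickle
import tempfile

# Assumed structure: a list of tokenized sentences, as Word2Vec consumes.
dataset = [["thermal", "conductivity"], ["band", "gap", "energy"]]

# Write and read back the pickle, mirroring how the notebook's output
# (data/preprocessed_dataset.pickle) would later be loaded for training.
path = os.path.join(tempfile.mkdtemp(), "preprocessed_dataset.pickle")
with open(path, "wb") as f:
    pickle.dump(dataset, f)

with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded == dataset)  # True
```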
Models are trained from the pre-trained base model (see Base Model) with a given set of hyperparameters.
The newly trained models are stored in models/.
pip install -r requirements.txt
cd w2v_service
python manage.py makemigrations
python manage.py migrate
python manage.py runserver
Or use the provided Dockerfile.
Description: Add hyperparameter(s) for re-training model(s)
Note: Only one combination of unique hyperparameters is stored in the database.
Note: If any of the supplied parameters is a list, the hyperparameter search space is expanded and all possible combinations of the given hyperparameters are created.
Parameters:
- start_alpha: float or comma separated list of floats
- end_alpha: float or comma separated list of floats
- epochs: int or comma separated list of ints
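The expansion rule above can be sketched as follows. This is an illustration only: the helper name and the sample values are assumptions, not the service's actual code.

```python
from itertools import product

def parse_values(raw, cast):
    """Parse a scalar or comma-separated list into a list of values."""
    return [cast(v) for v in str(raw).split(",")]

# Two start_alpha values, one end_alpha, two epochs values.
params = {
    "start_alpha": parse_values("0.025,0.01", float),
    "end_alpha": parse_values("0.0001", float),
    "epochs": parse_values("5,10", int),
}

# Expand the search space: every combination of the supplied values
# becomes one unique hyperparameter set.
combinations = [
    dict(zip(params, values)) for values in product(*params.values())
]

print(len(combinations))  # 4 hyperparameter sets (2 x 1 x 2)
```

Since only unique combinations are stored, submitting the same values twice would not add new rows to the database.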
List all hyperparameters stored in the database
Triggers the training of models for all hyperparameters
Triggers the training of a single set of hyperparameters given by id
List all training sessions
Show details of the training session for a given id
Shows statistics for the models trained per hyperparameter set.
Filters:
- start_alpha (float): shows information for a specific start_alpha value
- end_alpha (float): shows information for a specific end_alpha value
- epochs (int): shows information for a specific epochs value

Note: The provided information is only for successful training sessions.
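A filtered statistics request can be built as a query string. The host, port, and endpoint path below are assumptions (the README lists only the filter names); only the filter parameters come from the list above:

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust host/path to your deployment.
base_url = "http://localhost:8000/statistics/"

# Filter to one specific hyperparameter combination.
filters = {"start_alpha": 0.025, "end_alpha": 0.0001, "epochs": 5}

url = base_url + "?" + urlencode(filters)
print(url)
```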