This repository has migrated to the https://github.com/giganticode organization and is no longer supported.

log-recommender

This is the project for a master's thesis titled "Supporting logging activities by mining software repositories".

General Goal

The general goal is to use a large number of projects from GitHub to build a model that takes source code as input and suggests different kinds of logging-related information (e.g. where in the code to put a logging statement, the text of the logging statement, the log level, etc.).

Steps

Data gathering

We use the dataset from "Mining source code repositories at massive scale using language modeling" by M. Allamanis and C. Sutton.

Statistics about the dataset: TBA

Data gathering in more detail

Data preprocessing

In this step, the data is prepared for language modelling (tokenization and reduction of the vocabulary size).

Data preprocessing in more detail
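
The two preprocessing ideas named above can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the regexes, the function names (`tokenize`, `split_identifier`, `preprocess`, `limit_vocabulary`) and the `<unk>` placeholder are all assumptions made for illustration. It splits source code into tokens, breaks identifiers into lower-cased subtokens, and maps everything outside the most frequent tokens to a single placeholder.

```python
import re
from collections import Counter

# Hypothetical patterns, chosen for illustration only.
# A token is an identifier, a number, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+|\S")
# A subtoken is an acronym (HTTP), a word (Server, name), or a digit run.
SUBTOKEN_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def tokenize(code):
    """Split raw source code into coarse tokens."""
    return TOKEN_RE.findall(code)


def split_identifier(token):
    """Split camelCase / snake_case identifiers into lower-cased subtokens.

    Tokens with no letters or digits (e.g. punctuation) are returned as-is.
    """
    parts = SUBTOKEN_RE.findall(token)
    return [p.lower() for p in parts] or [token]


def preprocess(code):
    """Tokenize code and split every token into subtokens."""
    result = []
    for token in tokenize(code):
        result.extend(split_identifier(token))
    return result


def limit_vocabulary(tokens, max_size, unk="<unk>"):
    """Keep the max_size most frequent tokens; replace the rest with unk."""
    kept = {tok for tok, _ in Counter(tokens).most_common(max_size)}
    return [tok if tok in kept else unk for tok in tokens]


if __name__ == "__main__":
    print(preprocess("logger.info(userName);"))
```

Splitting identifiers lets rare names such as `getUserName` share subtokens with other identifiers, which is one common way to keep the vocabulary of a language model over source code manageable on large corpora.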

Language modelling

We train language models with different architectures and parameters, then analyse and compare their performance.

Language modelling in more detail
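
This README does not say which metric is used to compare models. As an assumption, the sketch below uses perplexity, a common choice for language models: the exponential of the average negative log-likelihood per token. The function name `perplexity` and its input format (a list of natural-log token probabilities) are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    Assumed input: one log-probability per token of a held-out sequence,
    as assigned by the language model being evaluated.
    """
    if not token_log_probs:
        raise ValueError("at least one token log-probability is required")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # A model that gives every token probability 0.25 behaves like a
    # uniform choice among four options.
    print(perplexity([math.log(0.25)] * 8))
```

Lower perplexity on the same held-out data and the same vocabulary indicates a better model. Values are not comparable across models that tokenize the code differently.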

Building classifier

Based on a pretrained language model, we build classifiers that are trained to predict the correct position of log statements in the code, as well as their level, text, and variables.

Building classifier in more detail.
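
To make the prediction targets concrete, here is a rough sketch of how labelled examples for position and level could be derived from existing Java code. It is an assumption for illustration, not the project's method: the regex, the logger names it recognizes, and the function `extract_log_examples` are all hypothetical, and only single-line log calls are handled.

```python
import re

# Hypothetical pattern: a call such as LOG.info(...) or logger.error(...).
LOG_CALL_RE = re.compile(
    r"\b(?:log|logger)\.(trace|debug|info|warn|error)\s*\(",
    re.IGNORECASE,
)


def extract_log_examples(java_source):
    """Strip log statements from code and record where they were.

    Returns (context, labels): context is the list of source lines without
    log statements; each label is (position, level), where position is the
    index in context before which the log statement originally appeared.
    """
    context, labels = [], []
    for line in java_source.splitlines():
        match = LOG_CALL_RE.search(line)
        if match:
            labels.append((len(context), match.group(1).lower()))
        else:
            context.append(line)
    return context, labels


if __name__ == "__main__":
    source = (
        "void save(User u) {\n"
        '    LOG.info("saving");\n'
        "    db.store(u);\n"
        '    logger.error("failed");\n'
        "}"
    )
    context, labels = extract_log_examples(source)
    print(context)
    print(labels)
```

A classifier would then receive the stripped code as input and be trained to recover the recorded positions and levels. Predicting the message text and logged variables would need the arguments of each call as additional labels.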

IntelliJ plugin building

The plugin supports developers by helping them make logging decisions.

IntelliJ plugin building in more detail.

Supporting activities

log-recommender-cli: a command-line tool for managing datasets, including their parsing and preprocessing.

Implementation details

Architecture and package overview in a nutshell

TBA