Code-Smells-Detection

A code smell classifier that uses source code features extracted automatically by pre-trained Code2Vec and/or LSTM models. Four code smells are covered: DATA_CLASS, GOD_CLASS, FEATURE_ENVY, and LARGE_METHOD.


Code Smells Classifier Using Automatically Learned Source Code Features

Data preparation

The Java project CodeAnalysis transforms Java classes and methods extracted from source files into unified representation records. Both classes and methods are represented as a list of methods. A synthetic Method_0 is introduced to capture the class's context: it holds all fields of the surrounding class.
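The record layout described above can be sketched as follows. This is a minimal illustration, not the actual CodeAnalysis code: the names MethodRecord, UnifiedRecords, forClass and forMethod are hypothetical, and so is the assumption that fields and method bodies are stored as token lists and that a single method's record consists of Method_0 followed by that method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical record type: one method (or the synthetic Method_0) as a list of tokens.
final class MethodRecord {
    final String name;
    final List<String> tokens;

    MethodRecord(String name, List<String> tokens) {
        this.name = name;
        this.tokens = Collections.unmodifiableList(new ArrayList<>(tokens));
    }
}

// Hypothetical helper that builds unified records for classes and methods.
final class UnifiedRecords {
    // Assumed name of the synthetic method that carries the class's fields.
    static final String CONTEXT_METHOD = "Method_0";

    // A class becomes: Method_0 (all fields of the class) followed by its real methods.
    static List<MethodRecord> forClass(List<String> fieldTokens, List<MethodRecord> methods) {
        List<MethodRecord> record = new ArrayList<>();
        record.add(new MethodRecord(CONTEXT_METHOD, fieldTokens));
        record.addAll(methods);
        return record;
    }

    // A method becomes (by assumption): Method_0 of the surrounding class plus the method itself.
    static List<MethodRecord> forMethod(List<String> fieldTokens, MethodRecord method) {
        return forClass(fieldTokens, Collections.singletonList(method));
    }
}
```

With this shape, class-level smells (DATA_CLASS, GOD_CLASS) and method-level smells (FEATURE_ENVY, LARGE_METHOD) can share one input format, since both kinds of record are just lists of methods.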

The dataset was adopted from http://essere.disco.unimib.it/wiki/research/mlcsd

Classifier

A feed-forward neural network is trained on features extracted from the processed dataset using an LSTM (https://github.com/D-a-r-e-k/Source-Code-Modelling) and/or Code2Vec (https://github.com/tech-srl/code2vec).
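To make the classification step concrete, here is a minimal sketch of a feed-forward forward pass over one extracted feature vector. Everything beyond "feed-forward network on extracted features" is an assumption made for illustration: the class name FeedForwardSketch, the single hidden layer, the ReLU and sigmoid activations, and the binary "smelly / not smelly" output are not taken from this repository, and training is not shown.

```java
// Hypothetical one-hidden-layer feed-forward classifier (forward pass only).
// Layer sizes, activations, and the binary output are illustrative assumptions.
final class FeedForwardSketch {
    private final double[][] hiddenWeights; // shape: [hiddenUnits][inputSize]
    private final double[] hiddenBias;      // shape: [hiddenUnits]
    private final double[] outputWeights;   // shape: [hiddenUnits]
    private final double outputBias;

    FeedForwardSketch(double[][] hiddenWeights, double[] hiddenBias,
                      double[] outputWeights, double outputBias) {
        this.hiddenWeights = hiddenWeights;
        this.hiddenBias = hiddenBias;
        this.outputWeights = outputWeights;
        this.outputBias = outputBias;
    }

    // Returns a value in (0, 1), read here as the probability that the sample is smelly.
    double predict(double[] features) {
        double logit = outputBias;
        for (int h = 0; h < hiddenWeights.length; h++) {
            double activation = hiddenBias[h];
            for (int i = 0; i < features.length; i++) {
                activation += hiddenWeights[h][i] * features[i];
            }
            logit += outputWeights[h] * Math.max(0.0, activation); // ReLU
        }
        return 1.0 / (1.0 + Math.exp(-logit)); // sigmoid
    }
}
```

In such a setup, features would be the vector produced by the LSTM or Code2Vec model for one record, and one classifier of this kind could be trained per code smell.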

Theoretical work

This work is based on the Master's thesis "Code smells detection using automatic source code features learning". A summary (in Lithuanian) is published in the Vilnius University Proceedings „Lietuvos magistrantų informatikos ir IT tyrimai“ (http://www.journals.vu.lt/proceedings/issue/view/1129, page 11).