ACKERAS

AutoML library for Accurat, based on AutoKeras and Scikit-Learn.

Installation

The library is pip-installable, so just go ahead and type:

$ pip install ackeras

Note: like AutoKeras, the library is only compatible with Python 3.6.

Second note: I refer to the standard Keras setup (see the Keras documentation) and suggest the Theano backend, but you can also use TensorFlow.
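
Since only Python 3.6 is supported, it can help to install the library in its own environment rather than into your system interpreter. One possible setup (conda is just an example here; any Python 3.6 virtual environment works):

$ conda create -n ackeras python=3.6
$ conda activate ackeras
$ pip install ackeras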

Disclaimer

The project has just started, so most things do not work properly or still need to be fixed; there are plenty of #TODOs inside, but feel free to use it and to open pull requests.

Scope

The idea is to be able to input a file in CSV or JSON format and, after selecting a few parameters (see below), get your data cleaned and clustered automatically, ready to be analyzed. This can be useful for preliminary analysis and for feeding outputs into visualizations (e.g. a clustering in a scatterplot, or the class probabilities of a decision tree).

The implementations are (a rough sketch of the cleaning and clustering steps in plain scikit-learn follows the list):

  • Data cleaning: NaN filling with various methods, label encoding and one-hot encoding, flagging of categorical features and dropping redundant features (almost);
  • Dimensionality reduction: PCA and UMAP;
  • Clustering: k-means, with silhouette-analysis optimization, and DBSCAN;
  • Logistic and linear regression, with k-fold cross-validation;
  • Random Forests and Support Vector Machines, with genetic-algorithm optimization;
  • Outlier detection with Random Forests;
  • Neural networks, with Auto-Keras;
  • ML visualizations with Seaborn and Lime.
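
Ackeras' own API is covered in the docs below; purely as orientation, this is roughly what the cleaning, dimensionality-reduction and clustering steps look like when written directly with pandas and scikit-learn. The file name and the median/mode fill strategies are illustrative assumptions, not the library's exact behaviour:

# Rough pandas/scikit-learn sketch of the cleaning -> PCA -> k-means pipeline.
# "data.csv" is a placeholder; the fill strategies are illustrative only.
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

df = pd.read_csv("data.csv")

for col in df.columns:
    if df[col].dtype.kind in "if":                   # numeric column
        df[col] = df[col].fillna(df[col].median())
    else:                                            # categorical column
        df[col] = df[col].fillna(df[col].mode().iloc[0])
        df[col] = LabelEncoder().fit_transform(df[col])

X = StandardScaler().fit_transform(df.values)        # scale before projecting
X_2d = PCA(n_components=2).fit_transform(X)          # UMAP would be a drop-in alternative

# pick k by silhouette analysis, then cluster
scores = {k: silhouette_score(X_2d, KMeans(n_clusters=k).fit_predict(X_2d))
          for k in range(2, 8)}
best_k = max(scores, key=scores.get)
labels = KMeans(n_clusters=best_k).fit_predict(X_2d)
print("best k:", best_k, "silhouette:", round(scores[best_k], 3))

DBSCAN, the regressions and the forest/SVM models listed above follow the same pattern with the corresponding scikit-learn estimators.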

Usage with Python

Head over to the docs to see:

  • Basic usage example
  • More complex analysis and use cases
  • Integration with autosklearn
  • Integration with autokeras (the plain AutoKeras quick-start pattern is sketched below for reference)
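
The ackeras integration layer is documented separately; for reference, this is the usual quick-start pattern of AutoKeras 0.x itself, following its own MNIST example. The ackeras wrappers may expose a different interface:

# Plain AutoKeras (0.x) quick start on MNIST - shown for orientation only,
# not the ackeras API.
from keras.datasets import mnist
import autokeras as ak

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(x_train.shape + (1,))    # add channel dimension
x_test = x_test.reshape(x_test.shape + (1,))

clf = ak.ImageClassifier(verbose=True)             # neural architecture search
clf.fit(x_train, y_train, time_limit=60 * 60)      # search budget in seconds
clf.final_fit(x_train, y_train, x_test, y_test, retrain=True)
print(clf.evaluate(x_test, y_test))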

Usage from frontend (not ideal)

You should now be able to interact with the dataset through a simple server, which for now only runs on my machine in the local network. Fixes are on the way, so stay tuned. To test it yourself, just try:

$ cd ackeras
$ python server.py

then head over to localhost:5000 in your browser. Upload a CSV and you should see something like this:

[screenshot of the upload page]

Be sure to tick the "Drop_rest" option (at this stage), because it ensures that any data you push in that is not understood gets excluded. Then go ahead, submit the query, head over to the link provided and enjoy everything breaking down. Keep an eye on the console, because we tried to log most errors.

Other interesting libraries to add in the pipeline

  • Awesome Dash, Python + React.js + Flask
  • Bokeh, interactive web plotting
  • Dask, multiprocessing with Pandas, NumPy and scikit-learn