josemarialuna
PhD in Computer Science and AI at the University of Seville, Spain. #Python #Scala #Spark #DataScientist
University of Seville, Seville, Spain
Pinned Repositories
BasicSpark
Chi-Index
Clustering validity index based on chi-square, provided as a Python package
ClassificationPython
clusterEmpleo
ClusterIndices
This package contains the code for executing clustering validity indices in Spark. The package includes BD-Silhouette, BD-Dunn, Davies-Bouldin and WSSSE indices.
CreateRandomDataset
This package contains the code for generating Big Data random datasets in Spark.
ExternalValidity
This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.
linkage
RandomClustersGenerator
📊 Python tool for creating datasets with clusters using a normal distribution. Customize clusters, significant columns, and add variability with dummy columns. Ideal for testing clustering algorithms.
smallDataIndex
This package contains the code for executing clustering validity indices in Java using K-means from Weka. The package includes the following clustering validity indices: Silhouette, Dunn, BD-Silhouette, BD-Dunn, Davies-Bouldin, Calinski-Harabasz, MaximumDiameter, SquaredDistance, AverageDistance, AverageBetweenClusterDistance, MinimumDistance.
josemarialuna's Repositories
josemarialuna/Chi-Index
Clustering validity index based on chi-square, provided as a Python package
josemarialuna/ClusterIndices
This package contains the code for executing clustering validity indices in Spark. The package includes BD-Silhouette, BD-Dunn, Davies-Bouldin and WSSSE indices.
josemarialuna/ExternalValidity
This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.
josemarialuna/RandomClustersGenerator
📊 Python tool for creating datasets with clusters using a normal distribution. Customize clusters, significant columns, and add variability with dummy columns. Ideal for testing clustering algorithms.
josemarialuna/ClassificationPython
josemarialuna/CreateRandomDataset
This package contains the code for generating Big Data random datasets in Spark.
josemarialuna/smallDataIndex
This package contains the code for executing clustering validity indices in Java using K-means from Weka. The package includes the following clustering validity indices: Silhouette, Dunn, BD-Silhouette, BD-Dunn, Davies-Bouldin, Calinski-Harabasz, MaximumDiameter, SquaredDistance, AverageDistance, AverageBetweenClusterDistance, MinimumDistance.
josemarialuna/BasicSpark
josemarialuna/linkage
josemarialuna/clusterEmpleo
josemarialuna/Clustering-Datasets
This repository contains a collection of UCI (real-life) and synthetic (artificial) datasets, with cluster labels and MATLAB files, ready to use with clustering algorithms.
josemarialuna/electric
josemarialuna/josemarialuna
josemarialuna/matrixGeneration
josemarialuna/MethodComparisonsInPython
Friedman tests for comparing multiple methods across datasets in Python
josemarialuna/SeminarioDeRiquelme
Clustering
josemarialuna/SurvivalAnalysis