A collection of SQL scripts and queries that implement machine learning (data mining) algorithms directly inside the database, using only standard SQL. A significant part of the SQL was reused from multiple sources; details can be found in the "Credits/References" section. Please note that the terms "machine learning" and "data mining" are used here interchangeably.
The project is primarily intended to be implemented and tested with SQLite. SQLiteStudio can be used as a GUI for SQLite. Start via "start.bat" and apply the following configuration:
> .read configs.sql
> .mode csv
> .import ../data/boston_housing_data.csv TBL_BOSTON_HOUSING_IMPORT
> SELECT COUNT (*) FROM TBL_BOSTON_HOUSING_IMPORT; -- must be 506
> .read ../data/TBL_BOSTON_HOUSING.sql
-- copy from the import table to the real table
> .read ../data/COPY_FROM_IMPORT_TBL.sql
> .save ml-with-sql.db
> .read ../data/TBL_BOSTON_HOUSING_DB_FULL.sql
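The import steps above follow a common pattern: load the CSV into a staging table, check the row count, then copy the rows into a typed table. The following is a minimal sketch of that pattern using Python's built-in sqlite3 module. It does not reproduce the contents of TBL_BOSTON_HOUSING.sql or COPY_FROM_IMPORT_TBL.sql; the table names TBL_IMPORT_SAMPLE and TBL_SAMPLE, the three columns, and the three sample rows are assumptions made only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical three-row sample (assumption); the real file has 506 rows
# and more columns than shown here.
SAMPLE_CSV = """RM,LSTAT,MEDV
6.575,4.98,24.0
6.421,9.14,21.6
7.185,4.03,34.7
"""

conn = sqlite3.connect(":memory:")

# Step 1: staging table with TEXT columns, holding the raw CSV values.
conn.execute("CREATE TABLE TBL_IMPORT_SAMPLE (RM TEXT, LSTAT TEXT, MEDV TEXT)")
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
conn.executemany(
    "INSERT INTO TBL_IMPORT_SAMPLE (RM, LSTAT, MEDV) VALUES (:RM, :LSTAT, :MEDV)",
    rows,
)

# Step 2: sanity-check the number of imported rows.
imported = conn.execute("SELECT COUNT(*) FROM TBL_IMPORT_SAMPLE").fetchone()[0]

# Step 3: create the typed table and copy the rows, casting text to REAL.
conn.execute("CREATE TABLE TBL_SAMPLE (RM REAL, LSTAT REAL, MEDV REAL)")
conn.execute(
    """
    INSERT INTO TBL_SAMPLE (RM, LSTAT, MEDV)
    SELECT CAST(RM AS REAL), CAST(LSTAT AS REAL), CAST(MEDV AS REAL)
    FROM TBL_IMPORT_SAMPLE
    """
)
conn.commit()

copied = conn.execute("SELECT COUNT(*) FROM TBL_SAMPLE").fetchone()[0]
col_type = conn.execute("SELECT typeof(MEDV) FROM TBL_SAMPLE LIMIT 1").fetchone()[0]
print(imported, copied, col_type)
```

Keeping the staging table separate from the typed table means the numeric casts happen once, at copy time, so later queries can use arithmetic and aggregates directly.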
- SQL Linear Regression
- Optimal two variable linear regression calculation
- Single & Multiple Regression in SQL
- K Means Clustering
- Associated Items Using the Apriori Algorithm
- Classification Using Naive Bayes
- Outlier Detection with SQL Server by Stevan Bolton
- In-Database Scoring of Random Forest Models built using R via SQL
- Integrating Fuzzy c-Means Clustering with PostgreSQL
- SQL Database Primitives for Decision Tree Classifiers by Kai-Uwe Sattler and Oliver Dunemann
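
As an illustration of what "machine learning directly inside the database" can look like, here is a minimal sketch of simple (one-variable) least-squares linear regression written with standard SQL aggregates only. It is not taken from any of the sources listed above; the table TBL_POINTS, its columns x and y, and the four data points are assumptions chosen for the example. Python's sqlite3 module is used only to run the query against an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical toy data (assumption): points lying exactly on y = 2x + 1.
conn.execute("CREATE TABLE TBL_POINTS (x REAL, y REAL)")
conn.executemany(
    "INSERT INTO TBL_POINTS (x, y) VALUES (?, ?)",
    [(1, 3), (2, 5), (3, 7), (4, 9)],
)

# Closed-form least squares:
#   slope     = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)
#   intercept = (Sy - slope*Sx) / n
REGRESSION_SQL = """
WITH stats AS (
    SELECT COUNT(*)   AS n,
           SUM(x)     AS sx,
           SUM(y)     AS sy,
           SUM(x * y) AS sxy,
           SUM(x * x) AS sxx
    FROM TBL_POINTS
)
SELECT (n * sxy - sx * sy) / (n * sxx - sx * sx) AS slope,
       (sy - ((n * sxy - sx * sy) / (n * sxx - sx * sx)) * sx) / n AS intercept
FROM stats
"""

slope, intercept = conn.execute(REGRESSION_SQL).fetchone()
print(slope, intercept)
```

Because the sample points lie on the line y = 2x + 1, the query should return a slope of 2 and an intercept of 1. The same query can be pasted into the SQLite shell or SQLiteStudio once a table with numeric x and y columns exists.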