- Multi Linear Regression
- Simple Linear Regression
Libraries Used
- Kotlin-grass - Kotlin library for parsing CSV files into data classes
- Kotlin-csv - Kotlin library for reading CSV files
- Koma - library for scientific computing
- Apache Commons Math - OLS Multi Linear Regression
- Kotlin-Statistics - Idiomatic math and statistical extensions for Kotlin
Flavors
- Idiomatic Categorization
- Annotation Categorization
Usage
a. Idiomatic Approach
- create a data class
- parse the CSV file into the data class
- categorize the data using the extension function
- create the category keys
- create an array of DoubleArray (the matrix of the regression equation) for the independent variables
- create an array of doubles for the dependent variable
- feed the arrays to OLSML
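The matrix-building steps above can be sketched in plain Kotlin. This is only an illustration of the idea, not the library's implementation: `Row`, `buildDesignMatrix` and `buildResponse` are hypothetical names, and the sketch assumes a single category column encoded as 0/1 dummy columns, with the alphabetically first level dropped as the baseline.

```kotlin
// Hypothetical row type for illustration only; not part of the library.
data class Row(
    val rnd: Double,
    val admin: Double,
    val marketing: Double,
    val state: String,
    val profit: Double
)

// Builds one DoubleArray per row: dummy (0/1) columns for the category first,
// then the numeric columns. The alphabetically first level is dropped as the
// baseline so the dummy columns are not collinear with an intercept term.
fun buildDesignMatrix(rows: List<Row>): Array<DoubleArray> {
    val levels = rows.map { it.state }.distinct().sorted()
    val dummyLevels = levels.drop(1)
    return rows.map { row ->
        val dummies = dummyLevels.map { if (row.state == it) 1.0 else 0.0 }
        (dummies + listOf(row.rnd, row.admin, row.marketing)).toDoubleArray()
    }.toTypedArray()
}

// Collects the dependent variable into a single DoubleArray.
fun buildResponse(rows: List<Row>): DoubleArray =
    rows.map { it.profit }.toDoubleArray()
```

The two results have the same shape as the arrays described in the steps: an array of DoubleArray for the independent variables and a DoubleArray for the dependent variable.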
b. Class Annotation
- create a data class
- make the data class extend ScientificData
- mark class properties with annotations
- parse the CSV file into the data class
- categorize and create keys by instantiating and initializing CategoryKeys
- retrieve the category keys, the dependent values (array of doubles), and the independent values (array of double arrays)
- feed arrays to OLSML
c. Annotations
- @Category
  - identifies the property as a category variable
- @DependentVar
  - marks the property as the dependent variable
  - make sure that only one property is annotated as the dependent variable
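How such annotations might be discovered at runtime can be sketched with Java reflection. The annotation classes below are hypothetical stand-ins declared only so the sketch compiles on its own; the library's real definitions and lookup logic may differ. `Sample`, `categoryNames` and `dependentName` are likewise illustrative names.

```kotlin
// Hypothetical stand-ins for the library's annotations (assumption, not its source).
@Target(AnnotationTarget.FIELD)
@Retention(AnnotationRetention.RUNTIME)
annotation class Category

@Target(AnnotationTarget.FIELD)
@Retention(AnnotationRetention.RUNTIME)
annotation class DependentVar

// Hypothetical data class used only for this sketch.
data class Sample(
    val rnd: Double?,
    @Category val state: String?,
    @DependentVar val profit: Double?
)

// Names of the properties marked as category variables, sorted alphabetically.
fun categoryNames(type: Class<*>): List<String> =
    type.declaredFields
        .filter { it.isAnnotationPresent(Category::class.java) }
        .map { it.name }
        .sorted()

// Name of the single dependent variable; throws if zero or several are annotated.
fun dependentName(type: Class<*>): String {
    val marked = type.declaredFields
        .filter { it.isAnnotationPresent(DependentVar::class.java) }
    require(marked.size == 1) {
        "Expected exactly one dependent variable, found ${marked.size}"
    }
    return marked.single().name
}
```

With this sketch, `categoryNames(Sample::class.java)` yields the annotated category property names, and `dependentName` enforces the rule that only one dependent variable may be annotated.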
Creating ScientificData class
data class Company(
    val rnd: Double?,
    val admin: Double?,
    val marketing: Double?,
    @Category
    val state: String?,
    @DependentVar
    val profit: Double?,
    @Category
    val tech: String?
) : ScientificData()
Parsing data into the ScientificData class from the resources folder
val data = dataClassFromCsv<Company>("/Company.csv").toList()
Using the idiomatic approach
//-- Multi linear regression without the ScientificData class and annotations
val category1 = CategoryKeys(data)
    .addCategory(key = "state", cat = { it.state!! })
// categorizing data sets
val categorizedData = data.categorized(
    category = {
        categorizeByVariable { map ->
            map["state"] = it.state!!
        }
    },
    numeric = {
        doubleArrayOf(it.rnd!!, it.admin!!, it.marketing!!)
    }
)
//-- creating array of array of doubles
val doubleEQ = DoubleEQ(category1.mappedKeys)
val xD = doubleEQ.createEQ(categorizedData) // independent
val yD = data.mapNotNull { it.profit }.toDoubleArray() // dependent
//-- solving multi linear regression
val olsml = OLSML(yD, xD)
val summary = olsml.summary()
Using class annotations
//-- categorizing data sets
val category2 = CategoryKeys(data).initCategoryData()
//-- creating array of array of doubles
val doubleEQ2 = DoubleEQ(category2.getCategoryKeys())
//-- the columns of the resulting array of double arrays are ordered alphabetically by data class property name
val xW = doubleEQ2.createEQ(category2.getCategorizedData())
val yW = category2.getDependentValues()
val olsml2 = OLSML(yW, xW)
val summary2 = olsml2.summary()
Removing columns from a double array, e.g. for the backward elimination approach
val matProcessed = create(arrayDoubleArray)
val removedCol = matProcessed.removeColumns(1,0,3)
val arrayVal = removedCol.to2DArray()
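For readers without the matrix dependency at hand, column removal on a plain array of double arrays can be sketched as below. `dropColumns` is a hypothetical helper written for illustration; it is not the library's `removeColumns`, and it assumes zero-based column indices given in any order.

```kotlin
// Hypothetical helper (not the library's removeColumns): returns a copy of
// the matrix without the given zero-based column indices. The input is left
// unchanged, and the order in which indices are passed does not matter.
fun dropColumns(matrix: Array<DoubleArray>, vararg columns: Int): Array<DoubleArray> {
    val toDrop = columns.toSet()
    return matrix.map { row ->
        row.filterIndexed { index, _ -> index !in toDrop }.toDoubleArray()
    }.toTypedArray()
}
```

In a backward elimination loop, a helper like this would typically be called once per round to drop the least significant predictor before refitting the model.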