This document serves as an awesome-like curation of helpful links for using Kotlin in data science, data engineering, machine learning, and optimization. Please feel free to put in PRs with other links you find helpful.
Data Science is a broad, buzzwordy domain that seeks to gain insight from data. Arguably, optimization and operations research algorithms play a role in this space as well. While the incumbent programming tools in data science are R, Python, and even Scala, there is a large opportunity for Kotlin to enter this space. Kotlin can add value by closing the gap between data science and software engineering, and essentially finish what Scala started.
With Kotlin/Native on the horizon, the scope of this document will hopefully expand beyond the JVM.
Open-source applications and proofs of concept demonstrating data science modeling with Kotlin.
Project | Description |
---|---|
Kotlin Math Cheatsheet | How to turn mathematical symbol expressions into Kotlin code |
Bayes Email Spam Filter | A Kotlin proof-of-concept implementation of a spam filter |
Bayes User Input Prediction | A simple TornadoFX app that predicts user inputs using Naive Bayes text categorization |
Classroom Scheduler | A discrete programming model that schedules classes against one classroom |
Sudoku Solver | A showcase of constraint programming and discrete optimization |
Traveling Salesman Problem | A visual Kotlin demo of the Traveling Salesman Problem |
Driver Shift Optimizer | A linear programming model using ojAlgo to minimize the cost of driver shifts in a day |
Kotlin Simple Neural Network | A simple application built with a Kotlin-implemented neural network |
Kubed Map Visualization | A U.S. heat map of unemployment rates |

Kotlin libraries for data science.
Library Name | Category | Description |
---|---|---|
Kotlin-Statistics | Analytics | Idiomatic statistical/analytical extension functions for Kotlin |
okAlgo | Optimization | Kotlin extensions to ojAlgo |
Data2Viz | Charts | Cross-platform charts and visuals for Kotlin |
Sparklin | Scaled Data Processing | Kotlin framework for Apache Spark |
Krangl | Analytics | dplyr-like data frame wrangling for Kotlin |
Koma | Computation | Scientific library for Kotlin with interop/multiplatform capabilities |
Komputation | Deep Learning | Neural network platform for Kotlin, primarily for text processing |
KotlinNLP | Natural Language Processing | Natural Language Processing framework for Kotlin |
TornadoFX | UI, Charts | Kotlin UI desktop app framework, built on top of JavaFX |
TornadoFX-ControlsFX | UI | ControlsFX extensions with more data views and controls for TornadoFX |
Kotlin Jupyter | Notebook | Kotlin support for Jupyter |
JINX | Plugin | Create Excel functions with Java/Scala/Kotlin instead of VBA |
Kotlin Algorithm | Algorithm | Kotlin algorithm implementations |

Java libraries that can be used from Kotlin.
Library Name | Category | Description |
---|---|---|
DeepLearning4J | Deep Learning | Deep learning library for Java |
ND4J | Computation | Efficient matrix math library for JVM |
TableSaw | DataFrame | Tabular data processing and manipulation |
Joinery | DataFrame | Tabular data processing and manipulation |
Kubed | Visualization | JavaFX-based, D3.js-like visualizations |
Dex | Charting | Java-based data visualization tool |
JSoup | Data Wrangling | HTML parsing library for Java |
Smile | ML and analytics | Comprehensive machine learning, NLP, linear algebra, graph, interpolation, and visualization system |
ojAlgo! | LP and Optimization | Helpful library for linear/mixed optimization and linear algebra |
Apache Commons Math | Math/Statistics/ML | General math, statistics, and ML library for Java |
Apache Commons IO | IO | IO Utilities |
JBlas | Linear Algebra | Linear Algebra for Java |
OptaPlanner | Optimization | Solver utility for optimization planning problems |
Charts | Charting | Scientific JavaFX charting library in development |
CoreNLP | Natural Language Processing | Natural language processing toolkit |
Renjin | Interop | R JVM implementation |
Apache Mahout | Linear Algebra | Distributed framework for regression, clustering and recommendation |
Weka | Data Mining Software | Collection of machine learning algorithms for data mining tasks |
For those who are already proficient in Python but want to learn Kotlin and its potential in the data science domain.
Name | Media | Topic | Description |
---|---|---|---|
From Data Science to Production with Kotlin (O'Reilly) | Video | Kotlin | Trains Python data science professionals transitioning to Kotlin |
Kotlin for Data Science (KotlinConf) | Video | Kotlin | KotlinConf session explaining the merits of Kotlin for data science |
Kotlin for Python Programmers | Blog | Kotlin | Blog relating Kotlin concepts to a Pythonista audience |
For veteran JVM/Kotlin developers trying to break into the broad, buzzwordy domain of "data science".
Name | Media | Topic | Description |
---|---|---|---|
Brandon Rohrer | Blog | ML | Excellent videos and articles on machine learning topics |
3Blue1Brown | Video | Math, ML, etc | Excellent YouTube channel visually covering mathematical concepts, including neural networks |
Make Your Own Neural Network | eBook | ML | The best practical guide on neural networks I've found |
Python for the Busy Java Developer | eBook | Python | Helpful resource for Java devs to learn Python quickly |
Data Science with Java (O'Reilly) | Book | Data Science | Teaches data science for Java developers |
Mastering Java for Data Science (Packt) | Book | Data Science | Data science for Java developers |
Mastering Java Machine Learning (Packt) | Book | ML | Machine learning for Java developers |
Discrete Optimization (Coursera) | Online Class | Optimization | Deep dive class into linear/integer programming and optimization |
Machine Learning for Absolute Beginners | eBook | ML | Excellent eBook for a high-level understanding of ML |
Model Building in Mathematical Programming | Book | Optimization | Covers linear/integer programming particularly for optimization problems |
For Kotlin to become a mainstream data science platform on par with Python and R, there is still some work to do. This will depend heavily on you, the community, to help fill these gaps.
- Kotlin extensions to existing Java libs are an easy contribution opportunity (e.g. Kotlin-Statistics and Sparklin)
- Machine Learning - More robust machine learning libraries/APIs need to be integrated with Kotlin (e.g. SMILE)
- Implement ML algorithms in the Kotlin Algorithm project
- Kotlin/Native - Explore bindings against Python C libraries
- Kotlin/Native - Need a NumPy-like library, ojAlgo-like solvers a plus
- Jupyter support - There is a Jupyter plugin for Kotlin that needs development
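
To make the first item in the list above concrete, here is a minimal sketch of what "Kotlin extensions to an existing Java class" can look like. It uses only the JDK's `java.util.DoubleSummaryStatistics` so that it stays self-contained. The names `summaryStatistics`, `range`, and `sampleStandardDeviation` are hypothetical and chosen for illustration; they are not taken from Kotlin-Statistics, Sparklin, or any other library listed here.

```kotlin
import java.util.DoubleSummaryStatistics
import kotlin.math.sqrt

// Hypothetical extension: feed any Iterable<Double> into Java's
// DoubleSummaryStatistics (count, sum, min, max, average).
fun Iterable<Double>.summaryStatistics(): DoubleSummaryStatistics {
    val stats = DoubleSummaryStatistics()
    for (value in this) stats.accept(value)
    return stats
}

// Hypothetical extension property added to the Java class itself.
val DoubleSummaryStatistics.range: Double
    get() = max - min

// Hypothetical extension: sample standard deviation (n - 1 denominator).
fun Iterable<Double>.sampleStandardDeviation(): Double {
    val values = toList()
    require(values.size > 1) { "At least two values are required" }
    val mean = values.average()
    val sumOfSquares = values.map { (it - mean) * (it - mean) }.sum()
    return sqrt(sumOfSquares / (values.size - 1))
}

fun main() {
    // Made-up sample data, for illustration only.
    val shiftHours = listOf(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0)
    val stats = shiftHours.summaryStatistics()
    println("mean = ${stats.average}, range = ${stats.range}")
    println("sample sd = ${shiftHours.sampleStandardDeviation()}")
}
```

The appeal of this pattern for contributors is that the Java class is left untouched: the extensions live in a small Kotlin file and read like native members at the call site.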

Online communities.
Name | Platform | Description |
---|---|---|
PySlackers | Slack | A Slack community of Python developers and data science professionals. |
Kotlin Slack | Slack | A Slack community of Kotlin developers. Join the #datascience channel |

Talks, videos, and podcasts.
Name | Media | Description |
---|---|---|
Kotlin Machine Learning and Optimization | Video | Thomas' demos and walkthroughs of different optimization and machine learning algorithms in Kotlin |
KotlinConf - Kotlin for Data Science | Conference | Thomas Nield explains the merits of Kotlin in the data science domain |
KotlinConf - Kscript | Conference | Holger Brandl covers kscript for data science workflows |
Talking Kotlin - Data Science with Thomas Nield | Podcast | Thomas Nield explains the merits of Kotlin in the data science domain |
Kotlin's Emerging Data Science Ecosystem | Talk/Slides | Holger Brandl gives an update on Kotlin's state as a data science platform |