Scala for Machine Learning Version 0.96a Copyright Patrick Nicolas All rights reserved 2013-2015
=================================================================================================
Source code, data files and utilities related to "Scala for Machine Learning"
The examples relate to investment portfolio management and trading strategies. For readers interested in the mathematics or the techniques implemented in this library, I strongly recommend the following readings:
- "Machine Learning: A Probabilistic Perspective" K. Murphy
- "The Elements of Statistical Learning" T. Hastie, R. Tibshirani, J. Friedman
The Appendix contains an introduction to the basic concepts of investment and trading strategies as well as technical analysis of financial markets.
Hardware:
- 2 CPU cores with 4 Gbytes RAM to build and run the examples with small datasets
- 4 CPU cores and 8+ Gbytes RAM for datasets of size 75,000 or larger and/or with feature sets of size 50 or larger
Operating system: no specific requirement
Software: JDK 1.7.0_45 or 1.8.0_25, Scala 2.10.3/2.10.4 or 2.11.1 and SBT 0.13+ (see installation section for deployment).
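The software requirements above can be checked from a running JVM. The following is a minimal sketch, not part of the library: the object name EnvCheck and its methods are hypothetical, and the accepted versions are simply the ones listed above.

```scala
import scala.util.Properties

// Hypothetical helper, not part of Scala for Machine Learning.
object EnvCheck {
  // Scala releases named in the software requirements.
  val supportedScala = Set("2.10.3", "2.10.4", "2.11.1")

  def isSupportedScala(version: String): Boolean =
    supportedScala.contains(version)

  // JDK 1.7 and 1.8 report versions such as "1.7.0_45" or "1.8.0_25".
  def isSupportedJava(version: String): Boolean =
    version.startsWith("1.7.") || version.startsWith("1.8.")

  def main(args: Array[String]): Unit = {
    val scalaV = Properties.versionNumberString
    val javaV = Properties.javaVersion
    println(s"Scala $scalaV listed in requirements: ${isSupportedScala(scalaV)}")
    println(s"Java $javaV listed in requirements: ${isSupportedJava(javaV)}")
  }
}
```

A version that is not listed may still work; the check only reports whether the environment matches the combinations named in this file.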
Directory structure of the source code library for Scala for Machine Learning:
Directory structure of the source code of the examples for Scala for Machine Learning:
Library components for Scala for Machine Learning:
The installation and build workflow is described in the following diagram:
Eclipse:
The Scala for Machine Learning library is compatible with Eclipse Scala IDE 3.0
- Specify the links to the source in Project/properties/Java Build Path/Source. The two links should be project_name/src/main/scala and project_name/src/test/scala
- Add the jars required to build and execute the code within Eclipse in Project/properties/Java Build Path/Add External Jars, as declared in project_name/.classpath
- Update the JVM heap parameters in the eclipse.ini file to -Xms512m -Xmx8192m or the maximum allowed on your specific machine.
The Simple Build Tool (SBT) has to be used to build the library from the source code, using the build.sbt file in the root directory.
Executing the examples/tests in Scala for Machine Learning requires sufficient JVM heap memory (~2G):
in sbt/conf/sbtconfig.txt set -Xmx to 2058m or higher and -XX:MaxPermSize to 512m or higher, e.g. -Xmx4096m -Xms512m -XX:MaxPermSize=512m
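To confirm that the heap settings are actually picked up, the maximum heap can be read from inside the JVM. This is a minimal sketch under stated assumptions: HeapCheck is a hypothetical helper, not part of the library, and the 1800 MB threshold is my own reading of "~2G", set slightly low because Runtime.maxMemory can report somewhat less than the -Xmx value.

```scala
// Hypothetical helper, not part of Scala for Machine Learning.
object HeapCheck {
  private val BytesPerMb = 1024L * 1024L

  // Converts a byte count (as returned by Runtime.maxMemory) to whole megabytes.
  def toMb(bytes: Long): Long = bytes / BytesPerMb

  // True when the JVM may grow its heap to at least requiredMb megabytes.
  def isSufficient(maxBytes: Long, requiredMb: Long): Boolean =
    toMb(maxBytes) >= requiredMb

  def main(args: Array[String]): Unit = {
    // Assumed threshold for "~2G"; adjust to your datasets.
    val requiredMb = 1800L
    val maxBytes = Runtime.getRuntime.maxMemory
    if (isSufficient(maxBytes, requiredMb))
      println(s"Max heap ${toMb(maxBytes)} MB: sufficient")
    else
      println(s"Max heap ${toMb(maxBytes)} MB: below $requiredMb MB, increase -Xmx")
  }
}
```

When run through sbt without forking, the value reported is the heap of the sbt JVM itself, which is the one the settings above control.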
Build script for Scala for Machine Learning:
To build the Scala for Machine Learning library package
$(ROOT)/sbt clean publish-local
To build the package including test and resource files
$(ROOT)/sbt clean package
To generate scala doc for the library
$(ROOT)/sbt doc
To generate scala doc for the examples
$(ROOT)/sbt test:doc
To compile all examples:
$(ROOT)/sbt test:compile
To run one test suite (e.g. Chap 3)
$(ROOT)/sbt
> test-only *Chap3
To run all tests:
$(ROOT)/sbt test:run
CRF-Trove_3.0.2.jar
LBFGS.jar
colt.jar
CRF.jar
commons-math3-3.3.jar
libsvm.jar
jfreechart-1.0.17/lib/jcommon-1.0.21.jar
junit-4.11.jar
jfreechart-1.0.17/lib/jfreechart-1.0.17.jar
com.typesafe/config/1.2.1/bundles/config.jar
jfreechart-1.0.17/lib/servlets.jar
akka-actor_2.11-2.3.6.jar
scalatest_2.11.jar
spark-assembly-1.1.0-hadoop2.4.0-no_scala.jar
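
Some of the jars listed above are also published to Maven Central, so they could be declared as managed dependencies instead of being added by hand. The fragment below is only a sketch of that alternative and is not taken from the build.sbt shipped with the library: the coordinates are assumptions inferred from the jar file names, and jars without an obvious published equivalent or version (CRF, LBFGS, libsvm, scalatest, the Spark assembly) are left out.

```scala
// build.sbt fragment (sketch): managed equivalents of some of the jars above.
// Coordinates are assumptions inferred from the jar file names.
scalaVersion := "2.11.1"

libraryDependencies ++= Seq(
  "org.apache.commons" % "commons-math3" % "3.3",
  "org.jfree" % "jfreechart" % "1.0.17",
  "org.jfree" % "jcommon" % "1.0.21",
  "com.typesafe" % "config" % "1.2.1",
  // %% appends the Scala binary version, giving akka-actor_2.11
  "com.typesafe.akka" %% "akka-actor" % "2.3.6",
  "junit" % "junit" % "4.11" % "test"
)
```

Jars that are not declared this way can stay in the project's lib directory, which SBT adds to the classpath as unmanaged dependencies by default.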