This repository experiments with machine learning in scikit-learn to build a model that classifies code snippets by programming language.
Training data was procured from The Computer Language Benchmarks Game: http://benchmarksgame.alioth.debian.org/
The classifier was trained on the following programming languages (benchmark file extensions, where given, appear in parentheses):
- C (.gcc, .c)
- C#
- Common Lisp (.sbcl)
- Clojure
- Java
- JavaScript
- OCaml
- Perl
- PHP (.hack, .php)
- Python
- Ruby (.jruby, .yarv)
- Scala
- Scheme (.racket)
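
The extensions in the list above suggest one way to label training files by language. The following is a minimal sketch of that idea, not the repository's actual code: the names `EXTENSION_TO_LANGUAGE` and `label_for` are hypothetical, and only the extensions written out in the list are included, since the others are not stated here.

```python
from pathlib import Path

# Hypothetical lookup table built only from the extensions listed above.
# Languages without a listed extension are deliberately left out.
EXTENSION_TO_LANGUAGE = {
    ".gcc": "C",
    ".c": "C",
    ".sbcl": "Common Lisp",
    ".hack": "PHP",
    ".php": "PHP",
    ".jruby": "Ruby",
    ".yarv": "Ruby",
    ".racket": "Scheme",
}


def label_for(filename):
    """Return the language label for a file name, or None if its extension is not listed."""
    suffix = Path(filename).suffix.lower()  # final extension, e.g. ".gcc"
    return EXTENSION_TO_LANGUAGE.get(suffix)
```

For example, `label_for("nbody.yarv")` returns `"Ruby"`, while a file with an unlisted extension returns `None` and can be skipped when building the training set. The file name here is illustrative only.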