License: MIT

AstMiner

A tool and library for mining path-based representations of code. Work in progress.

Version history

0.2

  • Mining of ASTs

0.1

About

AstMiner grew out of an internal utility from our ongoing research project.

It currently extracts path-based representations from Java and Python code, and it is designed to be easy to extend to other languages.

The default output format is inspired by code2vec.
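To make "path-based representation" concrete, here is a minimal sketch in the spirit of code2vec's path-contexts: two leaves of an AST are joined by the path that climbs from the first leaf to their lowest common ancestor and descends to the second. The `Node` class, the `pathContext` method, and the `^` / `_` direction markers are assumptions made for illustration. They are not AstMiner's API and do not reproduce its output format.

```java
import java.util.ArrayList;
import java.util.List;

final class PathSketch {
    /** Hypothetical AST node: a type label, a token (leaves only), and children. */
    static final class Node {
        final String type;
        final String token;            // null for inner nodes
        Node parent;                   // set when the node is attached to a parent
        final List<Node> children = new ArrayList<>();

        Node(String type, String token) {
            this.type = type;
            this.token = token;
        }

        /** Attaches a child and returns this node so calls can be chained. */
        Node add(Node child) {
            child.parent = this;
            children.add(child);
            return this;
        }
    }

    /** Returns the chain from a node up to the root, starting with the node itself. */
    static List<Node> ancestors(Node node) {
        List<Node> chain = new ArrayList<>();
        for (Node cur = node; cur != null; cur = cur.parent) {
            chain.add(cur);
        }
        return chain;
    }

    /** Builds "startToken,path,endToken", using '^' for upward and '_' for downward steps. */
    static String pathContext(Node start, Node end) {
        List<Node> up = ancestors(start);
        List<Node> down = ancestors(end);
        int i = up.size() - 1;
        int j = down.size() - 1;
        // Walk down from the root while both chains share the same ancestor.
        while (i >= 0 && j >= 0 && up.get(i) == down.get(j)) {
            i--;
            j--;
        }
        if (i == up.size() - 1) {
            throw new IllegalArgumentException("Nodes belong to different trees");
        }
        StringBuilder path = new StringBuilder();
        for (int k = 0; k <= i; k++) {
            path.append(up.get(k).type).append('^');
        }
        path.append(up.get(i + 1).type);   // lowest common ancestor
        for (int k = j; k >= 0; k--) {
            path.append('_').append(down.get(k).type);
        }
        return start.token + "," + path + "," + end.token;
    }

    public static void main(String[] args) {
        // Hand-built AST for the statement: x = y + 1
        Node x = new Node("Name", "x");
        Node y = new Node("Name", "y");
        Node one = new Node("Int", "1");
        new Node("Assign", null).add(x).add(new Node("Plus", null).add(y).add(one));

        System.out.println(pathContext(x, one));   // x,Name^Assign_Plus_Int,1
        System.out.println(pathContext(y, one));   // y,Name^Plus_Int,1
    }
}
```

In a real pipeline the tree would come from a parser rather than being built by hand, and paths are commonly limited in length and width before being written out.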

Usage

Import

The library is available from the Maven Central repository. To use it, add the dependency to your build.gradle file:

dependencies {
    compile "io.github.vovak:astminer:0.1"
}

Examples

A few simple usage examples can be run with ./gradlew run.

A more detailed Java usage example is available as well.

Extend to other languages

A new programming language can be supported in a few simple steps:

  1. Add the corresponding ANTLR4 grammar file to the antlr directory;
  2. Run the antlr4 Gradle task to generate the parser;
  3. Implement a minimal wrapper around the generated parser. See JavaParser or PythonParser for reference.
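The wrapper in step 3 can be pictured roughly as follows. This is a self-contained sketch under stated assumptions, not the code of JavaParser or PythonParser: `RawTree` is a hypothetical stand-in for the tree a generated parser returns (its `getText`, `getChildCount`, and `getChild` methods mirror the names on ANTLR's `ParseTree`, while `ruleName` is invented here), and `Node` and `convert` are illustrative names as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WrapperSketch {
    /** Hypothetical stand-in for the tree produced by a generated parser. */
    interface RawTree {
        String ruleName();          // hypothetical: grammar rule or token type name
        String getText();           // source text covered by this subtree
        int getChildCount();
        RawTree getChild(int i);
    }

    /** Hypothetical language-independent node that later stages would consume. */
    static final class Node {
        final String type;
        final String token;         // null for inner nodes
        final List<Node> children = new ArrayList<>();

        Node(String type, String token) {
            this.type = type;
            this.token = token;
        }
    }

    /** The wrapper itself: recursively copies the raw tree into Node objects. */
    static Node convert(RawTree raw) {
        boolean isLeaf = raw.getChildCount() == 0;
        Node node = new Node(raw.ruleName(), isLeaf ? raw.getText() : null);
        for (int i = 0; i < raw.getChildCount(); i++) {
            node.children.add(convert(raw.getChild(i)));
        }
        return node;
    }

    /** Builds an in-memory RawTree so the sketch runs without a real parser. */
    static RawTree raw(String rule, String text, RawTree... kids) {
        List<RawTree> children = Arrays.asList(kids);
        return new RawTree() {
            public String ruleName() { return rule; }
            public String getText() { return text; }
            public int getChildCount() { return children.size(); }
            public RawTree getChild(int i) { return children.get(i); }
        };
    }

    public static void main(String[] args) {
        // Pretend a parser produced this tree for the expression: y+1
        RawTree parsed = raw("expr", "y+1", raw("NAME", "y"), raw("PLUS", "+"), raw("INT", "1"));
        Node root = convert(parsed);
        for (Node child : root.children) {
            System.out.println(child.type + " -> " + child.token);
        }
    }
}
```

With a real grammar, the entry point would first run the generated lexer and parser over the input and then hand the resulting tree to a conversion routine of this kind. Keeping the conversion separate from parsing is one way to keep the language-specific part small.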

Contribution

We believe that, thanks to its extensibility, AstMiner could be valuable to many other researchers. However, our view of potential applications is narrowed by our own work.

Please help make AstMiner easier to use by sharing your potential use cases. We would also appreciate pull requests with code improvements, more usage examples, documentation, etc.