Parse patches on a per-language basis

Question

damevski opened this issue 7 years ago · 3 comments

Short-term goals:

Tokenize changesets (per language)
Remove stop words (per language)
Insert the results into Solr
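
A minimal sketch of the first two goals, assuming the changesets arrive as unified diffs. The `STOP_WORDS` table, the `IDENTIFIER` pattern, and the `tokenize_changeset` helper are hypothetical names made up for illustration; they are not part of the project.

```python
import re

# Hypothetical per-language stop word lists (a few keywords only; extend as needed).
STOP_WORDS = {
    "java": {"public", "private", "static", "void", "int", "return", "new", "class"},
    "python": {"def", "return", "import", "from", "self", "class", "if", "else"},
}

# Assumption: a token is any C-style identifier.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize_changeset(patch, lang):
    """Return lowercase identifier tokens from the added/removed lines of a unified diff."""
    stop = STOP_WORDS.get(lang, set())
    tokens = []
    for line in patch.splitlines():
        # Keep only changed lines; skip context lines, hunk headers, and file headers.
        if not line or line[0] not in "+-" or line.startswith(("+++", "---")):
            continue
        for tok in IDENTIFIER.findall(line[1:]):
            tok = tok.lower()
            if tok not in stop:
                tokens.append(tok)
    return tokens


patch = (
    "--- a/Foo.java\n"
    "+++ b/Foo.java\n"
    "@@ -1 +1 @@\n"
    "-    int count = 0;\n"
    "+    public int total = computeTotal();\n"
)
print(tokenize_changeset(patch, "java"))  # ['count', 'total', 'computetotal']
```

The returned token list is what would then be written to a Solr document field; the indexing step itself is left out here.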

Answer 1 · 2017-12-01T15:59:15.000Z

We could add a few regular expressions specific to each language here. For instance, in most imperative languages we could remove variable names that appear on the left side of an '=' sign. That would help control the noise.
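
A rough sketch of what such a rule could look like, assuming a single regular expression shared by C-like languages. `LHS_NAME` and `drop_assignment_targets` are hypothetical names, and the pattern is only a heuristic: it would also strip keyword-argument names such as `foo(bar=1)` in Python.

```python
import re

# Heuristic: an identifier immediately followed by a plain '=' sign.
# The lookahead rejects '==' and '=>'; operators such as '+=', '<=' and '!='
# do not match because the identifier is not directly followed by '='.
LHS_NAME = re.compile(r"\b[A-Za-z_]\w*(?=\s*=(?![=>]))")


def drop_assignment_targets(line):
    """Remove variable names that sit directly to the left of an '=' sign."""
    return LHS_NAME.sub("", line)


print(drop_assignment_targets("int count = compute(total);"))  # int  = compute(total);
print(drop_assignment_targets("if (a == b) {"))                # if (a == b) {
```

Running this before tokenization would keep assignment targets out of the index while leaving the right-hand side untouched.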

Answer 2 · 2017-12-12T20:15:52.000Z

Updated the issue title to be more accurate. #15 ties into this.

Answer 3 · 2018-01-25T18:14:14.000Z

Use srcML for now.