Repeated optimization passes

Question

Closed this issue 6 years ago · 2 comments

The optimize_bounds method of TextData can isolate a window within the input text to identify a single license chunk. It'd be nice to also find multiple licenses within one file, for example when a project is dual-licensed.

My initial thought:
A new method that calls optimize_bounds repeatedly: store the result of each call, remove (or blank out) the matched text from the original, then run optimize_bounds again on what remains. Repeat until no identifiable text is left (above, say, 0.8 confidence).
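
The loop described above could be sketched roughly as follows. This is illustration only, written in Python for brevity: Match, find_all_licenses, toy_matcher and the best_match callable are all hypothetical names, and best_match merely stands in for whatever wraps optimize_bounds. None of it is the library's actual API. Only the 0.8 threshold comes from the proposal.

```python
from typing import Callable, Dict, List, NamedTuple, Optional


class Match(NamedTuple):
    name: str     # identifier of the matched license
    start: int    # first matched line (inclusive)
    end: int      # one past the last matched line (exclusive)
    score: float  # confidence in [0, 1]


# Hypothetical stand-in for a call built on optimize_bounds:
# given the lines, return the best single match or None.
Matcher = Callable[[List[str]], Optional[Match]]


def find_all_licenses(
    lines: List[str],
    best_match: Matcher,
    threshold: float = 0.8,
    max_passes: int = 10,
) -> List[Match]:
    """Repeatedly match, record the result, blank out the match, retry."""
    remaining = list(lines)  # work on a copy; the caller's list is untouched
    found: List[Match] = []
    for _ in range(max_passes):  # hard cap so a bad matcher cannot loop forever
        match = best_match(remaining)
        if match is None or match.score < threshold or match.end <= match.start:
            break
        found.append(match)
        # Blank out instead of deleting so line numbers in later
        # matches still refer to the original text.
        for i in range(match.start, match.end):
            remaining[i] = ""
    return found


def toy_matcher(known: Dict[str, List[str]]) -> Matcher:
    """Exact-match toy used only to exercise the loop; not a real detector."""
    def best_match(lines: List[str]) -> Optional[Match]:
        for name, body in known.items():
            n = len(body)
            for start in range(len(lines) - n + 1):
                if lines[start:start + n] == body:
                    return Match(name, start, start + n, 1.0)
        return None
    return best_match
```

Blanking the matched lines rather than removing them keeps the reported bounds valid against the original file, at the cost of leaving empty lines for the next pass to skip over.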

Answer 1 · 2018-08-07T20:49:41.000Z

This is starting to happen via "strategies": 05bd6ac

Answer 2 · 2018-08-28T23:00:57.000Z

Closing this as strategies have landed in master. Should make it out with the next release soon. \o/