jpeddicord/askalono

Repeated optimization passes

Closed this issue · 2 comments

The optimize_bounds method of TextData is capable of isolating a window within input text to identify a single license chunk. It'd be nice to also find multiple licenses within a single file, e.g. when a project is dual-licensed.

My initial thought:
A new method that calls optimize_bounds repeatedly: store the result of each call, remove (or blank out) the matched text from the original, then run optimize_bounds again on what's left. Repeat until no remaining text can be identified above some confidence threshold (say, 0.8).
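To make the loop concrete, here is a rough, self-contained sketch of the idea. It does not use askalono's actual API: score, best_window, and find_all are hypothetical names, the Dice coefficient over word sets is only a toy stand-in for the real similarity score, and the brute-force window search stands in for optimize_bounds.

```rust
use std::collections::HashSet;

/// Dice coefficient over the sets of distinct words in `a` and `b`.
/// Toy stand-in for a real similarity score (an assumption, not askalono's).
fn score(a: &[&str], b: &[&str]) -> f32 {
    let sa: HashSet<&str> = a.iter().copied().collect();
    let sb: HashSet<&str> = b.iter().copied().collect();
    if sa.is_empty() || sb.is_empty() {
        return 0.0;
    }
    let shared = sa.intersection(&sb).count() as f32;
    2.0 * shared / (sa.len() + sb.len()) as f32
}

/// Hypothetical brute-force stand-in for `optimize_bounds`: returns the
/// best-scoring window of `text` against `template` as (start, end, score),
/// with `end` exclusive.
fn best_window(text: &[&str], template: &[&str]) -> (usize, usize, f32) {
    let mut best = (0, 0, 0.0_f32);
    for start in 0..text.len() {
        for end in (start + 1)..=text.len() {
            let s = score(&text[start..end], template);
            if s > best.2 {
                best = (start, end, s);
            }
        }
    }
    best
}

/// Repeatedly find the best match, record it, blank it out, and try again
/// until nothing scores at or above `threshold`.
fn find_all<'a>(
    text: &[&'a str],
    templates: &[(&'a str, Vec<&'a str>)],
    threshold: f32,
) -> Vec<(&'a str, f32)> {
    let mut remaining: Vec<&str> = text.to_vec();
    let mut found = Vec::new();
    loop {
        // Best match across all templates for this pass.
        let mut best: Option<(&str, usize, usize, f32)> = None;
        for (name, template) in templates {
            let (start, end, s) = best_window(&remaining, template);
            let improves = best.map_or(true, |b| s > b.3);
            if end > start && s >= threshold && improves {
                best = Some((*name, start, end, s));
            }
        }
        match best {
            Some((name, start, end, s)) => {
                found.push((name, s));
                // Blank out the matched window so it cannot match again.
                for word in &mut remaining[start..end] {
                    *word = "";
                }
            }
            None => break,
        }
    }
    found
}
```

Calling find_all with the words of a file, a list of (name, template words) pairs, and 0.8 as the threshold returns one entry per pass that found a match. Blanking instead of deleting keeps the window offsets stable between passes, which would matter if the match positions need to be reported later.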

This is starting to happen via "strategies": 05bd6ac

Closing this as strategies have landed in master. Should make it out with the next release soon. \o/