Pipelines -- Batching sentences in document parser [ARElight backlog]
nicolay-r opened this issue · 1 comments
nicolay-r commented
This originates from the NER application (nicolay-r/ARElight#118).
The snippet below shows that we apply the text processing pipeline separately to each sentence (`text_parser.run`).
To improve document processing performance, we need to switch from passing a single sentence to passing a list of sentences, i.e. to support batching.
AREkit/arekit/common/docs/parser.py
Lines 19 to 25 in 4c577cb
-
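For illustration, here is a minimal sketch of the difference between per-sentence and batched processing. It is not the AREkit implementation: `iter_batches`, `parse_per_sentence`, `parse_batched`, `run_one`, `run_batch`, and `batch_size` are hypothetical names, and the pipeline is assumed to be a plain callable that takes a list of sentences and returns one result per sentence.

```python
from typing import Any, Callable, Iterator, List, Sequence


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def parse_per_sentence(sentences: Sequence[str],
                       run_one: Callable[[str], Any]) -> List[Any]:
    """Current behaviour (as described above): one pipeline call per sentence."""
    return [run_one(sentence) for sentence in sentences]


def parse_batched(sentences: Sequence[str],
                  run_batch: Callable[[List[str]], List[Any]],
                  batch_size: int = 2) -> List[Any]:
    """Proposed behaviour: one pipeline call per batch of sentences.

    `run_batch` is a hypothetical stand-in for a pipeline that accepts
    a list of sentences and returns results in the same order.
    """
    results: List[Any] = []
    for batch in iter_batches(sentences, batch_size):
        output = run_batch(batch)
        # Guard against a pipeline that drops or merges sentences.
        if len(output) != len(batch):
            raise ValueError("pipeline must return one result per sentence")
        results.extend(output)
    return results


if __name__ == "__main__":
    sentences = ["first sentence", "second", "third one here"]
    # Toy pipelines: whitespace tokenization, single and batched.
    per_sentence = parse_per_sentence(sentences, str.split)
    batched = parse_batched(sentences, lambda b: [s.split() for s in b])
    print(per_sentence == batched)
```

Both functions should return the same results in the same order; the batched variant only reduces the number of pipeline calls, which matters when each call has a fixed overhead (for example, a model forward pass in NER).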
❌ These parameters could be removed:
AREkit/arekit/common/docs/parser.py
Lines 31 to 32 in 4c577cb
-
The following is actually required and refers to the related parameter in the context: