✨ Sentencizer

Sentencizer is an executor class that splits text into sentences.

Table of Contents

🌱 Prerequisites
🚀 Usages
🎉️ Example
🔍️ Reference

🌱 Prerequisites

None

🚀 Usages

🚚 Via JinaHub

using docker images

Use the prebuilt image from JinaHub in your Python code,

from jina import Flow

f = Flow().add(uses='jinahub+docker://Sentencizer')

or in the .yml config.

jtype: Flow
pods:
  - name: sentencizer
    uses: 'jinahub+docker://Sentencizer'

using source code

Use the source code from JinaHub in your Python code,

from jina import Flow

f = Flow().add(uses='jinahub://Sentencizer')

or in the .yml config.

jtype: Flow
pods:
  - name: sentencizer
    uses: 'jinahub://Sentencizer'

📦️ Via PyPI

Install the jinahub-text-sentencizer package.

pip install git+https://github.com/jina-ai/executor-text-sentencizer.git

Use jinahub-text-sentencizer in your code

from jina import Flow
from jinahub.text.sentencizer import Sentencizer

f = Flow().add(uses=Sentencizer)

🐳 Via Docker

Clone the repo and build the Docker image

git clone https://github.com/jina-ai/executor-text-sentencizer.git
cd executor-text-sentencizer
docker build -t sentencizer .

Use the sentencizer image in your code

from jina import Flow

f = Flow().add(uses='docker://sentencizer:latest')

🎉️ Example

from jina import Flow, Document

f = Flow().add(uses='jinahub+docker://Sentencizer')

with f:
    resp = f.post(on='foo', inputs=Document(text='Hello. World.'), return_results=True)
    print(f'{resp}')

Inputs

A Document whose text contains two sentences, each ending with a period: 'Hello. World.'

Returns

A Document with two chunk Documents. The first chunk contains text='Hello.' and the second chunk contains text='World.'
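
To make the input/output relationship concrete, here is a minimal, self-contained sketch of punctuation-based sentence splitting. It is not the executor's actual implementation: the function name split_sentences, the punct_chars parameter, and the choice of '.', '!' and '?' as sentence terminators are all assumptions made for illustration. The real executor returns chunk Documents rather than plain strings.

```python
import re


def split_sentences(text, punct_chars=".!?"):
    """Split text into sentences, keeping the closing punctuation.

    Hypothetical helper for illustration only; not the Sentencizer API.
    """
    chars = re.escape(punct_chars)
    # A run of non-terminator characters followed by any terminators.
    pattern = re.compile(r"\s*([^{0}]+[{0}]*)".format(chars))
    sentences = []
    for match in pattern.finditer(text):
        sentence = match.group(1).strip()
        if sentence:  # skip whitespace-only fragments
            sentences.append(sentence)
    return sentences


print(split_sentences("Hello. World."))  # ['Hello.', 'World.']
```

Each returned string corresponds to the text of one chunk in the example above. A trailing fragment without closing punctuation is kept as its own sentence in this sketch; the executor may treat that case differently.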

🔍️ Reference

Used in the multires-lyrics-search example in: https://github.com/jina-ai/examples

mapleeit/executor-text-sentencizer