
Bounding box coalescence algorithms. Coalesce word-level bounding boxes into sentences, paragraphs, tables, and more.


coalbox — coalesce boxes.

The Sight API is a text recognition service. It provides word-level bounding boxes in response to uploaded PDF documents and images.

This repository provides a Go library of bounding box coalescence algorithms: it is a toolkit of functions which take word-level bounding boxes as input and return coalesced — e.g., sentence-level or paragraph-level — bounding boxes.

Take, for example, this image:

Here is the same image with word-level bounding boxes drawn on top:

After coalescing at the sentence-level...

import "github.com/siftrics/coalbox"


sentences := coalbox.ToSentences(boundingBoxes)

...the bounding boxes are much easier to work with:


import "github.com/siftrics/coalbox"

The function you will use:

func ToSentences(bbs []BoundingBox) []BoundingBox

The BoundingBox type:

type BoundingBox struct {
	Text                                                 string
	TopLeftX, TopLeftY, TopRightX, TopRightY             int
	BottomLeftX, BottomLeftY, BottomRightX, BottomRightY int
	Confidence                                           float64
}

If you're working with image.Rectangle, two functions are exposed to help:

func BoxFromRectangle(r image.Rectangle) BoundingBox
func BoxesFromRectangles(rs []image.Rectangle) []BoundingBox

In the resulting sentence-level bounding boxes, the Text field is created by joining the Text of the word-level bounding boxes with the space character, " ". The Confidence field is the simple average of the word-level confidences.
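
The Text and Confidence rules described above can be sketched as follows. This is only an illustration of those two rules, not the library's code: `mergeTextAndConfidence` is a hypothetical helper, the `BoundingBox` type is copied locally so the example compiles on its own, and returning an empty string and zero for empty input is an assumption made here to avoid dividing by zero.

```go
package main

import (
	"fmt"
	"strings"
)

// BoundingBox mirrors the struct shown in this README.
type BoundingBox struct {
	Text                                                 string
	TopLeftX, TopLeftY, TopRightX, TopRightY             int
	BottomLeftX, BottomLeftY, BottomRightX, BottomRightY int
	Confidence                                           float64
}

// mergeTextAndConfidence is a hypothetical helper, not part of coalbox.
// It joins the word texts with a single space and averages the confidences.
func mergeTextAndConfidence(words []BoundingBox) (string, float64) {
	if len(words) == 0 {
		return "", 0 // assumption: empty input yields zero values
	}
	texts := make([]string, len(words))
	sum := 0.0
	for i, w := range words {
		texts[i] = w.Text
		sum += w.Confidence
	}
	return strings.Join(texts, " "), sum / float64(len(words))
}

func main() {
	words := []BoundingBox{
		{Text: "Hello", Confidence: 0.5},
		{Text: "world.", Confidence: 1.0},
	}
	text, conf := mergeTextAndConfidence(words)
	fmt.Printf("%q %.2f\n", text, conf) // prints "Hello world." 0.75
}
```

Because the average is unweighted, a short low-confidence word lowers the sentence confidence as much as a long one does.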

This repository is licensed under the Apache License, Version 2.0. You can view the full text of the license in the file named "LICENSE".