- Python3 with Cython
- PyTorch 0.4 :
conda install pytorch=0.4 -c pytorch
- OpenCV 3 :
conda install opencv
- scikit-image :
conda install -c conda-forge scikit-image
- All in requirements.txt :
pip install -r requirements.txt
Recommended :
- Java (Runtime Environment) version 8 or later, to enable the Transkribus Baseline Evaluation Scheme.
SOCR Line was created with Cython. To compile it, run :
python3 setup.py build_ext --inplace
To train the network, run :
python3 train.py --icdartrain [train_path]
To enable testing during training, use the --icdartest command-line argument.
Use the --help argument to list the other options, such as the batch size or the learning rate.
usage: train.py [-h] [--name NAME] [--lr LR] [--overlr] [--bs BS]
[--losstype LOSSTYPE] [--thicknesses THICKNESSES]
[--hystmin HYSTMIN] [--hystmax HYSTMAX] [--expdecay EXPDECAY]
[--heightimportance HEIGHTIMPORTANCE]
[--weightdecay WEIGHTDECAY] [--epochlimit EPOCHLIMIT]
[--bnmomentum BNMOMENTUM] [--disablecuda]
[--icdartrain ICDARTRAIN] [--icdartest ICDARTEST]
[--generated]
optional arguments:
-h, --help show this help message and exit
--name NAME
--lr LR Learning rate
--overlr Override the learning rate
--bs BS The batch size
--losstype LOSSTYPE The loss type. Ex : mse, bce, norm
--thicknesses THICKNESSES Line thicknesses in the document
--hystmin HYSTMIN Hysteresys thresholding minimum
--hystmax HYSTMAX Hysteresys thresholding maximum
--expdecay EXPDECAY Exponential decay
--heightimportance HEIGHTIMPORTANCE Height prediction importance during the training
--weightdecay WEIGHTDECAY Weight decay
--epochlimit EPOCHLIMIT Limit the number of epoch
--bnmomentum BNMOMENTUM BatchNorm Momentum
--disablecuda Disable cuda
--icdartrain ICDARTRAIN Path to the ICDAR Training set
--icdartest ICDARTEST Path to the ICDAR Testing set
--generated Enable generated data
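The --hystmin and --hystmax options refer to hysteresis thresholding. As a rough illustration of the general technique (not the SOCR Line implementation), the sketch below keeps every pixel above the high threshold, plus the pixels above the low threshold that are connected to one of them. The function name, the 8-connectivity and the strict comparisons are assumptions made for this example.

```python
from collections import deque


def hysteresis_threshold(image, low, high):
    """Return a boolean mask of pixels kept by hysteresis thresholding.

    `image` is a list of rows of floats. This is an illustrative helper,
    not a function from SOCR Line.
    """
    rows, cols = len(image), len(image[0])
    mask = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Seed the search with every "strong" pixel (value above the high threshold).
    for r in range(rows):
        for c in range(cols):
            if image[r][c] > high:
                mask[r][c] = True
                queue.append((r, c))

    # Grow each seed into its 8-connected "weak" neighbours
    # (value above the low threshold). 8-connectivity is an assumption.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not mask[nr][nc] and image[nr][nc] > low):
                    mask[nr][nc] = True
                    queue.append((nr, nc))
    return mask


prediction = [
    [0.0, 0.4, 0.9, 0.4, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.4, 0.4, 0.0, 0.0, 0.4],
]
mask = hysteresis_threshold(prediction, low=0.3, high=0.7)
```

In this toy prediction, the weak pixels in the first row are kept because they touch the strong pixel, while the weak pixels in the last row are discarded because no strong pixel is connected to them.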
To evaluate the network, run the following command, where path is a directory or an image file :
python3 evaluate.py path
The result file will be created in the result folder, in the socr-line directory.
This is the link to ICDAR Complex Dataset :
If you want to enable testing during training, you have to split the dataset into a training part and a test part yourself.
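One possible way to do the split is sketched below. It assumes a flat source directory in which an image and its annotation file share the same file name stem; this layout, the function name split_dataset and the default 20% test ratio are assumptions for illustration, so the grouping may need to be adapted to the actual layout of the dataset.

```python
import random
import shutil
from pathlib import Path


def split_dataset(source_dir, train_dir, test_dir, test_ratio=0.2, seed=0):
    """Copy the files of source_dir into train_dir and test_dir.

    Illustrative helper, not part of SOCR Line. Files are grouped by stem
    so that an image and its annotation end up in the same part.
    Returns the number of (train, test) groups.
    """
    groups = {}
    for path in sorted(Path(source_dir).iterdir()):
        if path.is_file():
            groups.setdefault(path.stem, []).append(path)

    # Shuffle with a fixed seed so that the split is reproducible.
    stems = sorted(groups)
    random.Random(seed).shuffle(stems)
    test_count = int(len(stems) * test_ratio)
    test_stems = set(stems[:test_count])

    for stem, paths in groups.items():
        target = Path(test_dir if stem in test_stems else train_dir)
        target.mkdir(parents=True, exist_ok=True)
        for path in paths:
            shutil.copy2(path, target / path.name)

    return len(stems) - test_count, test_count
```

The two resulting directories can then be passed to --icdartrain and --icdartest.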
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset


class MyCustomDataset(Dataset):

    def __init__(self, path, loss=None):
        # The loss object is used to encode the ground truth (see document_to_ytrue)
        self.loss = loss
        ...

    def __getitem__(self, index):
        image_path, regions = self.labels[index % len(self.labels)]
        image = Image.open(image_path).convert('RGB')
        width, height = image.size
        ...
        label = self.loss.document_to_ytrue(np.array([width, height], dtype='int32'), np.array(regions, dtype='int32'))
        image = np.array(image, dtype='float') / 255.0
        return torch.from_numpy(image), torch.from_numpy(label)

    def __len__(self):
        return len(self.labels)
Just like a normal PyTorch model :
import torch


class MyCustomModel(torch.nn.Module):

    def __init__(self):
        super(MyCustomModel, self).__init__()
        self.conv = torch.nn.Conv2d(3, 2, kernel_size=7, padding=3, stride=2, bias=False)

    def forward(self, input):
        input = self.conv(input)
        return input

    def create_loss(self):
        # Loss object used to train this model
        return MyCustomLoss()
import torch


class XHeightCCLoss(torch.nn.Module):
    """An absolute position Loss"""

    def __init__(self):
        """Create the MSE criterion and the baseline encoder and decoder."""
        super().__init__()
        self.mse = torch.nn.MSELoss()
        self.decoder = BaselineDecoder()
        self.encoder = BaselineEncoder()

    def forward(self, predicted, y_true):
        # Swap the first two dimensions of both tensors before computing the MSE
        predicted = predicted.permute(1, 0, 2, 3).contiguous()
        y_true = y_true.permute(1, 0, 2, 3).contiguous()
        return self.mse(predicted, y_true)

    def document_to_ytrue(self, image_size, base_lines):
        # Encode the baselines of a document into the training target
        return self.encoder.encode(image_size, base_lines)

    def ytrue_to_lines(self, image, predicted, with_images=True):
        # Decode a prediction back into lines
        return self.decoder.decode(image, predicted, with_images, degree=3, brut_points=True)
Use the --generated argument to use generated documents together with the ICDAR dataset.
To generate documents with handwritten text, you will need to download the IAM dataset (IAM Handwriting Database). At initialization, call init_iam_handwriting_line_dataset from scribbler.ressources.ressources_helper with the path to the IAM dataset.