Add TableTransformerImageProcessor
NielsRogge opened this issue · 3 comments
Feature request
The Table Transformer is a model with basically the same architecture as DETR.
Now, when people do this:
from transformers import AutoImageProcessor
processor = AutoImageProcessor.from_pretrained("microsoft/table-transformer-detection")
print(type(processor))
this will print DetrImageProcessor.
However, Table Transformer has some specific image processing settings which aren't exactly the same as in DETR:
from torchvision import transforms
class MaxResize:
    """Resize a PIL image so that its longest side equals max_size."""

    def __init__(self, max_size=800):
        self.max_size = max_size

    def __call__(self, image):
        width, height = image.size
        current_max_size = max(width, height)
        # a single scale factor for both sides keeps the aspect ratio
        scale = self.max_size / current_max_size
        resized_image = image.resize((int(round(scale * width)), int(round(scale * height))))
        return resized_image

# this is required for the table detection models
detection_transform = transforms.Compose([
    MaxResize(800),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
# this is required for the table structure recognition models
structure_transform = transforms.Compose([
    MaxResize(1000),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
Hence we could create a separate TableTransformerImageProcessor which replicates this.
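To make the numbers concrete, here is a minimal pure-Python sketch of what the two transforms above compute, without PIL or torchvision. The names `MAX_SIZE`, `max_resize_dims` and `normalize_pixel` are hypothetical helpers for illustration only; the sizes, mean and std are copied from the snippet above. It is not the proposed processor implementation, just the arithmetic it would need to reproduce.

```python
# Hypothetical helpers for illustration; only the numeric settings
# (800 / 1000, mean, std) come from the transforms in this issue.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

# Longest-side target per model type, as in MaxResize(800) / MaxResize(1000).
MAX_SIZE = {"detection": 800, "structure": 1000}


def max_resize_dims(width, height, max_size):
    """Return (new_width, new_height) with the longest side equal to max_size."""
    scale = max_size / max(width, height)
    return int(round(scale * width)), int(round(scale * height))


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Mimic ToTensor + Normalize for a single 8-bit RGB pixel."""
    # ToTensor scales 0..255 to 0..1, Normalize then applies (x - mean) / std.
    return tuple((value / 255.0 - m) / s for value, m, s in zip(rgb, mean, std))


detection_size = max_resize_dims(1600, 1200, MAX_SIZE["detection"])
structure_size = max_resize_dims(1600, 1200, MAX_SIZE["structure"])
white = normalize_pixel((255, 255, 255))

print(detection_size, structure_size)  # (800, 600) (1000, 750)
print(white)
```

Note that, as written, the resize always maps the longest side to the target, so an image smaller than the target (say 400x200 for the detection model) is enlarged rather than left alone.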
Motivation
It would be great to replicate the original preprocessing settings exactly.
Your contribution
I could work on this, but it would be great if someone else could take this up.
@NielsRogge, will do that.
Great, see https://github.com/microsoft/table-transformer/blob/16d124f616109746b7785f03085100f1f6247575/src/inference.py#L39-L49, as there's a difference between the detection model and the structure recognition models.
@NielsRogge just to reconfirm: we need an image_processing_table_transformer module defining TableTransformerImageProcessor, which applies the Table Transformer-specific transforms for structure recognition and detection.
Are there any other specifics or differences apart from that? I will try to find them anyway.