About the source file for Textract
Larbo53 opened this issue · 2 comments
Good morning,
Here is the call I use to extract data from an image. The variable test, which I pass as 'Name', always holds the same file name ('img.png'), and response always returns the same result whatever the content of that file.
How can I get the result for the new content of the file? Do I have to change the name of the source file every time?
Thanks for your feedback.
response = textractmodule.detect_document_text(
    Document={
        'S3Object': {
            'Bucket': s3BucketName,
            'Name': test
        }
    })
Is this a question about using Textract or the trp?
Sorry for my late reply.
I now use the code below, which sends the image bytes directly instead of storing the document on S3, and it works.
Thank you.
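import io
import boto3
from PIL import Image

# path is a directory prefix defined elsewhere
im = Image.open(path + "image.png")
# Re-encode the image as PNG into an in-memory buffer
buffered = io.BytesIO()
im.save(buffered, format='PNG')
width, height = im.size
client = boto3.client('textract')
# Send the raw bytes in the request; no S3 upload is needed
response = client.analyze_document(
    Document={'Bytes': buffered.getvalue()},
    FeatureTypes=['TABLES']
)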
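
For anyone landing here with the same question, here is a minimal sketch of the bytes-based approach applied to the original detect_document_text call. It re-reads the file from disk on every call, so the request carries whatever the file currently contains even when the file name never changes. The helper names load_document, extract_lines and detect_text are made up for illustration and are not part of boto3 or trp; the sketch also assumes AWS credentials and a region are already configured.

```python
from pathlib import Path


def load_document(path):
    """Build the Document argument from the file's current bytes.

    The file is read on every call, so a reused file name still
    yields the latest content.
    """
    return {"Bytes": Path(path).read_bytes()}


def extract_lines(response):
    """Return the text of every LINE block in a Textract response dict."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def detect_text(path):
    """Send the file's bytes to Textract and return the detected lines.

    Hypothetical wrapper: requires boto3 plus configured AWS credentials.
    """
    import boto3  # imported here so the helpers above work without boto3

    client = boto3.client("textract")
    response = client.detect_document_text(Document=load_document(path))
    return extract_lines(response)
```

Calling detect_text("img.png") after overwriting img.png should then reflect the new image, since nothing is read from a previously stored copy.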