[BUG] Unsupported mime type application/csv - documents.consumer.ConsumerError
nodecentral opened this issue · 0 comments
Describe the bug
Checking the logs after a batch import (that is, moving a folder of files into the consume directory for processing via the consumer polling option), I found that Paperless was unable to process a few of them. This bug report highlights how it handled two CSV files.
Both files open and render as CSV outside of Paperless without any problem.
I have checked the file's MIME type via the command line:
[/share/Container/paperless/consume] # file --mime-type -b btbill20090108ref5644200907240737.csv
text/csv
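To repeat the check above for every CSV in a folder, something like the following sketch could be used. It only wraps the same `file --mime-type -b` call; the helper names `detect_mime` and `scan` are made up for illustration. Running it both on the host and inside the webserver container might show whether the two environments disagree about the type, though that is an assumption and not something confirmed in this report.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Dict, Optional


def detect_mime(path: Path) -> Optional[str]:
    """Return what `file --mime-type -b` reports, or None if `file` is not installed."""
    if shutil.which("file") is None:
        return None
    result = subprocess.run(
        ["file", "--mime-type", "-b", str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def scan(directory: Path, pattern: str = "*.csv") -> Dict[str, Optional[str]]:
    """Map each matching file name in `directory` to its detected MIME type."""
    return {p.name: detect_mime(p) for p in sorted(directory.glob(pattern))}


if __name__ == "__main__":
    # The consume path from this report; adjust it when running inside the container.
    for name, mime in scan(Path("/share/Container/paperless/consume")).items():
        print(f"{name}: {mime}")
```

Any file reported as something other than `text/csv` would be a candidate for the error shown in the logs.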
To Reproduce
Example file attached (this was the csv download option provided by the telephone supplier)
btbill20090108ref5644200907240737.csv
Expected behavior
I’m not entirely sure what Paperless would do with a CSV file; I assumed it would use OCR and import it without an error.
Screenshots
Webserver logs
btbill20090108ref5644200901133355.csv: Unsupported mime type application/csv : Traceback (most recent call last):
  File "/usr/src/paperless/src/src/django-q/django_q/cluster.py", line 454, in worker
    res = f(*task["args"], **task["kwargs"])
  File "/usr/src/paperless/src/documents/tasks.py", line 154, in consume_file
    document = Consumer().try_consume_file(
  File "/usr/src/paperless/src/documents/consumer.py", line 281, in try_consume_file
    self._fail(MESSAGE_UNSUPPORTED_TYPE, f"Unsupported mime type {mime_type}")
  File "/usr/src/paperless/src/documents/consumer.py", line 90, in _fail
    raise ConsumerError(f"{self.filename}: {log_message or message}") from exception
documents.consumer.ConsumerError: btbill20090108ref5644200901133355.csv: Unsupported mime type application/csv
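The traceback suggests the consumer rejects the file because the detected type, `application/csv`, is not one it recognises, even though `text/csv` might be. The sketch below is not Paperless code. It only illustrates, under that assumption, how an exact-match lookup would fail for one alias of CSV and succeed for another. `SUPPORTED_TYPES`, `UnsupportedMimeType` and `extension_for` are hypothetical names, and the table contents are invented for the example.

```python
from typing import Dict

# Hypothetical table of accepted types; the real list lives in Paperless's parsers.
SUPPORTED_TYPES: Dict[str, str] = {
    "text/plain": ".txt",
    "text/csv": ".csv",
}


class UnsupportedMimeType(Exception):
    """Raised when a detected MIME type has no entry in SUPPORTED_TYPES."""


def extension_for(mime_type: str) -> str:
    """Return the default extension for an exactly matching MIME type."""
    try:
        return SUPPORTED_TYPES[mime_type]
    except KeyError:
        raise UnsupportedMimeType(f"Unsupported mime type {mime_type}") from None


if __name__ == "__main__":
    print(extension_for("text/csv"))
    try:
        extension_for("application/csv")
    except UnsupportedMimeType as err:
        print(err)
```

With a lookup like this, a file detected as `application/csv` fails even though its content is the same kind of data as a `text/csv` file.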
Relevant information
- Host OS of the machine running paperless: [QNAP Linux]
- Browser [iOS Safari]
- Version [1.9.2]
- Installation method: [Docker]
- Any configuration changes you made : [docker compose below]
version: "3.6"
services:
  redis:
    image: redis:6.2
    container_name: paperless-redis
    restart: always
    volumes:
      - /share/Container/paperlessredis:/data
  db:
    image: postgres:14
    container_name: paperless-db
    restart: always
    volumes:
      - /share/Container/paperlessdb:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: paperless
  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx
    container_name: paperlessngx
    restart: always
    privileged: true
    depends_on:
      - db
      - redis
      - gotenberg
      - tika
    ports:
      - 8777:8000
    volumes:
      - /share/Container/paperless/data:/usr/src/paperless/data
      - /share/Container/paperless/media:/usr/src/paperless/media
      - /share/Container/paperless/export:/usr/src/paperless/export
      - /share/Container/paperless/consume:/usr/src/paperless/consume
    environment:
      PAPERLESS_REDIS: redis://redis:6379
      PAPERLESS_DBHOST: db
      USERMAP_UID: 1005
      USERMAP_GID: 1000
      PAPERLESS_TIME_ZONE: Europe/London
      PAPERLESS_ADMIN_USER: chris
      PAPERLESS_ADMIN_PASSWORD: chrishosting
      PAPERLESS_CONSUMER_RECURSIVE: true
      PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS: true
      PAPERLESS_CONSUMER_POLLING: 5
      PAPERLESS_OCR_LANGUAGE: eng
      PAPERLESS_TIKA_ENABLED: 1
      #PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000/forms/libreoffice/convert#
      PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000
      PAPERLESS_TIKA_ENDPOINT: http://tika:9998
  gotenberg:
    image: gotenberg/gotenberg
    restart: always
    container_name: gotenberg
    ports:
      - 3044:3000
    command:
      - "gotenberg"
      - "--chromium-disable-routes=true"
  tika:
    image: ghcr.io/paperless-ngx/tika
    container_name: tika
    ports:
      - 9998:9998
    restart: always