[BUG] After migration from paperless, "Checksum mismatch" prevents adding new files
danieldietsch opened this issue · 0 comments
Describe the bug
After a bare-metal install of paperless-ng, I copied media/ and data/ from the old paperless installation and ran
python manage.py migrate followed by python manage.py document_index reindex, both without error.
I then ran python manage.py document_sanity_checker and received many errors of the form
...
[2022-02-26 23:56:46,981] [ERROR] [paperless.sanity_checker] Checksum mismatch of document 141. Stored: 8c84a88c095f6c4491d588666fef36b2, actual: af8807e9262235b101a4c42dc4e6c1e8.
...
Actual md5sum is indeed af8807e9262235b101a4c42dc4e6c1e8 for media/documents/originals/0000141.pdf
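For anyone who wants to repeat this check without the md5sum tool, here is a minimal sketch in Python. The function name md5_of_file is made up for illustration and is not part of paperless-ng. It assumes only that the stored checksum is a plain MD5 hex digest of the file contents, which the md5sum result above suggests.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks.

    Hypothetical helper, not part of paperless-ng.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large scans do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_of_file("media/documents/originals/0000141.pdf") should give the same value as md5sum, which can then be compared with the "Stored" value in the sanity checker output.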
Viewing and modifying metadata works, but when I try to add a new file, the task scheduler reports
[2022-02-27 00:11:24,269] [INFO] [paperless.consumer] Document 2022-01-30 ... consumption finished
00:11:24 [Q] INFO Process-1:1 stopped doing work
00:11:24 [Q] INFO Processed [scan_flachbett.pdf]
[2022-02-27 00:11:24,297] [ERROR] [paperless.consumer] The following error occured while consuming scan_flachbett.pdf: UNIQUE constraint failed: documents_document.checksum
Traceback (most recent call last):
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 423, in execute
return Database.Cursor.execute(self, query, params)
sqlite3.IntegrityError: UNIQUE constraint failed: documents_document.checksum
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<dir>/paperless-ng/src/documents/consumer.py", line 287, in try_consume_file
document = self._store(
File "<dir>/paperless-ng/src/documents/consumer.py", line 382, in _store
document = Document.objects.create(
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/query.py", line 453, in create
obj.save(force_insert=True, using=self.db)
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/base.py", line 726, in save
self.save_base(using=using, force_insert=force_insert,
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/base.py", line 763, in save_base
updated = self._save_table(
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/base.py", line 868, in _save_table
results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/base.py", line 906, in _do_insert
return manager._insert(
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/query.py", line 1270, in _insert
return query.get_compiler(using=using).execute_sql(returning_fields)
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1416, in execute_sql
cursor.execute(sql, params)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 98, in execute
return super().execute(sql, params)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
return executor(sql, params, many, context)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "<dir>/venv/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 423, in execute
return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: UNIQUE constraint failed: documents_document.checksum
[2022-02-27 00:11:24,303] [DEBUG] [paperless.parsing.tesseract] Deleting directory /tmp/paperless/paperless-1ljxtjms
00:11:24 [Q] INFO Process-1:2 stopped doing work
00:11:24 [Q] ERROR Failed [scan_flachbett.pdf] - scan_flachbett.pdf: The following error occured while consuming scan_flachbett.pdf: UNIQUE constraint failed: documents_document.checksum : Traceback (most recent call last):
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 423, in execute
return Database.Cursor.execute(self, query, params)
sqlite3.IntegrityError: UNIQUE constraint failed: documents_document.checksum
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<dir>/venv/lib/python3.9/site-packages/asgiref/sync.py", line 288, in main_wrap
raise exc_info[1]
File "<dir>/paperless-ng/src/documents/consumer.py", line 287, in try_consume_file
document = self._store(
File "<dir>/paperless-ng/src/documents/consumer.py", line 382, in _store
document = Document.objects.create(
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/query.py", line 453, in create
obj.save(force_insert=True, using=self.db)
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/base.py", line 726, in save
self.save_base(using=using, force_insert=force_insert,
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/base.py", line 763, in save_base
updated = self._save_table(
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/base.py", line 868, in _save_table
results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/base.py", line 906, in _do_insert
return manager._insert(
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/query.py", line 1270, in _insert
return query.get_compiler(using=using).execute_sql(returning_fields)
File "<dir>/venv/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1416, in execute_sql
cursor.execute(sql, params)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 98, in execute
return super().execute(sql, params)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
return executor(sql, params, many, context)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "<dir>/venv/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "<dir>/venv/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 423, in execute
return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: UNIQUE constraint failed: documents_document.checksum
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<dir>/venv/lib/python3.9/site-packages/django_q/cluster.py", line 432, in worker
res = f(*task["args"], **task["kwargs"])
File "<dir>/paperless-ng/src/documents/tasks.py", line 74, in consume_file
document = Consumer().try_consume_file(
File "<dir>/paperless-ng/src/documents/consumer.py", line 346, in try_consume_file
self._fail(
File "<dir>/paperless-ng/src/documents/consumer.py", line 70, in _fail
raise ConsumerError(f"{self.filename}: {log_message or message}")
documents.consumer.ConsumerError: scan_flachbett.pdf: The following error occured while consuming scan_flachbett.pdf: UNIQUE constraint failed: documents_document.checksum
and does not add the file.
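To narrow down which existing row the insert collides with, one could look up the checksum directly in the SQLite database. This is only a diagnostic sketch: the table and column documents_document.checksum come from the error message, the id column is an assumption (Django's default primary key), and the function name is hypothetical.

```python
import sqlite3


def documents_with_checksum(conn, checksum):
    """Return the ids of all rows in documents_document with this checksum.

    Assumes an `id` primary key column; only `checksum` is confirmed by the
    error message in this report.
    """
    rows = conn.execute(
        "SELECT id FROM documents_document WHERE checksum = ? ORDER BY id",
        (checksum,),
    )
    return [row[0] for row in rows]
```

Here conn would be a sqlite3 connection to a copy of the paperless database file, and checksum the MD5 hex digest of the file that failed to consume.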
Expected behavior
Either no checksum errors after migration, or a method to recompute the checksums during migration (I guess the old paperless did not have the UNIQUE constraint on checksums?).
I am not sure whether this is actually a bug, but since paperless still works without issue, it seems like something that could be avoided.
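A recompute step of the kind I have in mind might look roughly like the sketch below. It is not an existing paperless-ng command. It talks to SQLite directly instead of going through the Django ORM, assumes an id primary key column, and assumes originals are named by zero-padded id as in 0000141.pdf above. Both function names are hypothetical.

```python
import hashlib
import sqlite3
from pathlib import Path


def find_stale_checksums(conn, originals_dir):
    """Return (doc_id, stored, actual) for each row whose file hash differs."""
    stale = []
    rows = conn.execute(
        "SELECT id, checksum FROM documents_document ORDER BY id"
    ).fetchall()
    for doc_id, stored in rows:
        # Assumed naming scheme: seven-digit zero-padded id, e.g. 0000141.pdf.
        path = Path(originals_dir) / f"{doc_id:07d}.pdf"
        if not path.is_file():
            continue  # Missing originals are left for the sanity checker.
        actual = hashlib.md5(path.read_bytes()).hexdigest()
        if actual != stored:
            stale.append((doc_id, stored, actual))
    return stale


def apply_checksums(conn, stale):
    """Write recomputed checksums; return ids skipped due to UNIQUE clashes."""
    skipped = []
    for doc_id, _stored, actual in stale:
        try:
            # Each update runs in its own transaction and rolls back on error.
            with conn:
                conn.execute(
                    "UPDATE documents_document SET checksum = ? WHERE id = ?",
                    (actual, doc_id),
                )
        except sqlite3.IntegrityError:
            # Another row already holds this checksum (duplicate file content).
            skipped.append(doc_id)
    return skipped
```

Running find_stale_checksums first gives a dry-run report. apply_checksums should only be run on a backed-up copy of the database, and any skipped ids point to documents with identical file content that would need to be resolved by hand.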