openvstorage/alba

Access to object with missing fragments results in `Proxy_protocol.Protocol.Error.ObjectDoesNotExist`

Closed this issue · 3 comments

  1. successful read (stor-04/syslog.1):
stor-04/syslog.1:Feb 15 23:58:33 stor-04 volumedriver_fs.sh[27660]: 2017-02-15 23:58:33 723791 +0100 - stor-04.be-g8-4 - 27660/0x00007f9e277fe700 - volumedriverfs/BackendConnectionInterfaceLogger - 00000000001c504e - info - Logger: Entering read 5e2d0398-588e-4670-aac3-9278d0432480 tlog_2643acb8-3c59-4514-9629-21ea5b68af66
stor-04/syslog.1:Feb 15 23:58:33 stor-04 volumedriver_fs.sh[27660]: 2017-02-15 23:58:33 760468 +0100 - stor-04.be-g8-4 - 27660/0x00007f9e277fe700 - volumedriverfs/BackendConnectionInterfaceLogger - 00000000001c5052 - info - ~Logger: Exiting read for 5e2d0398-588e-4670-aac3-9278d0432480 tlog_2643acb8-3c59-4514-9629-21ea5b68af66
[...]
  1. First failed read (stor-04/syslog) returning ObjectDoesNotExist after detecting missing fragments:
Feb 16 15:16:28 stor-04 volumedriver_fs.sh[27660]: 2017-02-16 15:16:28 169183 +0100 - stor-04.be-g8-4 - 27660/0x00007f9f090bc700 - volumedriverfs/BackendConnectionInterfaceLogger - 0000000000231a0b - info - Logger: Entering read 5e2d0398-588e-4670-aac3-9278d0432480 tlog_2643acb8-3c59-4514-9629-21ea5b68af66
Feb 16 15:16:28 stor-04 alba[7556]: 2017-02-16 15:16:28 170629 +0100 - stor-04.be-g8-4 - 7556/0 - alba/proxy - 10322960 - warning - Detected missing fragment namespace_id=652 object_name=" \000\000\000\185\242c\232\249\255\151\216y\212>;\129\180\251q\200\201:\173~&\205\154\234+\176a\2345\163\005\000\000\000\000\n\000\000\000" object_id="\130\137w\213x\014\165v\20846\"D\018\026\146Q\159x\184\232o\238?\030\220\006\017\029\230X7" osd_id=13 (chunk,fragment,version)=(0,1,0)
[...]
Feb 16 15:16:28 stor-04 alba[7556]: 2017-02-16 15:16:28 174302 +0100 - stor-04.be-g8-4 - 7556/0 - alba/proxy - 10322978 - warning - Detected missing fragment namespace_id=1014 object_name="tlog_2643acb8-3c59-4514-9629-21ea5b68af66" object_id="\185\242c\232\249\255\151\216y\212>;\129\180\251q\200\201:\173~&\205\154\234+\176a\2345\163\005" osd_id=6 (chunk,fragment,version)=(0,10,0)
Feb 16 15:16:28 stor-04 alba[7556]: 2017-02-16 15:16:28 174317 +0100 - stor-04.be-g8-4 - 7556/0 - alba/proxy - 10322979 - warning - could not receive enough fragments for namespace 1014, object "tlog_2643acb8-3c59-4514-9629-21ea5b68af66" ("\185\242c\232\249\255\151\216y\212>;\129\180\251q\200\201:\173~&\205\154\234+\176a\2345\163\005") chunk 0; got 0 while 12 needed
Feb 16 15:16:28 stor-04 alba[7556]: 2017-02-16 15:16:28 174330 +0100 - stor-04.be-g8-4 - 7556/0 - alba/proxy - 10322980 - info - retrying : Alba_client_errors.Error.Exn(8); backtrace:; Raised at file "src/alba_client_errors.ml", line 34, characters 25-32; Called from file "src/alba_client_download.ml", line 385, characters 6-45; Called from file "src/core/lwt.ml", line 653, characters 66-69
Feb 16 15:16:28 stor-04 alba[7556]: 2017-02-16 15:16:28 182652 +0100 - stor-04.be-g8-4 - 7556/0 - alba/proxy - 10322981 - error - Returning Proxy_protocol.Protocol.Error.ObjectDoesNotExist error to client
Feb 16 15:16:28 stor-04 alba[7556]: 2017-02-16 15:16:28 182731 +0100 - stor-04.be-g8-4 - 7556/0 - alba/proxy - 10322982 - error - Request ReadObjectFs ("5e2d0398-588e-4670-aac3-9278d0432480","tlog_2643acb8-3c59-4514-9629-21ea5b68af66",_,_,_) errored and took 0.013453
Feb 16 15:16:28 stor-04 volumedriver_fs.sh[27660]: 2017-02-16 15:16:28 182852 +0100 - stor-04.be-g8-4 - 27660/0x00007f9f090bc700 - volumedriverfs/AlbaConnection - 0000000000231a0c - error - convert_exceptions_: read object: caught Alba proxy exception: Proxy_protocol.Protocol.Error.ObjectDoesNotExist
Feb 16 15:16:28 stor-04 volumedriver_fs.sh[27660]: 2017-02-16 15:16:28 182990 +0100 - stor-04.be-g8-4 - 27660/0x00007f9f090bc700 - volumedriverfs/BackendConnectionInterfaceLogger - 0000000000231a0d - error - ~Logger: Exiting read for 5e2d0398-588e-4670-aac3-9278d0432480 tlog_2643acb8-3c59-4514-9629-21ea5b68af66 with exception
[...]
  1. retries / later attempts run into the ObjectDoesNotExist immediately:
Feb 16 15:16:28 stor-04 volumedriver_fs.sh[27660]: 2017-02-16 15:16:28 183045 +0100 - stor-04.be-g8-4 - 27660/0x00007f9f090bc700 - volumedriverfs/BackendConnectionInterfaceLogger - 0000000000231a10 - info - Logger: Entering read 5e2d0398-588e-4670-aac3-9278d0432480 tlog_2643acb8-3c59-4514-9629-21ea5b68af66
Feb 16 15:16:28 stor-04 alba[7556]: 2017-02-16 15:16:28 183724 +0100 - stor-04.be-g8-4 - 7556/0 - alba/proxy - 10322983 - error - Returning Proxy_protocol.Protocol.Error.ObjectDoesNotExist error to client
Feb 16 15:16:28 stor-04 alba[7556]: 2017-02-16 15:16:28 183757 +0100 - stor-04.be-g8-4 - 7556/0 - alba/proxy - 10322984 - error - Request ReadObjectFs ("5e2d0398-588e-4670-aac3-9278d0432480","tlog_2643acb8-3c59-4514-9629-21ea5b68af66",_,_,_) errored and took 0.000222
Feb 16 15:16:28 stor-04 volumedriver_fs.sh[27660]: 2017-02-16 15:16:28 183781 +0100 - stor-04.be-g8-4 - 27660/0x00007f9f090bc700 - volumedriverfs/AlbaConnection - 0000000000231a11 - error - convert_exceptions_: read object: caught Alba proxy exception: Proxy_protocol.Protocol.Error.ObjectDoesNotExist
Feb 16 15:16:28 stor-04 volumedriver_fs.sh[27660]: 2017-02-16 15:16:28 183872 +0100 - stor-04.be-g8-4 - 27660/0x00007f9f090bc700 - volumedriverfs/BackendConnectionInterfaceLogger - 0000000000231a12 - error - ~Logger: Exiting read for 5e2d0398-588e-4670-aac3-9278d0432480 tlog_2643acb8-3c59-4514-9629-21ea5b68af66 with exception
[...]
  1. maintenance on another node tries to repair (last grep hits in the logs):
stor-01/syslog:Feb 16 15:37:00 stor-01 alba[3212]: 2017-02-16 15:37:00 302902 +0100 - stor-01.be-g8-4 - 3212/0 - alba/maintenance - 45554533 - warning - Repairing object due to bad (missing/corrupted) fragment (
1014, "\185\242c\232\249\255\151\216y\212>;\129\180\251q\200\201:\173~&\205\154\234+\176a\2345\163\005", "tlog_2643acb8-3c59-4514-9629-21ea5b68af66", 0, 10)
stor-01/syslog:Feb 16 15:37:00 stor-01 alba[3212]: 2017-02-16 15:37:00 303508 +0100 - stor-01.be-g8-4 - 3212/0 - alba/maintenance - 45554534 - warning - Detected missing fragment namespace_id=1014 object_name="t
log_2643acb8-3c59-4514-9629-21ea5b68af66" object_id="\185\242c\232\249\255\151\216y\212>;\129\180\251q\200\201:\173~&\205\154\234+\176a\2345\163\005" osd_id=0 (chunk,fragment,version)=(0,10,2)
stor-01/syslog:Feb 16 15:37:00 stor-01 alba[3212]: 2017-02-16 15:37:00 303583 +0100 - stor-01.be-g8-4 - 3212/0 - alba/maintenance - 45554535 - warning - Detected missing fragment namespace_id=1014 object_name="t
log_2643acb8-3c59-4514-9629-21ea5b68af66" object_id="\185\242c\232\249\255\151\216y\212>;\129\180\251q\200\201:\173~&\205\154\234+\176a\2345\163\005" osd_id=93 (chunk,fragment,version)=(0,4,1)
root@stor-04:~# alba version
1.3.4
git_revision: "tags/1.3.4-0-gdabf9dd"
git_repo: "https://github.com/openvstorage/alba.git"
compile_time: "19/01/2017 22:18:42 UTC"

Feedback from @toolslive

I spent a few hours trying to reproduce this, but failed.
So it happened, but we have no evidence, and cannot reproduce it for the moment.

Please re-open if we can reproduce