openvstorage/alba

Alba proxy "Too many open files" accept error exit 1 Failure

Closed this issue · 1 comments

I'm trying to reproduce issue #607 and got the following errors.

Setup:
Removed all ASDs from 1 node (policy still satisfied, no issues).
When I removed one more ASD (so the policy was no longer satisfied), I received the following errors in the proxies.

The Alba proxy restarts each time while my fio test is running. (The fio test doesn't tolerate this, and the fio process stopped.)

@toolslive Jan told me you also saw this issue this morning?

proxy logs

2017-02-09 13:25:30 607629 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14375 - warning - Failed to connect to (10.100.199.223,26408): (Unix.Unix_error "Too many open files" socket ""); backtrace:; Raised by primitive operation at file "src/unix/lwt_unix.ml", line 1334, characters 10-35; Called from file "src/client/client_helper.ml", line 51, characters 11-74; Called from file "src/core/lwt.ml", line 686, characters 20-24
2017-02-09 13:25:30 690545 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14376 - warning - Failed to connect to (10.100.199.221,26408): (Unix.Unix_error "Too many open files" socket ""); backtrace:; Raised by primitive operation at file "src/unix/lwt_unix.ml", line 1334, characters 10-35; Called from file "src/client/client_helper.ml", line 51, characters 11-74; Called from file "src/core/lwt.ml", line 686, characters 20-24
2017-02-09 13:25:30 690607 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14377 - warning - Failed to connect to (10.100.199.222,26408): (Unix.Unix_error "Too many open files" socket ""); backtrace:; Raised by primitive operation at file "src/unix/lwt_unix.ml", line 1334, characters 10-35; Called from file "src/client/client_helper.ml", line 51, characters 11-74; Called from file "src/core/lwt.ml", line 686, characters 20-24
2017-02-09 13:25:30 690665 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14378 - warning - Failed to connect to (10.100.199.223,26408): (Unix.Unix_error "Too many open files" socket ""); backtrace:; Raised by primitive operation at file "src/unix/lwt_unix.ml", line 1334, characters 10-35; Called from file "src/client/client_helper.ml", line 51, characters 11-74; Called from file "src/core/lwt.ml", line 686, characters 20-24
2017-02-09 13:25:30 690700 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14379 - info - Alba_client: create_namespace "33a6eccf-0cbe-4b2c-a93c-d92e4900af81_000000058"
2017-02-09 13:25:30 690754 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14380 - warning - Failed to connect to (10.100.199.221,26408): (Unix.Unix_error "Too many open files" socket ""); backtrace:; Raised by primitive operation at file "src/unix/lwt_unix.ml", line 1334, characters 10-35; Called from file "src/client/client_helper.ml", line 51, characters 11-74; Called from file "src/core/lwt.ml", line 686, characters 20-24
2017-02-09 13:25:30 691200 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14381 - warning - Failed to connect to (10.100.199.222,26408): (Unix.Unix_error "Too many open files" socket ""); backtrace:; Raised by primitive operation at file "src/unix/lwt_unix.ml", line 1334, characters 10-35; Called from file "src/client/client_helper.ml", line 51, characters 11-74; Called from file "src/core/lwt.ml", line 686, characters 20-24
2017-02-09 13:25:30 691258 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14382 - warning - Failed to connect to (10.100.199.223,26408): (Unix.Unix_error "Too many open files" socket ""); backtrace:; Raised by primitive operation at file "src/unix/lwt_unix.ml", line 1334, characters 10-35; Called from file "src/client/client_helper.ml", line 51, characters 11-74; Called from file "src/core/lwt.ml", line 686, characters 20-24
2017-02-09 13:25:30 691284 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14383 - info - Exception while adding object to alba fragment cache: Client_helper.MasterLookupResult.Error(1); backtrace:; Raised at file "queue.ml", line 68, characters 17-22; Called from file "src/tools/lwt_pool2.ml", line 98, characters 25-46
2017-02-09 13:25:30 700394 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14384 - error - Unexpected exception in proxy while handling request: (Unix.Unix_error "Too many open files" open;   /mnt/hdd2/ovssup-196_write_sco_1/8fa672af-054e-462f-aa9d-ba6e6e9566bc/00_000019eb_00); backtrace:; Raised at file "queue.ml", line 68, characters 17-22; Called from file "src/tools/lwt_pool2.ml", line 98, characters 25-46
2017-02-09 13:25:30 700507 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14385 - error - Request ApplySequence ("8fa672af-054e-462f-aa9d-ba6e6e9566bc",false,[(Nsm_model.Assert.ObjectHasChecksum ("owner_tag",;     Sha1 356a192b7913b04c54574d18c28d46e6395428ab));   ],[(Proxy_protocol.Protocol.Update.UploadObjectFromFile;     ("00_000019eb_00",;      "/mnt/hdd2/ovssup-196_write_sco_1/8fa672af-054e-462f-aa9d-ba6e6e9566bc/00_000019eb_00",;      (Some Crc32c 0x33d5f98a)));   ]) errored and took 0.000525
2017-02-09 13:25:30 701346 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14386 - info - Closing listening socket on port 26210
2017-02-09 13:25:30 701559 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14387 - info - server for ADDR_INET(172.22.199.226,26210) going down: (Unix.Unix_error "Too many open files" accept "")
2017-02-09 13:25:30 701631 +0100 - vol-221-1 - 31631/0 - alba/proxy - 14388 - fatal - Going down after unexpected exception in proxy process: (Unix.Unix_error "Too many open files" accept ""); backtrace:; Raised at file "bytes.ml", line 219, characters 25-34; Called from file "string.ml", line 106, characters 2-26
alba: internal error, uncaught exception:
      (Unix.Unix_error "Too many open files" accept "")
      Raised at file "src/core/lwt.ml", line 789, characters 22-23
      Called from file "src/unix/lwt_main.ml", line 34, characters 8-18
      Called from file "src/cmdliner.ml", line 1350, characters 17-26
      Called from file "src/cmdliner.ml", line 1390, characters 6-34
ovs-albaproxy_ovssup-196_0.service: Main process exited, code=exited, status=1/FAILURE
ovs-albaproxy_ovssup-196_0.service: Unit entered failed state.
ovs-albaproxy_ovssup-196_0.service: Failed with result 'exit-code'.
ovs-albaproxy_ovssup-196_0.service: Service hold-off time over, scheduling restart.
Stopped ALBA proxy.
Starting ALBA proxy...
Started ALBA proxy.
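
To see how close a process is to the descriptor limit before it hits "Too many open files", something like the following could be used. This is only a rough diagnostic sketch, not part of Alba: it assumes a Linux host with /proc mounted, and the helper names (`count_open_fds`, `parse_soft_nofile`, `fd_usage`) are made up for illustration.

```python
import os


def count_open_fds(pid="self"):
    """Count the entries in /proc/<pid>/fd (Linux only).

    For pid="self" the count includes the descriptor that listdir
    itself briefly opens on the directory, so it may be off by one.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def parse_soft_nofile(limits_text):
    """Extract the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns an int, or None when the limit is reported as 'unlimited'.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Columns after the name are: soft limit, hard limit, units.
            soft = line[len(prefix):].split()[0]
            return None if soft == "unlimited" else int(soft)
    raise ValueError("no 'Max open files' line found")


def fd_usage(pid="self"):
    """Return (open_fd_count, soft_limit) for the given pid."""
    with open(f"/proc/{pid}/limits") as f:
        soft = parse_soft_nofile(f.read())
    return count_open_fds(pid), soft
```

Calling `fd_usage(<proxy pid>)` repeatedly while the fio test runs would show whether the open descriptor count keeps climbing toward the soft limit. Reading another user's /proc entries may require root.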

Yes, and it's fixed by 3d88643