timescale/promscale

After system restart, not all containers started

KES777 opened this issue · 6 comments

Describe the bug
I restarted the system, but not all containers started.

To Reproduce

$ reboot
$ docker compose ps
kes@work ~/o/server/monitoring2 $ docker compose ps
NAME                           COMMAND                  SERVICE             STATUS              PORTS
monitoring2-alertmanager-1     "/bin/alertmanager -…"   alertmanager        exited (255)        0.0.0.0:9093->9093/tcp, :::9093->9093/tcp
monitoring2-db-1               "/docker-entrypoint.…"   db                  running             8008/tcp, 0.0.0.0:5432->5432/tcp, :::5432->5432/tcp, 8081/tcp
monitoring2-grafana-1          "/run.sh"                grafana             running             0.0.0.0:3000->3000/tcp, :::3000->3000/tcp
monitoring2-node_exporter-1    "/bin/node_exporter …"   node_exporter       running             0.0.0.0:9100->9100/tcp, :::9100->9100/tcp
monitoring2-otel-collector-1   "/otelcol --config=/…"   otel-collector      exited (0)          
monitoring2-prometheus-1       "/bin/prometheus --c…"   prometheus          exited (255)        0.0.0.0:9090->9090/tcp, :::9090->9090/tcp
monitoring2-promscale-1        "/promscale"             promscale           running             0.0.0.0:9201-9202->9201-9202/tcp, :::9201-9202->9201-9202/tcp

Expected behavior
After a manual stop/start, everything is fine:

kes@work ~/o/server/monitoring2 $ docker compose stop
[+] Running 7/7
 ⠿ Container monitoring2-alertmanager-1    Stopped                                   0.0s
 ⠿ Container monitoring2-otel-collector-1  Stopped                                   0.0s
 ⠿ Container monitoring2-node_exporter-1   Stopped                                   2.2s
 ⠿ Container monitoring2-grafana-1         Stop...                                   1.6s
 ⠿ Container monitoring2-prometheus-1      S...                                      0.0s
 ⠿ Container monitoring2-promscale-1       St...                                     0.4s
 ⠿ Container monitoring2-db-1              Stopped                                  10.2s
kes@work ~/o/server/monitoring2 $ docker compose start
[+] Running 7/7
 ⠿ Container monitoring2-alertmanager-1    Started                                   3.6s
 ⠿ Container monitoring2-node_exporter-1   Started                                   2.3s
 ⠿ Container monitoring2-db-1              Started                                   5.3s
 ⠿ Container monitoring2-otel-collector-1  Started                                   5.1s
 ⠿ Container monitoring2-promscale-1       St...                                     2.1s
 ⠿ Container monitoring2-prometheus-1      S...                                      1.9s
 ⠿ Container monitoring2-grafana-1         Star...                                   3.1s
kes@work ~/o/server/monitoring2 $ docker compose ps
NAME                           COMMAND                  SERVICE             STATUS              PORTS
monitoring2-alertmanager-1     "/bin/alertmanager -…"   alertmanager        running             0.0.0.0:9093->9093/tcp, :::9093->9093/tcp
monitoring2-db-1               "/docker-entrypoint.…"   db                  running             8008/tcp, 0.0.0.0:5432->5432/tcp, :::5432->5432/tcp, 8081/tcp
monitoring2-grafana-1          "/run.sh"                grafana             running             0.0.0.0:3000->3000/tcp, :::3000->3000/tcp
monitoring2-node_exporter-1    "/bin/node_exporter …"   node_exporter       running             0.0.0.0:9100->9100/tcp, :::9100->9100/tcp
monitoring2-otel-collector-1   "/otelcol --config=/…"   otel-collector      running             4317/tcp, 55678-55679/tcp, 0.0.0.0:14268->14268/tcp, :::14268->14268/tcp
monitoring2-prometheus-1       "/bin/prometheus --c…"   prometheus          running             0.0.0.0:9090->9090/tcp, :::9090->9090/tcp
monitoring2-promscale-1        "/promscale"             promscale           running             0.0.0.0:9201-9202->9201-9202/tcp, :::9201-9202->9201-9202/tcp

Configuration (as applicable)
monitoring2.zip

Version

  • Distribution/OS: Linux Mint 20.3
  • Promscale: latest
  • TimescaleDB: latest

The problem is probably related to Docker service startup. In my case, sudo systemctl start docker takes up to 5 minutes.

Logs from exited containers:

$ docker compose logs -f alertmanager
ts=2022-10-13T05:31:51.705Z caller=main.go:231 level=info msg="Starting Alertmanager" version="(version=0.24.0, branch=HEAD, revision=f484b17fa3c583ed1b2c8bbcec20ba1db2aa5f11)"
ts=2022-10-13T05:31:51.705Z caller=main.go:232 level=info build_context="(go=go1.17.8, user=root@265f14f5c6fc, date=20220325-09:31:33)"
ts=2022-10-13T05:31:51.706Z caller=cluster.go:185 level=info component=cluster msg="setting advertise address explicitly" addr=192.168.129.67 port=9094
ts=2022-10-13T05:31:51.708Z caller=cluster.go:680 level=info component=cluster msg="Waiting for gossip to settle..." interval=2s
ts=2022-10-13T05:31:51.738Z caller=coordinator.go:113 level=info component=configuration msg="Loading configuration file" file=/etc/alertmanager/alertmanager.yml
ts=2022-10-13T05:31:51.738Z caller=coordinator.go:126 level=info component=configuration msg="Completed loading of configuration file" file=/etc/alertmanager/alertmanager.yml
ts=2022-10-13T05:31:51.741Z caller=main.go:535 level=info msg=Listening address=:9093
ts=2022-10-13T05:31:51.741Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false
ts=2022-10-13T05:31:53.708Z caller=cluster.go:705 level=info component=cluster msg="gossip not settled" polls=0 before=0 now=1 elapsed=2.000127686s
ts=2022-10-13T05:32:01.709Z caller=cluster.go:697 level=info component=cluster msg="gossip settled; proceeding" elapsed=10.001595026s
ts=2022-10-13T05:32:43.074Z caller=notify.go:732 level=warn component=dispatcher receiver=web.hook integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"http://127.0.0.1:5001/\": dial tcp 127.0.0.1:5001: connect: connection refused"
ts=2022-10-13T05:37:43.074Z caller=dispatch.go:354 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="web.hook/webhook[0]: notify retry canceled after 17 attempts: Post \"http://127.0.0.1:5001/\": dial tcp 127.0.0.1:5001: connect: connection refused"
ts=2022-10-13T05:37:43.074Z caller=notify.go:732 level=warn component=dispatcher receiver=web.hook integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"http://127.0.0.1:5001/\": dial tcp 127.0.0.1:5001: connect: connection refused"
ts=2022-10-13T05:42:43.074Z caller=dispatch.go:354 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="web.hook/webhook[0]: notify retry canceled after 18 attempts: Post \"http://127.0.0.1:5001/\": dial tcp 127.0.0.1:5001: connect: connection refused"
ts=2022-10-13T05:42:43.075Z caller=notify.go:732 level=warn component=dispatcher receiver=web.hook integration=webhook[0] msg="Notify attempt failed, will retry later" attempts=1 err="Post \"http://127.0.0.1:5001/\": dial tcp 127.0.0.1:5001: connect: connection refused"
$ docker logs monitoring2-otel-collector-1
2022-10-13T05:31:54.521Z	info	service/telemetry.go:102	Setting up own telemetry...
2022-10-13T05:31:54.521Z	info	service/telemetry.go:137	Serving Prometheus metrics	{"address": ":8888", "level": "basic"}
2022-10-13T05:31:54.521Z	debug	components/components.go:28	Stable component.{"kind": "exporter", "data_type": "traces", "name": "otlp", "stability": "stable"}
2022-10-13T05:31:54.521Z	info	components/components.go:30	In development component. May change in the future.	{"kind": "exporter", "data_type": "traces", "name": "logging", "stability": "in development"}
2022-10-13T05:31:54.521Z	debug	components/components.go:28	Stable component.{"kind": "processor", "name": "batch", "pipeline": "traces", "stability": "stable"}
2022-10-13T05:31:54.521Z	debug	components/components.go:28	Stable component.{"kind": "receiver", "name": "otlp", "pipeline": "traces", "stability": "stable"}
2022-10-13T05:31:54.543Z	info	extensions/extensions.go:42	Starting extensions...
2022-10-13T05:31:54.543Z	info	pipelines/pipelines.go:74	Starting exporters...
2022-10-13T05:31:54.543Z	info	pipelines/pipelines.go:78	Exporter is starting...	{"kind": "exporter", "data_type": "traces", "name": "otlp"}
2022-10-13T05:31:54.543Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Channel created	{"grpc_log": true}
2022-10-13T05:31:54.543Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] original dial target is: "promscale:9202"	{"grpc_log": true}
2022-10-13T05:31:54.543Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] parsed dial target is: {Scheme:promscale Authority: Endpoint:9202 URL:{Scheme:promscale Opaque:9202 User: Host: Path: RawPath: ForceQuery:false RawQuery: Fragment: RawFragment:}}	{"grpc_log": true}
2022-10-13T05:31:54.543Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] fallback to scheme "passthrough"	{"grpc_log": true}
2022-10-13T05:31:54.543Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] parsed dial target is: {Scheme:passthrough Authority: Endpoint:promscale:9202 URL:{Scheme:passthrough Opaque: User: Host: Path:/promscale:9202 RawPath: ForceQuery:false RawQuery: Fragment: RawFragment:}}	{"grpc_log": true}
2022-10-13T05:31:54.543Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Channel authority set to "promscale:9202"	{"grpc_log": true}
2022-10-13T05:31:54.543Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Resolver state updated: {
  "Addresses": [
    {
      "Addr": "promscale:9202",
      "ServerName": "",
      "Attributes": null,
      "BalancerAttributes": null,
      "Type": 0,
      "Metadata": null
    }
  ],
  "ServiceConfig": null,
  "Attributes": null
} (resolver returned new addresses)	{"grpc_log": true}
2022-10-13T05:31:54.543Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Channel switches to new LB policy "pick_first"	{"grpc_log": true}
2022-10-13T05:31:54.543Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1 SubChannel #2] Subchannel created	{"grpc_log": true}
2022-10-13T05:31:54.543Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1 SubChannel #2] Subchannel Connectivity change to CONNECTING	{"grpc_log": true}
2022-10-13T05:31:54.543Z	info	pipelines/pipelines.go:82	Exporter started.{"kind": "exporter", "data_type": "traces", "name": "otlp"}
2022-10-13T05:31:54.543Z	info	pipelines/pipelines.go:78	Exporter is starting...	{"kind": "exporter", "data_type": "traces", "name": "logging"}
2022-10-13T05:31:54.543Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1 SubChannel #2] Subchannel picks a new address "promscale:9202" to connect	{"grpc_log": true}
2022-10-13T05:31:54.544Z	info	pipelines/pipelines.go:82	Exporter started.{"kind": "exporter", "data_type": "traces", "name": "logging"}
2022-10-13T05:31:54.544Z	info	pipelines/pipelines.go:86	Starting processors...
2022-10-13T05:31:54.544Z	info	pipelines/pipelines.go:90	Processor is starting...	{"kind": "processor", "name": "batch", "pipeline": "traces"}
2022-10-13T05:31:54.544Z	info	pipelines/pipelines.go:94	Processor started.{"kind": "processor", "name": "batch", "pipeline": "traces"}
2022-10-13T05:31:54.544Z	info	pipelines/pipelines.go:98	Starting receivers...
2022-10-13T05:31:54.544Z	info	pipelines/pipelines.go:102	Receiver is starting...	{"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2022-10-13T05:31:54.544Z	info	zapgrpc/zapgrpc.go:174	[core] [Server #3] Server created	{"grpc_log": true}
2022-10-13T05:31:54.544Z	info	otlpreceiver/otlp.go:70	Starting GRPC server on endpoint 0.0.0.0:4317	{"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2022-10-13T05:31:54.544Z	info	zapgrpc/zapgrpc.go:174	[core] pickfirstBalancer: UpdateSubConnState: 0xc0003cb0e0, {CONNECTING <nil>}	{"grpc_log": true}
2022-10-13T05:31:54.544Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Channel Connectivity change to CONNECTING	{"grpc_log": true}
2022-10-13T05:31:54.544Z	info	otlpreceiver/otlp.go:88	Starting HTTP server on endpoint 0.0.0.0:4318	{"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2022-10-13T05:31:54.544Z	info	pipelines/pipelines.go:106	Receiver started.{"kind": "receiver", "name": "otlp", "pipeline": "traces"}
2022-10-13T05:31:54.544Z	info	service/collector.go:215	Starting otelcol...	{"Version": "0.56.0", "NumCPU": 8}
2022-10-13T05:31:54.544Z	info	service/collector.go:128	Everything is ready. Begin running and processing data.
2022-10-13T05:31:54.544Z	info	zapgrpc/zapgrpc.go:174	[core] [Server #3 ListenSocket #4] ListenSocket created	{"grpc_log": true}
2022-10-13T05:31:55.522Z	warn	zapgrpc/zapgrpc.go:191	[core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
  "Addr": "promscale:9202",
  "ServerName": "promscale:9202",
  "Attributes": null,
  "BalancerAttributes": null,
  "Type": 0,
  "Metadata": null
}. Err: connection error: desc = "transport: Error while dialing dial tcp 192.168.129.70:9202: connect: connection refused"	{"grpc_log": true}
2022-10-13T05:31:55.522Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1 SubChannel #2] Subchannel Connectivity change to TRANSIENT_FAILURE	{"grpc_log": true}
2022-10-13T05:31:55.522Z	info	zapgrpc/zapgrpc.go:174	[core] pickfirstBalancer: UpdateSubConnState: 0xc0003cb0e0, {TRANSIENT_FAILURE connection error: desc = "transport: Error while dialing dial tcp 192.168.129.70:9202: connect: connection refused"}	{"grpc_log": true}
2022-10-13T05:31:55.522Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Channel Connectivity change to TRANSIENT_FAILURE	{"grpc_log": true}
2022-10-13T05:31:56.522Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1 SubChannel #2] Subchannel Connectivity change to IDLE	{"grpc_log": true}
2022-10-13T05:31:56.522Z	info	zapgrpc/zapgrpc.go:174	[core] pickfirstBalancer: UpdateSubConnState: 0xc0003cb0e0, {IDLE connection error: desc = "transport: Error while dialing dial tcp 192.168.129.70:9202: connect: connection refused"}	{"grpc_log": true}
2022-10-13T05:31:56.522Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Channel Connectivity change to IDLE	{"grpc_log": true}
2022-10-13T05:32:10.568Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "logging", "#spans": 18}
2022-10-13T05:32:10.568Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1 SubChannel #2] Subchannel Connectivity change to CONNECTING	{"grpc_log": true}
2022-10-13T05:32:10.568Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1 SubChannel #2] Subchannel picks a new address "promscale:9202" to connect	{"grpc_log": true}
2022-10-13T05:32:10.569Z	info	zapgrpc/zapgrpc.go:174	[core] pickfirstBalancer: UpdateSubConnState: 0xc0003cb0e0, {CONNECTING <nil>}	{"grpc_log": true}
2022-10-13T05:32:10.569Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Channel Connectivity change to CONNECTING	{"grpc_log": true}
2022-10-13T05:32:10.570Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1 SubChannel #2] Subchannel Connectivity change to READY	{"grpc_log": true}
2022-10-13T05:32:10.570Z	info	zapgrpc/zapgrpc.go:174	[core] pickfirstBalancer: UpdateSubConnState: 0xc0003cb0e0, {READY <nil>}	{"grpc_log": true}
2022-10-13T05:32:10.570Z	info	zapgrpc/zapgrpc.go:174	[core] [Channel #1] Channel Connectivity change to READY	{"grpc_log": true}
2022-10-13T05:32:15.576Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "logging", "#spans": 1}
2022-10-13T05:32:20.584Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "logging", "#spans": 1}
2022-10-13T05:32:25.592Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "logging", "#spans": 142}
2022-10-13T05:32:30.601Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "logging", "#spans": 1}
2022-10-13T05:32:35.610Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "logging", "#spans": 1}
2022-10-13T05:32:40.617Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "logging", "#spans": 4}
$ docker logs monitoring2-prometheus-1
ts=2022-10-13T05:31:59.034Z caller=main.go:499 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2022-10-13T05:31:59.034Z caller=main.go:543 level=info msg="Starting Prometheus Server" mode=server version="(version=2.39.1, branch=HEAD, revision=dcd6af9e0d56165c6f5c64ebbc1fae798d24933a)"
ts=2022-10-13T05:31:59.034Z caller=main.go:548 level=info build_context="(go=go1.19.2, user=root@273d60c69592, date=20221007-15:57:09)"
ts=2022-10-13T05:31:59.035Z caller=main.go:549 level=info host_details="(Linux 5.18.10-051810-generic #202207091532-Ubuntu SMP PREEMPT_DYNAMIC Sat Jul 9 15:55:09 UTC  x86_64 452af5bed03c (none))"
ts=2022-10-13T05:31:59.035Z caller=main.go:550 level=info fd_limits="(soft=1048576, hard=1048576)"
ts=2022-10-13T05:31:59.035Z caller=main.go:551 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2022-10-13T05:31:59.078Z caller=web.go:559 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
ts=2022-10-13T05:31:59.079Z caller=main.go:980 level=info msg="Starting TSDB ..."
ts=2022-10-13T05:31:59.080Z caller=tls_config.go:195 level=info component=web msg="TLS is disabled." http2=false
ts=2022-10-13T05:31:59.127Z caller=head.go:551 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
ts=2022-10-13T05:31:59.161Z caller=head.go:595 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=34.231158ms
ts=2022-10-13T05:31:59.162Z caller=head.go:601 level=info component=tsdb msg="Replaying WAL, this may take a while"
ts=2022-10-13T05:32:00.684Z caller=head.go:672 level=info component=tsdb msg="WAL segment loaded" segment=0 maxSegment=1
ts=2022-10-13T05:32:00.685Z caller=head.go:672 level=info component=tsdb msg="WAL segment loaded" segment=1 maxSegment=1
ts=2022-10-13T05:32:00.685Z caller=head.go:709 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=51.418µs wal_replay_duration=1.523174312s wbl_replay_duration=284ns total_replay_duration=1.55749006s
ts=2022-10-13T05:32:00.688Z caller=main.go:1001 level=info fs_type=9123683e
ts=2022-10-13T05:32:00.688Z caller=main.go:1004 level=info msg="TSDB started"
ts=2022-10-13T05:32:00.688Z caller=main.go:1184 level=info msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
ts=2022-10-13T05:32:00.688Z caller=dedupe.go:112 component=remote level=info remote_name=036793 url=http://promscale:9201/write msg="Starting WAL watcher" queue=036793
ts=2022-10-13T05:32:00.688Z caller=dedupe.go:112 component=remote level=info remote_name=036793 url=http://promscale:9201/write msg="Starting scraped metadata watcher"
ts=2022-10-13T05:32:00.689Z caller=dedupe.go:112 component=remote level=info remote_name=036793 url=http://promscale:9201/write msg="Replaying WAL" queue=036793
ts=2022-10-13T05:32:00.689Z caller=main.go:1221 level=info msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml totalDuration=1.166861ms db_storage=1.524µs remote_storage=599.125µs web_handler=590ns query_engine=1.35µs scrape=222.539µs scrape_sd=53.897µs notify=3.439µs notify_sd=1.996µs rules=1.591µs tracing=3.255µs
ts=2022-10-13T05:32:00.689Z caller=main.go:965 level=info msg="Server is ready to receive web requests."
ts=2022-10-13T05:32:00.689Z caller=manager.go:943 level=info component="rule manager" msg="Starting rule manager..."
ts=2022-10-13T05:32:08.364Z caller=dedupe.go:112 component=remote level=info remote_name=036793 url=http://promscale:9201/write msg="Done replaying WAL" duration=7.675696513s
ts=2022-10-13T05:32:11.241Z caller=compact.go:519 level=info component=tsdb msg="write block" mint=1665587073423 maxt=1665590400000 ulid=01GF7X87C9VQ6AH2N9DVRKVFK2 duration=2.784734429s
ts=2022-10-13T05:32:11.245Z caller=head.go:1192 level=info component=tsdb msg="Head GC completed" caller=truncateMemory duration=2.336689ms
ts=2022-10-13T05:32:13.936Z caller=compact.go:519 level=info component=tsdb msg="write block" mint=1665590401457 maxt=1665597600000 ulid=01GF7X8A3D8YM89TJJPCPKKF7M duration=2.691490605s
ts=2022-10-13T05:32:13.940Z caller=head.go:1192 level=info component=tsdb msg="Head GC completed" caller=truncateMemory duration=3.269452ms
ts=2022-10-13T05:32:30.690Z caller=dedupe.go:112 component=remote level=info remote_name=036793 url=http://promscale:9201/write msg="Remote storage resharding" from=1 to=9
ts=2022-10-13T05:32:40.689Z caller=dedupe.go:112 component=remote level=warn remote_name=036793 url=http://promscale:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1665639143 minSendTimestamp=1665639150
ts=2022-10-13T05:32:50.689Z caller=dedupe.go:112 component=remote level=info remote_name=036793 url=http://promscale:9201/write msg="Currently resharding, skipping."
ts=2022-10-13T05:33:20.689Z caller=dedupe.go:112 component=remote level=warn remote_name=036793 url=http://promscale:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1665639182 minSendTimestamp=1665639190
ts=2022-10-13T05:33:40.689Z caller=dedupe.go:112 component=remote level=info remote_name=036793 url=http://promscale:9201/write msg="Remote storage resharding" from=9 to=1

@KES777 Is this still an issue? It looks like the containers are unable to reach the ones they depend on. It might be a network issue on the Docker daemon side.

Just tried. Yes, this is still an issue.

(screenshot)

This is a single stack, so I do not think this is a network issue.

But after server boot, if I restart the stack once more,
(screenshot)

then everything works fine:
(screenshot)

Digging a bit, I found that I needed to add restart: unless-stopped. With that, all services start.

Could you please adjust docker-compose.yml?
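
For reference, the workaround amounts to adding a restart policy per service in docker-compose.yml, roughly like this (a minimal sketch with assumed image names; only the restart lines are the point):

services:
  promscale:
    image: timescale/promscale:latest   # assumed image
    restart: unless-stopped             # come back up with the daemon unless explicitly stopped
  prometheus:
    image: prom/prometheus:latest       # assumed image
    restart: unless-stopped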

unless-stopped is keyed to whether a user explicitly stopped the container rather than to failures, so it isn't the right restart policy to add here. I will include restart: on-failure instead, which makes sure the container is restarted if it crashes.

unless-stopped can also act as an anti-pattern: when the user wants to stop the container, it comes back up again.
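
For comparison, the on-failure variant would look roughly like this (again a sketch with assumed image names):

services:
  promscale:
    image: timescale/promscale:latest   # assumed image
    restart: on-failure                 # restart only when the container exits with a non-zero status

on-failure keys off the exit status, so it restarts crashes (non-zero exits) but not clean exits or containers stopped by the user.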

@KES777 Is this still an issue? Closing it for now. Feel free to re-open if you are still facing this with docker-compose.