Containerise small services

Question

grischard commented 2 years ago

There are a number of small services that currently live on a full server and could live inside a container.

Identify small services that could be containerised
Identify server to host containers
Choose container system
Write container files

Answer 1 · 2022-12-09T17:19:59.000Z

I am going to test OSQA as a container first.

Answer 2 · 2022-12-09T17:22:03.000Z

Yes sure why not pick literally the hardest possible thing to try first.

Answer 3 · 2022-12-13T20:41:54.000Z

Choose container system... examples could be:

k3s (unlikely; only recommended for 100% disposable workloads, e.g. CI)
kubernetes (no.)
nomad (tempting)
docker without any bells and whistles managed using chef with systemd
podman without any bells and whistles managed using chef with systemd
others?
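
To make the "podman managed using chef with systemd" option concrete, here is a minimal sketch of the kind of unit file a configuration management tool could template out. It is not the configuration used in this repository: the function name, the /usr/bin/podman path, the port mapping, the container name and the image name are all assumptions made for illustration.

```python
def render_podman_unit(name: str, image: str, host_port: int, container_port: int) -> str:
    """Render a systemd unit that runs one container with plain podman.

    Hypothetical helper: every argument value, and the podman path, is an
    assumption for illustration rather than anything taken from this thread.
    """
    run_cmd = (
        f"/usr/bin/podman run --rm --name {name} "
        f"--publish {host_port}:{container_port} {image}"
    )
    stop_cmd = f"/usr/bin/podman stop {name}"
    lines = [
        "[Unit]",
        f"Description=Container {name}",
        "After=network-online.target",
        "Wants=network-online.target",
        "",
        "[Service]",
        f"ExecStart={run_cmd}",
        f"ExecStop={stop_cmd}",
        "Restart=on-failure",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values are placeholders, not real services.
    print(render_podman_unit("example-site", "localhost/example-site:latest", 8080, 80))
```

The point of this shape is that systemd supervises the container process directly, so no long-running container daemon or orchestrator is needed on the host.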

Answer 4 · 2022-12-13T21:10:38.000Z

Well, there are really two separate questions: what to use to build images and what to use to deploy them.

As I understand it, there are other, less horrible languages than Dockerfile for describing images, and those images can be deployed with any of the major systems, just as podman, for example, can deploy an image built from a Dockerfile.

Answer 5 · 2022-12-13T21:19:08.000Z

Some articles on container systems and orchestration systems.

I think https://buildah.io/ is what I was thinking of as the main alternative to Dockerfile - it can actually use Dockerfiles but the native way is just based around writing a script that uses buildah commands to manipulate the image.
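
As a rough illustration of the script-based style described above, the sketch below assembles the sequence of buildah commands such a build script might run. It only builds and prints the command lines and does not execute anything. The base image, working container name, source directory, target path and image name are all hypothetical; the subcommands used (from, copy, config, commit) are standard buildah subcommands.

```python
import shlex
from typing import List


def buildah_commands(base_image: str, container: str, source_dir: str, image_name: str) -> List[List[str]]:
    """Return the buildah invocations for a simple static-site image.

    All argument values and the /srv/site path are assumptions for illustration.
    """
    return [
        # Create a working container from the base image, with a fixed name.
        ["buildah", "from", "--name", container, base_image],
        # Copy the site content from the host into the working container.
        ["buildah", "copy", container, source_dir, "/srv/site"],
        # Record image metadata: working directory and exposed port.
        ["buildah", "config", "--workingdir", "/srv/site", "--port", "8080", container],
        # Write the working container out as a named image.
        ["buildah", "commit", container, image_name],
    ]


if __name__ == "__main__":
    for argv in buildah_commands("docker.io/library/alpine:3", "work", "./public", "example-site"):
        print(shlex.join(argv))
```

Each step is an ordinary command, so the build logic can use normal scripting constructs (loops, conditionals, error handling) instead of Dockerfile directives.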

Answer 6 · 2022-12-13T21:49:45.000Z

Dockerfiles are the beast I know, and by a million miles they have more adoption than the alternatives.

There are alternatives for building OCI-compatible container images, with varying levels of completeness. Other than https://buildah.io/ there is also https://github.com/genuinetools/img (uses Dockerfile) and https://github.com/GoogleContainerTools/jib (Java images).

Answer 7 · 2022-12-15T19:39:02.000Z

The stuff that runs on naga (redirectors, blog aggregator, munin server that we're going to demise (#501)) might be good.

Answer 8 · 2022-12-15T19:46:57.000Z

https://hardware.osm.org, which runs on idris, already has a Dockerfile in https://github.com/osmfoundation/osmf-server-info

Answer 9 · 2023-02-12T13:32:03.000Z

The following sites are now containers:

welcome.openstreetmap.org - openstreetmap/chef#570
stateofthemap.org - openstreetmap/chef#579
2013.stateofthemap.org - openstreetmap/chef#575
2016.stateofthemap.org - openstreetmap/chef#571
2017.stateofthemap.org - openstreetmap/chef#571
2018.stateofthemap.org - openstreetmap/chef#571
2019.stateofthemap.org - openstreetmap/chef#571
2020.stateofthemap.org - openstreetmap/chef#571
2021.stateofthemap.org - openstreetmap/chef#571
2022.stateofthemap.org - openstreetmap/chef#571
operations.osmfoundation.org - openstreetmap/chef#573
irc.openstreetmap.org - openstreetmap/chef#577
switch2osm.org - openstreetmap/chef#578
trac.openstreetmap.org - openstreetmap/chef#579
svn.openstreetmap.org - openstreetmap/chef#579
hot.openstreetmap.org
preview.ideditor.com

Answer 10 · 2023-04-12T12:33:25.000Z

You probably want to deploy https://github.com/google/cadvisor on every Docker host and point Prometheus at that.

Answer 11 · 2023-04-12T12:43:49.000Z

Well that rather depends how it works - if it depends on the docker daemon then it probably won't work for us.

Answer 12 · 2023-04-12T12:48:03.000Z

A quick test on naga suggests it doesn't manage to collect anything useful in our setup - all it finds is some basic host hardware metrics:

# HELP machine_cpu_cores Number of logical CPU cores.
# TYPE machine_cpu_cores gauge
machine_cpu_cores{boot_id="d474a5a2-46e3-45bb-bd70-8babbdf0fd25",machine_id="0787fff1c8bb4e53a49da871f67e3246",system_uuid="32353537-3835-584d-5136-303130324447"} 64 1681303543392
# HELP machine_cpu_physical_cores Number of physical CPU cores.
# TYPE machine_cpu_physical_cores gauge
machine_cpu_physical_cores{boot_id="d474a5a2-46e3-45bb-bd70-8babbdf0fd25",machine_id="0787fff1c8bb4e53a49da871f67e3246",system_uuid="32353537-3835-584d-5136-303130324447"} 16 1681303543392
# HELP machine_cpu_sockets Number of CPU sockets.
# TYPE machine_cpu_sockets gauge
machine_cpu_sockets{boot_id="d474a5a2-46e3-45bb-bd70-8babbdf0fd25",machine_id="0787fff1c8bb4e53a49da871f67e3246",system_uuid="32353537-3835-584d-5136-303130324447"} 2 1681303543392
# HELP machine_dimm_capacity_bytes Total RAM DIMM capacity (all types memory modules) value labeled by dimm type.
# TYPE machine_dimm_capacity_bytes gauge
machine_dimm_capacity_bytes{boot_id="d474a5a2-46e3-45bb-bd70-8babbdf0fd25",machine_id="0787fff1c8bb4e53a49da871f67e3246",system_uuid="32353537-3835-584d-5136-303130324447",type="Registered-DDR4"} 2.06158430208e+11 1681303543392
# HELP machine_dimm_count Number of RAM DIMM (all types memory modules) value labeled by dimm type.
# TYPE machine_dimm_count gauge
machine_dimm_count{boot_id="d474a5a2-46e3-45bb-bd70-8babbdf0fd25",machine_id="0787fff1c8bb4e53a49da871f67e3246",system_uuid="32353537-3835-584d-5136-303130324447",type="Registered-DDR4"} 12 1681303543392
# HELP machine_memory_bytes Amount of memory installed on the machine.
# TYPE machine_memory_bytes gauge
machine_memory_bytes{boot_id="d474a5a2-46e3-45bb-bd70-8babbdf0fd25",machine_id="0787fff1c8bb4e53a49da871f67e3246",system_uuid="32353537-3835-584d-5136-303130324447"} 2.02673983488e+11 1681303543392
# HELP machine_nvm_avg_power_budget_watts NVM power budget.
# TYPE machine_nvm_avg_power_budget_watts gauge
machine_nvm_avg_power_budget_watts{boot_id="d474a5a2-46e3-45bb-bd70-8babbdf0fd25",machine_id="0787fff1c8bb4e53a49da871f67e3246",system_uuid="32353537-3835-584d-5136-303130324447"} 0 1681303543392
# HELP machine_nvm_capacity NVM capacity value labeled by NVM mode (memory mode or app direct mode).
# TYPE machine_nvm_capacity gauge
machine_nvm_capacity{boot_id="d474a5a2-46e3-45bb-bd70-8babbdf0fd25",machine_id="0787fff1c8bb4e53a49da871f67e3246",mode="app_direct_mode",system_uuid="32353537-3835-584d-5136-303130324447"} 0 1681303543392
machine_nvm_capacity{boot_id="d474a5a2-46e3-45bb-bd70-8babbdf0fd25",machine_id="0787fff1c8bb4e53a49da871f67e3246",mode="memory_mode",system_uuid="32353537-3835-584d-5136-303130324447"} 0 1681303543392
# HELP machine_scrape_error 1 if there was an error while getting machine metrics, 0 otherwise.
# TYPE machine_scrape_error gauge
machine_scrape_error 0

Answer 13 · 2023-04-12T12:49:00.000Z

Also generating metrics with timestamps (rather than letting the server add them) is generally considered bad form and can cause problems with metric ingestion.

Answer 14 · 2023-04-12T12:56:27.000Z

"generating metrics with timestamps (rather than letting the server add them) is generally considered bad form"

Do you have a source for this? Because just searching for "timestamp" on my own Grafana brings up:

grafana_build_timestamp
node_boot_time_seconds
prometheus_config_last_reload_success_timestamp_seconds
prometheus_tsdb_lowest_timestamp_seconds
thanos_bucket_store_blocks_last_loaded_timestamp_seconds

Answer 15 · 2023-04-12T13:02:59.000Z

I mean the fact that each of those metrics has a timestamp like 1681303543392 after the metric value at the end of the line. When that is there, the server will use it as the timestamp to associate with the value instead of using its own clock to get a scrape time.

See https://promlabs.com/blog/2022/12/15/understanding-duplicate-samples-and-out-of-order-timestamp-errors-in-prometheus/#buggy-client-side-timestamps for some discussion of the potential issues with it.
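
To show what is meant by a timestamp after the value, here is a small sketch that splits a sample line from the text exposition format into the series, the value, and the optional trailing timestamp in milliseconds. It is a simplified parser written for illustration, not the one Prometheus uses; the Sample type and parse_sample function are hypothetical names, and the labels in the example lines are shortened.

```python
from typing import NamedTuple, Optional


class Sample(NamedTuple):
    series: str                  # metric name plus any {label="value"} block
    value: float
    timestamp_ms: Optional[int]  # None when the exporter leaves it to the server


def parse_sample(line: str) -> Sample:
    """Parse one sample line: series, value, and an optional timestamp."""
    line = line.strip()
    if not line or line.startswith("#"):
        raise ValueError("not a sample line")
    brace = line.rfind("}")
    if brace != -1:
        # Labels present: everything up to the last '}' is the series.
        series, rest = line[: brace + 1], line[brace + 1 :]
    else:
        # No labels: the metric name ends at the first space.
        series, _, rest = line.partition(" ")
    fields = rest.split()
    if len(fields) not in (1, 2):
        raise ValueError("expected a value and at most one timestamp")
    value = float(fields[0])
    timestamp_ms = int(fields[1]) if len(fields) == 2 else None
    return Sample(series, value, timestamp_ms)


if __name__ == "__main__":
    with_ts = parse_sample('machine_cpu_cores{boot_id="d474a5a2"} 64 1681303543392')
    without_ts = parse_sample("machine_scrape_error 0")
    print(with_ts.timestamp_ms is not None)   # exporter supplied its own timestamp
    print(without_ts.timestamp_ms is None)    # server would assign the scrape time
```

In the output quoted earlier, only machine_scrape_error lacks the trailing field; every other sample carries a client-side timestamp.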

Answer 16 · 2023-04-12T14:53:22.000Z

Ah yes - that's already discussed on cAdvisor issue #2526.

However, if you don't see one of these https://github.com/google/cadvisor/blob/master/metrics/prometheus.go#L138 metrics here, then something is misconfigured.

Answer 17 · 2023-04-12T14:58:02.000Z

Well, I didn't do any configuration. I just ran the executable, as there didn't seem to be any clear instructions telling me to do anything else.

As I say, because we are using podman, if it relies on talking to dockerd to get statistics then it's probably not going to work.

Answer 18 · 2023-08-21T12:02:47.000Z

dmca.osm.org is now a container: openstreetmap/chef@4ac7cf5

Answer 19 · 2023-11-15T03:04:41.000Z

Left to containerise on Ridley: tracked in #1028

Answer 20 · 2024-09-15T03:29:44.000Z

https://github.com/openstreetmap/birthday20-website/ is a WordPress site that I converted to a static site using wp2static.

wp2static did a reasonable job, though some cleanup was required.

Linked issue: #1125

Answer 21 · 2024-09-29T05:55:56.000Z

SoTM 2007, 2008 and 2009 are now containers.