opensearch-project/opensearch-build

[Proposal] Support AL2023/NodeJS18 for OpenSearch/Dashboards Releases

Closed this issue · 33 comments

With the Amazon Linux 2023 Preview now available for testing, we want to set up some notes here on supporting AL2023 once it is fully released to the public.
https://aws.amazon.com/about-aws/whats-new/2021/11/preview-amazon-linux-2022/
https://aws.amazon.com/about-aws/whats-new/2023/03/amazon-linux-2023/

  • Evaluate the compatibility of AL2023 when running OpenSearch/Dashboards
    • OpenSearch/Dashboards latest binary can run on AL2023 with the same expected behaviors as on AL2
    • Working alongside plugin teams to ensure all the plugins are running as expected
    • Performance testing to ensure there is no performance regression
    • Evaluate the package/dependencies availability on AL2023 default repository
    • Evaluate the size of the image vs contents integrity
  • Evaluate AL2023 availability
    • AL2023 available on docker for both x64 and arm64
    • AL2023 available on EC2 for both x64 and arm64 for testing
  • Evaluate Build Process
    • What level of changes do we need to support AL2023 with current build process and scripts
    • Can we support both AL2 and AL2023 in the same system
    • Build tool related libs/packages/dependencies availability on AL2023
  • Evaluate deprecation plan for older OS
    • When would we deprecate AL2 while supporting AL2023
    • Will this introduce breaking changes for customers
    • Would we support older versions of OpenSearch/Dashboards with the new OS?

We welcome more discussion on this.

Thanks.

Hi @nknize @dblock what do you think about this change?

From our team's perspective, it makes sense to upgrade the main docker image from AL2 to AL2023 starting with 3.0.0. It also makes sense to use this opportunity to upgrade other OSes such as Ubuntu and Rockylinux to their latest versions.

Thanks.

I really don't know enough about this @peterzhuamazon. What are pros/cons?

Now, with this issue on the horizon, we need to consider how to proceed with the changes.

This implies:

  • We need to deprecate the ci image that utilizes centos7 or al2.
  • Our ci and release docker images need to transition from centos/al2 to al2023.
  • #3351
    • The Python upgrade from 3.7 to 3.9 needs to happen.
    • AL2 has python 3.7 as the default, but AL2023 already includes 3.9. Attempting to compile Python 3.7 on AL2023 would consume significant time.
    • Python 3.7 support will reach its eol in 1 month.
  • If we keep using one ci image for 1.x / 2.x / 3.x, KNN built on al2023 will crash for users on older operating systems (with a glibc older than the one on al2023). centos7 gives KNN the best compatibility because it has the oldest glibc, but that glibc does not support Node18.
    • opendistro-for-elasticsearch/k-NN#169
    • Unless we create a new jenkinsfile specifically for 1.x, we cannot extensively modify the existing dist-build-dashboards jenkinsfile, as it is already close to the Jenkinsfile max size limit.
    • Introducing a new jenkinsfile solely for 1.x will cause issues due to many hardcoded dir names as well as structures on S3.
    • One possible approach to avoid this issue is to use centos7 for building OS and use al2023 for building OSD, although this approach lacks standardization and is hard to maintain.
  • This change needs to occur after the completion of the Jenkins Upgrade.
  • This is a significant breaking change, and it seems that we want to backport it to 2.x due to the deprecation of node14.

For the Node 18 upgrade, we're targeting 2.8. This is important as Node 14 will no longer receive security updates (as of April 2023). Is there an ask to delay in order to solve the above?

KNN support on the native code compilation:

node: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by node)
node: /lib64/libc.so.6: version `GLIBC_2.25' not found (required by node)
node: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by node)
node: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by node)
node: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by node)
node: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by node)
# CentOS7
ldd (GNU libc) 2.17

# AL2
ldd (GNU libc) 2.26

# CentOS8/RockyLinux8
ldd (GNU libc) 2.28

# AL2023
ldd (GNU libc) 2.34

Seems like if we want to switch from CentOS7 to CentOS8/RockyLinux8, we will need to remove support of CentOS7 and AL2 from our support matrix on our website as well, since CentOS8/RockyLinux8 is the baseline of node18 with glibc 2.28.

And if we compile knn on 2.28 then the lib will crash on glibc that is lower than 2.28.
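The glibc reasoning above can be checked mechanically. The following is a minimal sketch, not part of the build tooling: it extracts the highest `GLIBC_x.y` symbol version from the loader errors quoted above and compares it with the `ldd` versions listed for each OS. The function names and the `HOST_GLIBC` table are illustrative assumptions; the version numbers are the ones reported in this thread.

```python
import re
from typing import Dict, List, Tuple

Version = Tuple[int, int]

# Loader errors quoted earlier in this thread (glibc lines only).
NODE18_ERRORS = """\
node: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by node)
node: /lib64/libc.so.6: version `GLIBC_2.25' not found (required by node)
node: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by node)
"""

# glibc versions reported by ldd for each OS in this thread.
HOST_GLIBC: Dict[str, Version] = {
    "CentOS7": (2, 17),
    "AL2": (2, 26),
    "CentOS8/RockyLinux8": (2, 28),
    "AL2023": (2, 34),
}


def required_glibc(errors: str) -> Version:
    """Return the highest GLIBC_x.y version mentioned in loader errors."""
    found = re.findall(r"GLIBC_(\d+)\.(\d+)", errors)
    if not found:
        raise ValueError("no GLIBC_x.y symbol versions found")
    # Compare as integer tuples so that 2.9 sorts below 2.28.
    return max((int(major), int(minor)) for major, minor in found)


def hosts_able_to_run(errors: str, hosts: Dict[str, Version]) -> List[str]:
    """List the hosts whose glibc is at least the required version."""
    need = required_glibc(errors)
    return [name for name, have in hosts.items() if have >= need]


if __name__ == "__main__":
    print("required glibc:", required_glibc(NODE18_ERRORS))
    print("compatible hosts:", hosts_able_to_run(NODE18_ERRORS, HOST_GLIBC))
```

Versions are compared as integer tuples rather than strings or floats, because a plain string or float comparison would order 2.9 above 2.28.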

cc: @krisfreedain

@vamshin @jmazanec15 Can you please provide your inputs as well?

we will need to remove support of CentOS7 and AL2 from our support matrix on our website as well, since CentOS8/RockyLinux8 is the baseline of node18 with glibc 2.28

I would agree @peterzhuamazon - if we test on CentOS8/RockyLinux8 the compatibility matrix needs to be updated to accurately reflect that for the community

ananzh commented

Hi @peterzhuamazon we will bump Node.js to version 18.16.0

@peterzhuamazon want to clarify, you are saying that node 18 requires 2.28, correct? Then doesn't this break compatibility with AL2 and centos 7? So, regardless of k-NN libs, upgrading to Node 18 would require us to remove centos 7 and AL2 from the compatibility matrix.

If this is not the case, we could also elect to compile k-NN libs in separate docker image to maintain compatibility.

@peterzhuamazon not sure if you saw this opensearch-project/OpenSearch-Dashboards#3601 (comment)

unlike chromium, it is possible to do a custom build of nodejs on centos 7 to lower glibc dependencies (I also saw musl builds somewhere, which can be another possibility to work around this issue). I've not looked into performance or other issues that might be caused by downgrading glibc, but if an OS upgrade is difficult maybe this would be easier?

Did anyone confirm node 18 with AL2? @seanneumann

We noticed some glibc related compatibility issues. Not sure if it's an isolated incident.

@peterzhuamazon not sure if you saw this opensearch-project/OpenSearch-Dashboards#3601 (comment)

unlike chromium, it is possible to do a custom build of nodejs on centos 7 to lower glibc dependencies (I also saw musl builds somewhere, which can be another possibility to work around this issue). I've not looked into performance or other issues that might be caused by downgrading glibc, but if an OS upgrade is difficult maybe this would be easier?

That is likely not an option because CentOS7 is going out of support in 1 year, and we are on track to move to al2023 anyway. Manually rebuilding nodejs18 would add another dependency on our side, which is not going to scale in the long run.

Thanks.

Did anyone confirm node 18 with AL2? @seanneumann

We noticed some glibc related compatibility issues. Not sure if it's an isolated incident.

Node 18 will not run on AL2; I tried it.

OSD 2.8 will be compatible with Node 14, 16, and 18. OSD 2.8 will also bundle Node 18 and a Node 14 for fallback; with this OSD 2.8 can run on AL2 by using the bundled Node 14.
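One way to picture the fallback described above is as a small decision based on the host's glibc version. This is only a sketch under stated assumptions: the 2.28 threshold comes from the loader errors quoted in this thread, while `pick_bundled_node`, its return labels, and the choice to fall back when the version is unknown are hypothetical and do not reflect how OSD actually selects its runtime.

```python
import platform
from typing import Optional, Tuple

# Threshold taken from the GLIBC_2.28 loader error quoted in this thread.
NODE18_MIN_GLIBC = (2, 28)


def parse_version(text: str) -> Optional[Tuple[int, int]]:
    """Parse 'major.minor[...]' into a tuple, or None if it is not parseable."""
    parts = text.split(".")
    try:
        return int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        return None


def pick_bundled_node(glibc_version: str) -> str:
    """Return a label for the bundled runtime to use (hypothetical helper).

    Assumption: when the glibc version is unknown, fall back to Node 14.
    """
    version = parse_version(glibc_version)
    if version is not None and version >= NODE18_MIN_GLIBC:
        return "node18"
    return "node14"


if __name__ == "__main__":
    # platform.libc_ver() returns ('glibc', '2.xx') on glibc systems and
    # empty strings when the C library cannot be determined.
    lib, ver = platform.libc_ver()
    print(lib or "unknown libc", ver or "unknown version", "->", pick_bundled_node(ver))
```

With the versions listed in this thread, AL2 (2.26) and CentOS7 (2.17) would get the Node 14 fallback, while RockyLinux8 (2.28) and AL2023 (2.34) would get Node 18.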

@peterzhuamazon want to clarify, you are saying that node 18 requires 2.28, correct? Then doesn't this break compatibility with AL2 and centos 7? So, regardless of k-NN libs, upgrading to Node 18 would require us to remove centos 7 and AL2 from the compatibility matrix.

If this is not the case, we could also elect to compile k-NN libs in separate docker image to maintain compatibility.

  1. It does, al2/centos7 have older libs than 2.28.
  2. Agreed that we need to move out of centos7 and al2 at some point.

Did anyone confirm node 18 with AL2? @seanneumann
We noticed some glibc related compatibility issues. Not sure if it's an isolated incident.

Node 18 will not run on AL2; I tried it.

OSD 2.8 will be compatible with Node 14, 16, and 18. OSD 2.8 will also bundle Node 18 and a Node 14 for fallback; with this OSD 2.8 can run on AL2 by using the bundled Node 14.

If node14 is out of support, can we even bundle 14 alongside 18?

@peterzhuamazon I hate it but being a minor release, we have to.

Here is an update on the situation:

AL2023 / NodeJS18 Upgrade Status on 2.8.0 Release

Background

With the release of OpenSearch Project 2.8.0 approaching on 2023/06/06, and the end-of-life for NodeJS 14 just three weeks ago on 2023/04/30, we want to review the current situation regarding the necessary upgrade to the latest LTS NodeJS 18.

This upgrade will require significant changes, and our goal is to lay out the steps for each stakeholder involved in the release, review essential items, and find a path forward.

We have a related issue opened to capture all required actions, and synced some part of the comments to this quip:
#1563

Status

Approaches

Both OS and OSD keep CentOS7/AL2 for the 2.8.0 release, with Node 16 in the build ci and Node 14/18 bundled in the artifacts, then move everything to RockyLinux8/AL2023 and NodeJS18 in later releases.

  • Pros:
    • No changes on OS/OSD base image until needed
    • No k-NN issues, since we keep supporting glibc 2.17 for now and move to 2.28 later
    • More time available to upgrade both OS and OSD, on CI and Release
    • More time for the infra pipeline changes on Jenkins
    • Fewer changes for customers to consume in a minor/patch version change
    • The 2.8.0 release would have fewer changes close to the freeze date
  • Cons:
    • The compatibility matrix would need to update to RockyLinux8 and above
    • Still need to maintain a 1.x-specific release outside of 2.x/3.x, which means more work and temporary scripts

More questions:

  • Possibilities of adding 16 for 2.8.0, then 18 for 2.9.0 forward (ans: not bundle 16 as 16 / 18 are much more similar than 14 / 18).
  • Need to check whether we bundle both 14 and 18 in 2.8.0, with an automatic switch/fallback option if the host does not support 18?
  • Does 2.x still use 16 to build while main uses 18?
  • Later on, users must use nodejs18 and cannot use 14 anymore; more modernized changes?
  • Keep our build system on 16 so that it stays secure in 2.8.0 until we move to 18?
  • Possible RockyLinux8 and above as new compatibility matrix?

Thanks.

ananzh commented

The tested Node.js 16 version is 16.20.0.

All docker images updated and synced:

https://build.ci.opensearch.org/job/docker-copy/502/console

Next is AMI.

These three will need their base OS changed to rockylinux8 before node18 can be added.
These are the only docker images that do not have node18; AMIs are not affected.

docker/ci/dockerfiles/current/build.centos7.opensearch-dashboards.x64.arm64.dockerfile
docker/ci/dockerfiles/current/release.centos.clients.x64.arm64.dockerfile
docker/ci/dockerfiles/current/test.centos7.performance-test.x64.arm64.dockerfile

Thanks.

Turns out we need a gcc version higher than the 4.8.x bundled in the centos7 default repos:

gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

https://build.ci.opensearch.org/blue/organizations/jenkins/distribution-build-opensearch-dashboards/detail/distribution-build-opensearch-dashboards/6176/pipeline/130/

Rockylinux:8


gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-18)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Will switch to devtoolset-8 on centos7 to match rockylinux:8, which we will eventually switch to.

Using scl and devtoolset might be a problem because it requires running source scl_source enable devtoolset-8, either through an rc file or through a profile.

Since we lock the docker images to a non-interactive shell, this is not the best option.
Exploring manually compiling gcc 8 from source.
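Whichever route ends up providing the newer compiler, a build script could guard against the old one before attempting the bootstrap. The sketch below parses the first line of `gcc --version` output, using the two banners shown above as samples. The helper names are illustrative, and the default minimum major version of 8 is an assumption based on the plan to match rockylinux:8; the thread only states that 4.8.x is too old.

```python
import re
from typing import Tuple

# First lines of the two `gcc --version` banners quoted in this thread.
CENTOS7_BANNER = "gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)"
ROCKY8_BANNER = "gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-18)"


def gcc_version(banner: str) -> Tuple[int, ...]:
    """Extract the gcc version from the first line of `gcc --version`."""
    lines = banner.strip().splitlines()
    if not lines:
        raise ValueError("empty gcc banner")
    # The version is the dotted number right after the first closing paren,
    # e.g. "gcc (GCC) 8.5.0 ..." gives "8.5.0".
    match = re.search(r"\)\s+(\d+(?:\.\d+)+)", lines[0])
    if match is None:
        raise ValueError("could not find a version in: " + lines[0])
    return tuple(int(part) for part in match.group(1).split("."))


def gcc_is_new_enough(banner: str, min_major: int = 8) -> bool:
    """Check the major version against a minimum (8 is an assumed default)."""
    return gcc_version(banner)[0] >= min_major


if __name__ == "__main__":
    for name, banner in (("centos7", CENTOS7_BANNER), ("rockylinux:8", ROCKY8_BANNER)):
        print(name, gcc_version(banner), "ok" if gcc_is_new_enough(banner) else "too old")
```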

At least confirmed that on gcc8 we can yarn bootstrap:

$ yarn osd bootstrap
yarn run v1.22.19
$ scripts/use_node scripts/osd bootstrap
 info [opensearch-dashboards] running yarn

$ scripts/use_node ./preinstall_check
[1/5] Validating package.json...
[2/5] Resolving packages...
warning Resolution field "typescript@4.0.2" is incompatible with requested version "typescript@~4.5.2"
[3/5] Fetching packages...
[4/5] Linking dependencies...
warning " > sass-loader@10.4.1" has unmet peer dependency "webpack@^4.36.0 || ^5.0.0".
warning "workspace-aggregator-758bd34b-a94e-45f0-8d7d-fd111e91517b > osd_tp_run_pipeline > @elastic/eui@1.1.1" has incorrect peer dependency "typescript@^4.0.5".
warning "@osd/ace > raw-loader@4.0.2" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/interpreter > babel-loader@8.2.4" has unmet peer dependency "webpack@>=2".
warning "@osd/interpreter > copy-webpack-plugin@6.4.1" has unmet peer dependency "webpack@^4.37.0 || ^5.0.0".
warning "@osd/interpreter > css-loader@5.2.7" has unmet peer dependency "webpack@^4.27.0 || ^5.0.0".
warning "@osd/interpreter > style-loader@1.3.0" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/interpreter > url-loader@2.3.0" has unmet peer dependency "webpack@^4.0.0".
warning "@osd/interpreter > webpack-cli@4.9.2" has unmet peer dependency "webpack@4.x.x || 5.x.x".
warning "@osd/interpreter > webpack-cli > @webpack-cli/configtest@1.1.1" has unmet peer dependency "webpack@4.x.x || 5.x.x".
warning "@osd/ui-framework > @osd/optimizer > clean-webpack-plugin@3.0.0" has unmet peer dependency "webpack@*".
warning "@osd/ui-framework > @osd/optimizer > @osd/ui-shared-deps > compression-webpack-plugin@4.0.1-rc.1" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/ui-framework > @osd/optimizer > terser-webpack-plugin@2.3.8" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/ui-framework > @osd/optimizer > file-loader@6.2.0" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/ui-framework > @osd/optimizer > postcss-loader@4.3.0" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/ui-framework > @osd/optimizer > val-loader@2.1.2" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/ui-framework > @osd/optimizer > @osd/ui-shared-deps > mini-css-extract-plugin@1.6.2" has unmet peer dependency "webpack@^4.4.0 || ^5.0.0".
warning "workspace-aggregator-758bd34b-a94e-45f0-8d7d-fd111e91517b > http-aws-es@6.0.1" has unmet peer dependency "aws-sdk@^2.138.0".
warning "workspace-aggregator-758bd34b-a94e-45f0-8d7d-fd111e91517b > http-aws-es@6.0.1" has incorrect peer dependency "elasticsearch@^15.0.0".
warning "@osd/eslint-import-resolver-opensearch-dashboards > eslint-import-resolver-webpack@0.11.1" has unmet peer dependency "webpack@>=1.11.0".
[5/5] Building fresh packages...

 succ yarn.lock analysis completed without any issues
 info [@osd/cross-platform] running [osd:bootstrap] script
 info [@osd/test-subj-selector] running [osd:bootstrap] script
 info [@osd/utility-types] running [osd:bootstrap] script
 succ [@osd/test-subj-selector] bootstrap complete
 succ [@osd/utility-types] bootstrap complete
 succ [@osd/cross-platform] bootstrap complete
 info [@osd/config-schema] running [osd:bootstrap] script
 info [@osd/std] running [osd:bootstrap] script
 succ [@osd/std] bootstrap complete
 succ [@osd/config-schema] bootstrap complete
 info [@osd/logging] running [osd:bootstrap] script
 info [@osd/utils] running [osd:bootstrap] script
 succ [@osd/logging] bootstrap complete
 succ [@osd/utils] bootstrap complete
 info [@osd/apm-config-loader] running [osd:bootstrap] script
 info [@osd/dev-utils] running [osd:bootstrap] script
 succ [@osd/apm-config-loader] bootstrap complete
 succ [@osd/dev-utils] bootstrap complete
 info [@osd/ace] running [osd:bootstrap] script
 info [@osd/analytics] running [osd:bootstrap] script
 info [@osd/config] running [osd:bootstrap] script
 info [@osd/i18n] running [osd:bootstrap] script
 succ [@osd/analytics] bootstrap complete
 info [@osd/monaco] running [osd:bootstrap] script
 succ [@osd/ace] bootstrap complete
 info [@osd/opensearch-archiver] running [osd:bootstrap] script
 succ [@osd/i18n] bootstrap complete
 info [@osd/opensearch] running [osd:bootstrap] script
 succ [@osd/config] bootstrap complete
 info [@osd/plugin-generator] running [osd:bootstrap] script
 succ [@osd/opensearch] bootstrap complete
 info [@osd/telemetry-tools] running [osd:bootstrap] script
 succ [@osd/telemetry-tools] bootstrap complete
 info [@osd/test] running [osd:bootstrap] script
 succ [@osd/test] bootstrap complete
 succ [@osd/plugin-generator] bootstrap complete
 succ [@osd/opensearch-archiver] bootstrap complete
 succ [@osd/monaco] bootstrap complete
 info [@osd/interpreter] running [osd:bootstrap] script
 info [@osd/ui-shared-deps] running [osd:bootstrap] script
 succ [@osd/interpreter] bootstrap complete
 succ [@osd/ui-shared-deps] bootstrap complete
 info [@osd/optimizer] running [osd:bootstrap] script
 succ [@osd/optimizer] bootstrap complete
 info [@osd/plugin-helpers] running [osd:bootstrap] script
 succ [@osd/plugin-helpers] bootstrap complete
 info [opensearch-dashboards] running [osd:bootstrap] script
 succ [opensearch-dashboards] bootstrap complete
Done in 277.30s.

Hi All,

We are going to stay with node16 for 2.9, and plan to move to node18 for 2.10.
This ensures we have enough time for the upgrade and gives multiple teams time to accommodate the changes.

We can also start the Python3.9 upgrade process as Python3.7 is about to reach eol on 2023/06/27.

Once we go to node18 in 2.10, we will need to remove these OSes from our product's support matrix:

  • RHEL7&CentOS7 / AmazonLinux2 (glibc2.17 / glibc2.26)
  • Ubuntu 16.04 / Ubuntu 18.04 (glibc2.23 / glibc2.27)
    (Note: we still bundle node14 as a fallback, so technically the above can still run, but 14 is eol and we no longer officially support them)

This also requires that the 2.x branch stays on 16.20.0 in .nvmrc, and have no node18 specific code committed until 2.9.0 is released to the public.
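The `.nvmrc` requirement above could be enforced with a small check. This is a hypothetical guard, not something that exists in the repositories: `nvmrc_major` and `branch_is_pinned` are illustrative names, and the sketch assumes `.nvmrc` holds a plain version such as `16.20.0` (optionally prefixed with `v`) rather than an alias.

```python
def nvmrc_major(content: str) -> int:
    """Return the major Node.js version from the text of an .nvmrc file.

    Assumes a numeric version like '16.20.0' or 'v16.20.0'; aliases such as
    'lts/*' are not handled and raise ValueError.
    """
    version = content.strip()
    if version.startswith("v"):
        version = version[1:]
    return int(version.split(".")[0])


def branch_is_pinned(content: str, expected_major: int = 16) -> bool:
    """True if the .nvmrc content pins the expected major version."""
    return nvmrc_major(content) == expected_major


if __name__ == "__main__":
    # Sample contents; 16.20.0 is the version named in this thread for 2.x.
    print(branch_is_pinned("16.20.0\n"))   # 2.x before 2.9.0 is released
    print(branch_is_pinned("18.16.0\n"))   # would signal an early node18 bump
```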

Infra will take care of the node18 upgrade after 2.9.0.

Thanks.

cc: @CEHENKLE @dblock @seanneumann @wbeckler @rednaksi91 @AMoo-Miki @ananzh @kavilla @seraphjiang @vamshin @jmazanec15 @bbarani @hdhalter @krisfreedain

The cmake upgrade to version 3.23.3 in #3706 was initially set up as a follow-up to the al2023/nodejs upgrade, though it now seems to be a 2.9.0 dependency, so it needs to be done soon.

Thanks.

AL2023/NodeJS18 on 2.10.0 Release:

Required Steps

Optional Steps

  • Need to discuss after the 2.10.0 release whether the 1.x line needs to upgrade as well.
    • 1.x will keep using the old centos7 docker images; I have put measures in place to keep it that way
    • 20230912: We will keep 2.x on CentOS7 for OpenSearch to keep k-NN backward compatible (b/c) before fully deprecating old OSes with glibc<2.27/2.28 later
  • Have a separate pipeline to test NodeJS14 backward compatibility (bc).
    • No need, one pipeline is enough for that.
  • Updating

What is the status of the separate pipeline to test node 14 backward compatibility? There are going to be users using old linux versions who can't go past node 14 yet, and it would be useful to know what fixes we can do so as to not unnecessarily break things for those users.

@wbeckler We do not have plans to create a separate pipeline in public Jenkins infra for EOL versions of Node. If needed, this needs to be tackled at repo level.

Had a quick talk with @AMoo-Miki from OSD about core, and we are planning to switch 2.x to 18.16.0 next week to start testing the 2.10.0 build with the newer nodejs.

Thanks.

The majority of the tasks in this issue are resolved and implemented.
We will open new tickets whenever new tasks come up.

Thanks.