[Proposal] Support AL2023/NodeJS18 for OpenSearch/Dashboards Releases
Closed this issue · 33 comments
With the announcement that the Amazon Linux 2023 Preview is available for testing, we want to set up some notes here on supporting AL2023 once it is fully released to the public.
https://aws.amazon.com/about-aws/whats-new/2021/11/preview-amazon-linux-2022/
https://aws.amazon.com/about-aws/whats-new/2023/03/amazon-linux-2023/
- Evaluate the compatibility of AL2023 when running OpenSearch/Dashboards
- OpenSearch/Dashboards latest binary can run on AL2023 with the same behavior as on AL2
- Working alongside plugin teams to ensure all the plugins are running as expected
- Performance testing to ensure there is no performance regression
- Evaluate the package/dependencies availability on AL2023 default repository
- Evaluate the size of the image vs contents integrity
- Evaluate AL2023 availability
- AL2023 available on docker for both x64 and arm64
- AL2023 available on EC2 for both x64 and arm64 for testing
- Evaluate Build Process
- What level of changes do we need to support AL2023 with the current build process and scripts?
- Can we support both AL2 and AL2023 in the same system?
- Build tool related libs/packages/dependencies availability on AL2023
- Evaluate deprecation plan for older OS
- When would we deprecate AL2 while supporting AL2023?
- Will this introduce breaking changes for customers?
- Would we support older versions of OpenSearch/Dashboards with the new OS?
We welcome more discussion on this.
Thanks.
I really don't know enough about this @peterzhuamazon. What are pros/cons?
Now, with this issue on the horizon, we need to consider how to proceed with the changes.
This implies:
- We need to deprecate the ci image that utilizes centos7 or al2.
- Our ci and release docker images need to transition from centos/al2 to al2023.
- #3351
- The Python upgrade from 3.7 to 3.9 needs to happen.
- AL2 has Python 3.7 as the default, but AL2023 already includes 3.9. Compiling Python 3.7 on AL2023 would take significant time.
- Python 3.7 support will reach its EOL in 1 month.
- If we continue using the same ci image for 1.x / 2.x / 3.x, KNN will crash for users on older operating systems (with an older glibc than the one on al2023). centos7 offers the best compatibility for KNN because it has the oldest glibc version, but that glibc does not support Node18.
- opendistro-for-elasticsearch/k-NN#169
- Unless we create a new jenkinsfile specifically for 1.x, we cannot extensively modify the existing dist-build-dashboards jenkinsfile, as it is already close to the Jenkinsfile max size limit.
- Introducing a new jenkinsfile solely for 1.x will cause issues due to many hardcoded dir names as well as structures on S3.
- One possible approach to avoid this issue is to use centos7 for building OS and use al2023 for building OSD, although this approach lacks standardization and is hard to maintain.
- This change needs to occur after the completion of the Jenkins Upgrade.
- This is a significant breaking change, and it seems that we want to backport it to 2.x due to the deprecation of node14.
Tagging @vamshin @seanneumann @AMoo-Miki @ananzh @jmazanec15 for visibility.
For the Node 18 upgrade, we're targeting 2.8. This is important as Node 14 will no longer receive security updates (as of April 2023). Is there an ask to delay this to solve the above?
KNN support on the native code compilation:
node: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by node)
node: /lib64/libc.so.6: version `GLIBC_2.25' not found (required by node)
node: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by node)
node: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by node)
node: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by node)
node: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by node)
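The loader errors above can be reduced to the highest missing symbol version per library family, which is the number that matters when choosing a base image. The following is a minimal sketch, assuming the errors follow exactly the format shown; the function name `highest_missing_versions` is hypothetical and not part of any project script.

```python
import re
from collections import defaultdict
from typing import Dict

# Matches loader errors of the form shown above, for example:
#   node: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by node)
_ERROR_RE = re.compile(r"version `([A-Z]+)_([0-9.]+)' not found")


def highest_missing_versions(log: str) -> Dict[str, str]:
    """Return the highest missing version for each symbol family in the log."""
    found = defaultdict(list)
    for family, version in _ERROR_RE.findall(log):
        found[family].append(version)
    # Compare versions numerically so that 3.4.21 sorts above 3.4.20.
    return {
        family: max(versions, key=lambda v: tuple(int(p) for p in v.split(".")))
        for family, versions in found.items()
    }


SAMPLE = """\
node: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by node)
node: /lib64/libc.so.6: version `GLIBC_2.25' not found (required by node)
node: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by node)
node: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by node)
node: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by node)
node: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by node)
"""

if __name__ == "__main__":
    print(highest_missing_versions(SAMPLE))
    # {'GLIBC': '2.28', 'CXXABI': '1.3.9', 'GLIBCXX': '3.4.21'}
```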
# CentOS7
ldd (GNU libc) 2.17
# AL2
ldd (GNU libc) 2.26
# CentOS8/RockyLinux8
ldd (GNU libc) 2.28
# AL2023
ldd (GNU libc) 2.34
Seems like if we want to switch from CentOS7 to CentOS8/RockyLinux8, we will need to remove support of CentOS7 and AL2 from our support matrix on our website as well, since CentOS8/RockyLinux8 is the baseline of node18 with glibc 2.28.
And if we compile knn on 2.28 then the lib will crash on glibc that is lower than 2.28.
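To make the trade-off concrete, here is a small sketch that lists which of the hosts above could load a binary built against a given glibc. It uses only the `ldd` values quoted in this comment and assumes the simplified rule used in this thread (a binary built against glibc X needs a host with glibc X or newer); the names `GLIBC_BY_OS` and `runnable_on` are made up for illustration.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int]

# glibc versions as reported by ldd in the comment above.
GLIBC_BY_OS: Dict[str, Version] = {
    "CentOS7": (2, 17),
    "AL2": (2, 26),
    "CentOS8/RockyLinux8": (2, 28),
    "AL2023": (2, 34),
}


def runnable_on(built_against: Version,
                hosts: Dict[str, Version] = GLIBC_BY_OS) -> List[str]:
    """Hosts whose glibc is at least the version the binary was built against."""
    return [name for name, version in hosts.items() if version >= built_against]


if __name__ == "__main__":
    # Built on CentOS7 (glibc 2.17): loads everywhere in the table.
    print(runnable_on((2, 17)))
    # Built on RockyLinux8 (glibc 2.28): CentOS7 and AL2 drop out.
    print(runnable_on((2, 28)))
```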
cc: @krisfreedain
@vamshin @jmazanec15 Can you please provide your inputs as well?
we will need to remove support of CentOS7 and AL2 from our support matrix on our website as well, since CentOS8/RockyLinux8 is the baseline of node18 with glibc 2.28
I would agree @peterzhuamazon - if we test on CentOS8/RockyLinux8 the compatibility matrix needs to be updated to accurately reflect that for the community
Hi @peterzhuamazon we will bump the node.js to version 18.16.0
@peterzhuamazon want to clarify, you are saying that node 18 requires 2.28, correct? Then doesn't this break compatibility with AL2 and centos 7? So, regardless of k-NN libs, upgrading to Node 18 would require us to remove centos 7 and AL2 from the compatibility matrix.
If this is not the case, we could also elect to compile k-NN libs in separate docker image to maintain compatibility.
@peterzhuamazon not sure if you saw this opensearch-project/OpenSearch-Dashboards#3601 (comment)
unlike chromium, it is possible to do a custom build of nodejs on centos 7 to lower glibc dependencies (I also saw musl builds somewhere, which can be another possibility to work around this issue). I've not looked into performance or other issues that might be caused by downgrading glibc, but if OS upgrade is difficult maybe this would be easier?
Did anyone confirm node 18 with AL2? @seanneumann
We noticed some glibc related compatibility issues. Not sure if it's an isolated incident.
@peterzhuamazon not sure if you saw this opensearch-project/OpenSearch-Dashboards#3601 (comment)
unlike chromium, it is possible to do a custom build of nodejs on centos 7 to lower glibc dependencies (I also saw musl builds somewhere, which can be another possibility to work around this issue). I've not looked into performance or other issues that might be caused by downgrading glibc, but if OS upgrade is difficult maybe this would be easier?
That is likely not an option because CentOS7 is going to be out of support in 1 year, and we are also on track to move to al2023 anyway. Manually rebuilding nodejs18 would add another dependency on our side, which is not going to scale in the long run.
Thanks.
Did anyone confirm node 18 with AL2? @seanneumann
We noticed some glibc related compatibility issues. Not sure if it's an isolated incident.
Node 18 will not run on AL2; I tried it.
OSD 2.8 will be compatible with Node 14, 16, and 18. OSD 2.8 will also bundle Node 18 and a Node 14 for fallback; with this OSD 2.8 can run on AL2 by using the bundled Node 14.
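The fallback idea described above can be sketched as a glibc check that picks between the two bundled runtimes. This is only an illustration under stated assumptions, not the actual OSD `scripts/use_node` logic: the 2.28 threshold is the baseline discussed in this thread, and the labels `node18` and `node14` are placeholders.

```python
import platform
from typing import Optional, Tuple

# Assumption: glibc 2.28 is the Node 18 baseline, as discussed in this thread.
NODE18_MIN_GLIBC = (2, 28)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '2.26' into a tuple of integers."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def pick_bundled_node(glibc_version: Optional[str]) -> str:
    """Return 'node18' when the host glibc is new enough, otherwise 'node14'."""
    if glibc_version and parse_version(glibc_version) >= NODE18_MIN_GLIBC:
        return "node18"
    return "node14"


if __name__ == "__main__":
    # platform.libc_ver() returns ('glibc', '<version>') on glibc-based hosts
    # and empty strings when it cannot determine the C library.
    lib, version = platform.libc_ver()
    print(lib, version, "->", pick_bundled_node(version if lib == "glibc" else None))
```

Falling back to the older runtime when the version is unknown is a design choice in this sketch; it favors starting up over using the newer Node.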
@peterzhuamazon want to clarify, you are saying that node 18 requires 2.28, correct? Then doesn't this break compatibility with AL2 and centos 7? So, regardless of k-NN libs, upgrading to Node 18 would require us to remove centos 7 and AL2 from the compatibility matrix.
If this is not the case, we could also elect to compile k-NN libs in separate docker image to maintain compatibility.
- It does, al2/centos7 have older libs than 2.28.
- Agreed that we need to move out of centos7 and al2 at some point.
Did anyone confirm node 18 with AL2? @seanneumann
We noticed some glibc related compatibility issues. Not sure if it's an isolated incident.
Node 18 will not run on AL2; I tried it.
OSD 2.8 will be compatible with Node 14, 16, and 18. OSD 2.8 will also bundle Node 18 and a Node 14 for fallback; with this OSD 2.8 can run on AL2 by using the bundled Node 14.
If node14 is out of support, can we even bundle 14 alongside 18?
@peterzhuamazon I hate it but being a minor release, we have to.
Here is an update on the situation:
AL2023 / NodeJS18 Upgrade Status on 2.8.0 Release
Background
With the release of OpenSearch Project 2.8.0 approaching on 2023/06/06, and the end-of-life for NodeJS 14 just three weeks ago on 2023/04/30, we want to review the current situation regarding the necessary upgrade to the latest LTS NodeJS 18.
This upgrade will require significant changes, and our goal is to lay out the steps for each stakeholder involved in the release, review essential items, and find a path forward.
We have a related issue opened to capture all required actions, and synced some part of the comments to this quip:
#1563
Status
- OSD Team: @AMoo-Miki @ananzh
- NodeJS 18 Announcement:
- NodeJS 18 PR raised: opensearch-project/OpenSearch-Dashboards#4071
- NodeJS 14 BC PR raised: https://github.com/AMoo-Miki/OpenSearch-Dashboards/tree/n18wp4-multi-node
- Plan to push the change to main/3.x, then backport to 2.x, which would impact the upcoming 2.8.0.
- Plan to support NodeJS 14/16/18 on above branches
- Plan to use 18.16.0 for the upcoming release
- Need to start a campaign with all OSD plugin owners to upgrade to 18?
- NodeJS 18 requires the host to have glibc 2.27 (presumably 2.28 per official NodeJS):
- NodeJS 18 revert to building on CentOS 7, RHEL 7, Ubuntu Bionic 18.04, other LTS distros nodejs/node#43246
- Only unofficial builds of NodeJS 18 support older glibc versions
- EE Team: @peterzhuamazon
- CI Images:
- Currently both OS and OSD are using CentOS7 docker images to build and assemble into final artifacts
- CentOS7 has glibc 2.17 which does not support NodeJS 18
- We need to move from CentOS7 to RockyLinux8 (not CentOS8, as CentOS has changed its release model from downstream to upstream, dev-focused releases)
- If possible, we can go ahead and move to AL2023 in case the glibc requirement changes again
- This also means we need to upgrade the opensearch-build repository from Python 3.7 to Python 3.9, as AL2023 defaults to Python 3.9, while RL8 defaults to 3.6, just like CentOS7.
- Migrate build repo code from using Python 3.7 to 3.9 version #3351
- Python 3.7 support will reach its eol in 1 month (2023/06/27): https://endoflife.date/python
- New docker files need to be written and tested, and new images need to be built
- EE uses an AL2 AMI as the docker host. Although docker containerization means the host should not affect container behavior, we still need to verify this.
- If needed, new AMI configurations need to be set up, a new AMI needs to be built, and Jenkins needs to be redeployed with all changes
- Windows NodeJS 18 support on AMI
- Release Images:
- Currently both OS and OSD are using AL2 base images to encapsulate release artifacts for docker
- AL2 has glibc 2.26 which does not support NodeJS 18
- We need to move from AL2 to AL2023 which has glibc 2.34
- Need to test the docker images properly so they show exactly the same behavior as before the upgrade
- Release pipeline:
- Infra needs to have specific setup just to support 1.x branch OS / OSD building process
- Initially all 3 lines of artifacts are built on the same docker ci images, now we might need to maintain more than one image for different lines
- k-NN Team: @jmazanec15
- Since ODFE, we have specifically built k-NN on CentOS7 because it has the oldest glibc version, 2.17.
- This ensures that the k-NN lib will not crash on any glibc version at or above 2.17.
- If we build k-NN nmslib on RockyLinux8, which has glibc 2.28, then k-NN nmslib will crash on CentOS7/AL2, which have older versions of glibc.
- This also means we need to remove support of CentOS7/AL2 from the compatibility matrix and add RockyLinux 8 / AL2023.
Approaches
Both OS and OSD keep CentOS7/AL2 for the 2.8.0 release, with 16 in build ci and 14/18 bundled in artifacts, then move everything to RockyLinux8/AL2023 and NodeJS18 in later releases.
- Pros:
- No changes on OS/OSD base image until needed
- No k-NN issues: keep supporting glibc 2.17 for now, 2.28 later
- More time available to upgrade both OS and OSD, on CI and Release
- More time for the infra pipeline changes on Jenkins
- Fewer changes for customers to consume in a minor/patch version change
- The 2.8.0 release would have fewer changes close to the freeze date
- Cons:
- The compatibility matrix would need to be updated to RockyLinux8 and above
- Still need to maintain a 1.x-specific release outside of 2.x/3.x, which means more work and temporary scripts
More questions:
- Possibility of adding 16 for 2.8.0, then 18 for 2.9.0 onward (ans: do not bundle 16, as 16 / 18 are much more similar than 14 / 18).
- Need to check whether we bundle both 14 and 18 in 2.8.0, with automatic switch/fallback options if the host does not support 18?
- Does 2.x still use 16 to build and main use 18 to build?
- Later on, users must use nodejs18 and cannot use 14 anymore, more modernized changes?
- Keep our build system on 16 so that it stays secure in 2.8.0 until the move to 18?
- Possible RockyLinux8 and above as new compatibility matrix?
Thanks.
The tested Node.js 16 version is 16.20.0.
PRs support NodeJS16:
- #3552
- #3560
- #3558
- opensearch-project/opensearch-ci#291
- opensearch-project/opensearch-ci#292
- opensearch-project/opensearch-ci#293
- opensearch-project/opensearch-ci#295
- opensearch-project/opensearch-ci#296
- opensearch-project/opensearch-ci#298
- opensearch-project/opensearch-ci#299
- opensearch-project/opensearch-ci#300
- opensearch-project/opensearch-ci#301
- opensearch-project/opensearch-ci#306
- opensearch-project/opensearch-ci#317
- opensearch-project/opensearch-ci#319
- opensearch-project/opensearch-ci#320
- #3711
- #3714
All docker images updated and synced:
https://build.ci.opensearch.org/job/docker-copy/502/console
Next is AMI.
These three will need their base OS changed to rockylinux8 before adding node18.
These are the only ones that do not have node18 on docker images; AMIs are not affected.
docker/ci/dockerfiles/current/build.centos7.opensearch-dashboards.x64.arm64.dockerfile
docker/ci/dockerfiles/current/release.centos.clients.x64.arm64.dockerfile
docker/ci/dockerfiles/current/test.centos7.performance-test.x64.arm64.dockerfile
Thanks.
New Bug:
Turns out we need a gcc version higher than the 4.8.x that is bundled in the centos7 default repos:
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Rockylinux:8
gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-18)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Will switch to devtoolset8 on centos7 to match rockylinux:8, which we will eventually switch to.
Using scl and devtoolset might be a problem because it requires running source scl_source enable devtoolset8, either through rc or through profile.
Since we lock docker images to a non-interactive shell, this is not the best option.
Exploring manually compiling gcc 8 from source.
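A check like the one below could guard an image build against the old compiler. It is a rough sketch that assumes the Red Hat style banner shown above (`gcc (GCC) X.Y.Z ...`); other distributions print a different prefix, and the helper names are hypothetical.

```python
import re
from typing import Tuple


def gcc_version(banner: str) -> Tuple[int, ...]:
    """Extract (major, minor, patch) from the first line of `gcc --version`."""
    match = re.search(r"\(GCC\) (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("not a recognised gcc --version banner")
    return tuple(int(group) for group in match.groups())


def newer_than_4_8(banner: str) -> bool:
    """True when the compiler is newer than the 4.8.x series."""
    return gcc_version(banner)[:2] > (4, 8)


if __name__ == "__main__":
    centos7 = "gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)"
    rocky8 = "gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-18)"
    print(gcc_version(centos7), newer_than_4_8(centos7))
    print(gcc_version(rocky8), newer_than_4_8(rocky8))
```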
At least confirmed that on gcc8 we can yarn bootstrap:
$ yarn osd bootstrap
yarn run v1.22.19
$ scripts/use_node scripts/osd bootstrap
info [opensearch-dashboards] running yarn
$ scripts/use_node ./preinstall_check
[1/5] Validating package.json...
[2/5] Resolving packages...
warning Resolution field "typescript@4.0.2" is incompatible with requested version "typescript@~4.5.2"
[3/5] Fetching packages...
[4/5] Linking dependencies...
warning " > sass-loader@10.4.1" has unmet peer dependency "webpack@^4.36.0 || ^5.0.0".
warning "workspace-aggregator-758bd34b-a94e-45f0-8d7d-fd111e91517b > osd_tp_run_pipeline > @elastic/eui@1.1.1" has incorrect peer dependency "typescript@^4.0.5".
warning "@osd/ace > raw-loader@4.0.2" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/interpreter > babel-loader@8.2.4" has unmet peer dependency "webpack@>=2".
warning "@osd/interpreter > copy-webpack-plugin@6.4.1" has unmet peer dependency "webpack@^4.37.0 || ^5.0.0".
warning "@osd/interpreter > css-loader@5.2.7" has unmet peer dependency "webpack@^4.27.0 || ^5.0.0".
warning "@osd/interpreter > style-loader@1.3.0" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/interpreter > url-loader@2.3.0" has unmet peer dependency "webpack@^4.0.0".
warning "@osd/interpreter > webpack-cli@4.9.2" has unmet peer dependency "webpack@4.x.x || 5.x.x".
warning "@osd/interpreter > webpack-cli > @webpack-cli/configtest@1.1.1" has unmet peer dependency "webpack@4.x.x || 5.x.x".
warning "@osd/ui-framework > @osd/optimizer > clean-webpack-plugin@3.0.0" has unmet peer dependency "webpack@*".
warning "@osd/ui-framework > @osd/optimizer > @osd/ui-shared-deps > compression-webpack-plugin@4.0.1-rc.1" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/ui-framework > @osd/optimizer > terser-webpack-plugin@2.3.8" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/ui-framework > @osd/optimizer > file-loader@6.2.0" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/ui-framework > @osd/optimizer > postcss-loader@4.3.0" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/ui-framework > @osd/optimizer > val-loader@2.1.2" has unmet peer dependency "webpack@^4.0.0 || ^5.0.0".
warning "@osd/ui-framework > @osd/optimizer > @osd/ui-shared-deps > mini-css-extract-plugin@1.6.2" has unmet peer dependency "webpack@^4.4.0 || ^5.0.0".
warning "workspace-aggregator-758bd34b-a94e-45f0-8d7d-fd111e91517b > http-aws-es@6.0.1" has unmet peer dependency "aws-sdk@^2.138.0".
warning "workspace-aggregator-758bd34b-a94e-45f0-8d7d-fd111e91517b > http-aws-es@6.0.1" has incorrect peer dependency "elasticsearch@^15.0.0".
warning "@osd/eslint-import-resolver-opensearch-dashboards > eslint-import-resolver-webpack@0.11.1" has unmet peer dependency "webpack@>=1.11.0".
[5/5] Building fresh packages...
succ yarn.lock analysis completed without any issues
info [@osd/cross-platform] running [osd:bootstrap] script
info [@osd/test-subj-selector] running [osd:bootstrap] script
info [@osd/utility-types] running [osd:bootstrap] script
succ [@osd/test-subj-selector] bootstrap complete
succ [@osd/utility-types] bootstrap complete
succ [@osd/cross-platform] bootstrap complete
info [@osd/config-schema] running [osd:bootstrap] script
info [@osd/std] running [osd:bootstrap] script
succ [@osd/std] bootstrap complete
succ [@osd/config-schema] bootstrap complete
info [@osd/logging] running [osd:bootstrap] script
info [@osd/utils] running [osd:bootstrap] script
succ [@osd/logging] bootstrap complete
succ [@osd/utils] bootstrap complete
info [@osd/apm-config-loader] running [osd:bootstrap] script
info [@osd/dev-utils] running [osd:bootstrap] script
succ [@osd/apm-config-loader] bootstrap complete
succ [@osd/dev-utils] bootstrap complete
info [@osd/ace] running [osd:bootstrap] script
info [@osd/analytics] running [osd:bootstrap] script
info [@osd/config] running [osd:bootstrap] script
info [@osd/i18n] running [osd:bootstrap] script
succ [@osd/analytics] bootstrap complete
info [@osd/monaco] running [osd:bootstrap] script
succ [@osd/ace] bootstrap complete
info [@osd/opensearch-archiver] running [osd:bootstrap] script
succ [@osd/i18n] bootstrap complete
info [@osd/opensearch] running [osd:bootstrap] script
succ [@osd/config] bootstrap complete
info [@osd/plugin-generator] running [osd:bootstrap] script
succ [@osd/opensearch] bootstrap complete
info [@osd/telemetry-tools] running [osd:bootstrap] script
succ [@osd/telemetry-tools] bootstrap complete
info [@osd/test] running [osd:bootstrap] script
succ [@osd/test] bootstrap complete
succ [@osd/plugin-generator] bootstrap complete
succ [@osd/opensearch-archiver] bootstrap complete
succ [@osd/monaco] bootstrap complete
info [@osd/interpreter] running [osd:bootstrap] script
info [@osd/ui-shared-deps] running [osd:bootstrap] script
succ [@osd/interpreter] bootstrap complete
succ [@osd/ui-shared-deps] bootstrap complete
info [@osd/optimizer] running [osd:bootstrap] script
succ [@osd/optimizer] bootstrap complete
info [@osd/plugin-helpers] running [osd:bootstrap] script
succ [@osd/plugin-helpers] bootstrap complete
info [opensearch-dashboards] running [osd:bootstrap] script
succ [opensearch-dashboards] bootstrap complete
Done in 277.30s.
Hi All,
We are going to stay with node16 for 2.9, and plan to move to node18 for 2.10.
This gives us enough time for the upgrade and accommodates the changes needed across multiple teams.
We can also start the Python3.9 upgrade process, as Python3.7 reaches EOL on 2023/06/27.
Once we go to node18 in 2.10, the support matrix of our product will need to remove these OSes:
- RHEL7&CentOS7 / AmazonLinux2 (glibc2.17 / glibc2.26)
- Ubuntu 16.04 / Ubuntu 18.04 (glibc2.23 / glibc2.27)
(Note: we still bundle node14 as a fallback, so technically the above can still run, but 14 is EOL and we no longer officially support them)
This also requires that the 2.x branch stays on 16.20.0 in .nvmrc and has no node18-specific code committed until 2.9.0 is released to the public.
Infra will take care of the node18 upgrade after 2.9.0.
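The .nvmrc pin could be verified with a small check such as the following. This is a sketch under assumptions: it expects .nvmrc to contain a full `major.minor.patch` version (optionally prefixed with `v`), not an alias, and the function names are invented for illustration.

```python
from typing import Tuple

# Version the 2.x branch is expected to stay on until 2.9.0 is released.
PINNED_FOR_2X = (16, 20, 0)


def parse_nvmrc(text: str) -> Tuple[int, int, int]:
    """Parse .nvmrc content such as '16.20.0' or 'v16.20.0' into integers."""
    cleaned = text.strip().lstrip("v")
    major, minor, patch = (int(part) for part in cleaned.split("."))
    return major, minor, patch


def matches_pin(text: str, pinned: Tuple[int, int, int] = PINNED_FOR_2X) -> bool:
    """True when the .nvmrc content matches the pinned Node.js version."""
    return parse_nvmrc(text) == pinned


if __name__ == "__main__":
    print(matches_pin("16.20.0\n"))  # True
    print(matches_pin("v18.16.0"))   # False
```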
Thanks.
cc: @CEHENKLE @dblock @seanneumann @wbeckler @rednaksi91 @AMoo-Miki @ananzh @kavilla @seraphjiang @vamshin @jmazanec15 @bbarani @hdhalter @krisfreedain
The cmake upgrade to version 3.23.3 in #3706 was initially set up as a follow-up to the al2023/nodejs upgrade, though it now seems to be a 2.9.0 dependency, so it needs to be done soon.
Thanks.
AL2023/NodeJS18 on 2.10.0 Release:
Required Steps
- Update all AMIs to have NodeJS14/16/18 support
- This can hold on a bit as the MacOS build is not on OSD and we are converting Windows to containers:
- opensearch-project/opensearch-ci#281
- Update all CI docker images to have NodeJS14/16/18 support
- Update all Release docker images to use NodeJS18 as the base node version for OSD
- Switch OSD core to have 18.16.0 version
- Deprecate all CentOS7 images in build systems and replace with the RockyLinux8 images (Except client release image as it does not require nodejs18 by default for now)
- Work with OSD team members to ensure all OSD core and plugins are locked to NodeJS 18.16.0, at least on the 2.x/2.10 branches. 1.x can remain pending.
- To my understanding all scripts are using the nvmrc file in OSD for nodejs version in Jenkins build.
- Since KNN has already updated dependencies such as gcc to v7 and cmake to 3.23.3, it is easier for us to bring them onto either Rockylinux8 or AL2023. From now on, k-NN will not be able to run on an OS with a glibc version lower than 2.28 (previously 2.17)
- Work with doc team to update the doc compatibility charts to remove CentOS7 support and add AL2023
- opensearch-project/documentation-website#4303
- 20230912: We have decided to restore the b/c of k-NN on 2.10.0 and will revisit the deprecation of certain OS later
- Test all the pipelines on the new servers.
Optional Steps
- Need to discuss after the 2.10.0 release whether the 1.x line needs to upgrade as well.
- 1.x will keep using the old centos7 docker images, as I have set measures to keep it that way
- 20230912: We will keep 2.x on CentOS7 for OpenSearch to preserve backward compatibility (b/c) for k-NN before fully deprecating old OSes with glibc<2.27/2.28 later
- Have a separate pipeline to test NodeJS14 backward compatibility.
- No need, one pipeline is enough for that.
- Updating
What is the status of the separate pipeline to test node 14 backward compatibility? There are going to be users using old linux versions who can't go past node 14 yet, and it would be useful to know what fixes we can do so as to not unnecessarily break things for those users.
@wbeckler We do not have plans to create a separate pipeline in public Jenkins infra for EOL versions of Node. If needed, this needs to be tackled at repo level.
Had a quick talk with @AMoo-Miki from OSD about core, and we are planning to switch 2.x to 18.16.0 next week to start testing the 2.10.0 build with the newer nodejs.
Thanks.
The majority of the tasks in this issue have been resolved and implemented.
We will open new tickets whenever new tasks come up.
Thanks.