apache/pulsar-client-node

Advice for how to install this package in a Docker container

ericallam opened this issue · 7 comments

I'm using Pulsar in trigger.dev and I'm trying to deploy a Node.js web service that has a dependency on this package in a Docker container. I'm running into issues trying to get this dependency to work and I was wondering if there is any guidance on the best way to build a Docker container with this package.

Here is the Dockerfile for reference. It builds successfully, but the container fails once I try running it:

FROM node:lts-bullseye-slim AS pruner
RUN apt-get update && apt-get install openssl -y
WORKDIR /app
RUN npm install turbo -g
COPY . .
RUN turbo prune --scope=webapp --docker
RUN find . -name "node_modules" -type d -prune -exec rm -rf '{}' +

# Base strategy to have layer caching
FROM node:lts-bullseye-slim AS base
RUN apt-get update && apt-get install openssl g++ make wget python3 -y
ENV PULSAR_CPP_CLIENT_VERSION=2.10.3
RUN wget https://archive.apache.org/dist/pulsar/pulsar-${PULSAR_CPP_CLIENT_VERSION}/DEB/apache-pulsar-client.deb -q
RUN wget https://archive.apache.org/dist/pulsar/pulsar-${PULSAR_CPP_CLIENT_VERSION}/DEB/apache-pulsar-client-dev.deb -q
RUN dpkg -i --force-architecture ./apache-pulsar-client*.deb
WORKDIR /app
COPY .gitignore .gitignore
COPY --from=pruner /app/out/json/ .
COPY --from=pruner /app/out/pnpm-lock.yaml ./pnpm-lock.yaml
COPY --from=pruner /app/out/pnpm-workspace.yaml ./pnpm-workspace.yaml

FROM base AS production-deps
WORKDIR /app
RUN npm install turbo -g
RUN corepack enable
ENV NODE_ENV production
RUN npm config set python /usr/bin/python3
COPY --from=pruner /app/out/full/apps/webapp/prisma/schema.prisma /app/apps/webapp/prisma/schema.prisma
RUN pnpm install --prod --frozen-lockfile
RUN pnpx prisma generate --schema /app/apps/webapp/prisma/schema.prisma

FROM base AS builder
WORKDIR /app
RUN npm install turbo -g
COPY turbo.json turbo.json
RUN corepack enable
COPY --from=pruner /app/out/full/ .
RUN npm config set python /usr/bin/python3
ENV NODE_ENV development
RUN pnpm install --ignore-scripts --frozen-lockfile
RUN pnpm run generate
RUN pnpm run build --filter=webapp...

# Runner
FROM node:lts-bullseye-slim AS runner
RUN apt-get update && apt-get install openssl g++ make wget python3 -y
ENV PULSAR_CPP_CLIENT_VERSION=2.10.3
RUN wget https://archive.apache.org/dist/pulsar/pulsar-${PULSAR_CPP_CLIENT_VERSION}/DEB/apache-pulsar-client.deb -q
RUN wget https://archive.apache.org/dist/pulsar/pulsar-${PULSAR_CPP_CLIENT_VERSION}/DEB/apache-pulsar-client-dev.deb -q
RUN dpkg -i --force-architecture ./apache-pulsar-client*.deb
RUN npm install turbo -g
WORKDIR /app
RUN corepack enable
ENV NODE_ENV production
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 remixjs
RUN chown -R remixjs:nodejs /app
USER remixjs

COPY --from=pruner --chown=remixjs:nodejs /app/out/full/ .
COPY --from=production-deps --chown=remixjs:nodejs /app .
COPY --from=builder --chown=remixjs:nodejs /app/apps/webapp/app/styles/tailwind.css ./apps/webapp/app/styles/tailwind.css
COPY --from=builder --chown=remixjs:nodejs /app/apps/webapp/build/server.js ./apps/webapp/build/server.js
COPY --from=builder --chown=remixjs:nodejs /app/apps/webapp/build ./apps/webapp/build
COPY --from=builder --chown=remixjs:nodejs /app/apps/webapp/public ./apps/webapp/public
COPY --from=builder --chown=remixjs:nodejs /app/apps/webapp/prisma/schema.prisma ./apps/webapp/build/schema.prisma
COPY --from=builder --chown=remixjs:nodejs /app/apps/webapp/prisma/migrations ./apps/webapp/build/migrations
COPY --from=builder --chown=remixjs:nodejs /app/apps/webapp/node_modules/.prisma/client/libquery_engine-debian-openssl-1.1.x.so.node ./apps/webapp/build/libquery_engine-debian-openssl-1.1.x.so.node

# release_command = "pnpx prisma migrate deploy --schema apps/webapp/prisma/schema.prisma"
ENTRYPOINT ["pnpm", "--filter", "webapp", "run", "start"]

I'm not really getting any helpful logs (even though I've turned on all of Pulsar's logging):

7:59:22 AM: 📡 Connected to pulsar at pulsar+ssl://cluster1.o-syacq.snio.cloud:6651
7:59:22 AM: [07:59:22.675] [trigger.dev publisher]  Initializing publisher with config {"topic":"persistent://triggerdotdev/workflows/run-command-responses"}
7:59:22 AM: undefined
7:59:22 AM: /app/apps/webapp:
7:59:22 AM:  ERR_PNPM_RECURSIVE_RUN_FIRST_FAIL  webapp@1.0.0 start: `cross-env NODE_ENV=production node ./build/server.js`
7:59:22 AM: Exit status 1

It looks like possibly a segfault but will keep investigating.

I just confirmed that I am getting a segfault (I pulled the image and ran the following code in a Node REPL):

const Pulsar = require("pulsar-client");
Pulsar.Client.setLogHandler((level, file, line, message) => { console.log("[%s][%s:%d] %s", level, file, line, message); });
var authentication = new Pulsar.AuthenticationOauth2({
  type: "sn_service_account",
  client_id: process.env.PULSAR_CLIENT_ID,
  client_secret: process.env.PULSAR_CLIENT_SECRET,
  issuer_url: process.env.PULSAR_ISSUER_URL,
  audience: process.env.PULSAR_AUDIENCE,
});

var client = new Pulsar.Client({
  serviceUrl: process.env.PULSAR_SERVICE_URL,
  authentication: authentication,
});

client.createProducer({ topic: "persistent://public/default/inside-docker" }).then((producer) => { console.log("created producer"); }).catch(console.error);

And got the following error (no other output):

Segmentation fault (core dumped)

So I'm obviously doing something wrong with the installation procedure, but I'm not having any luck finding it. This is happening with both 1.7.0 and 1.8.0.
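Because the container logs only show a generic "Exit status 1", one way to make a native crash visible is to run the suspect code in a child process and inspect the signal that terminated it. The following is a minimal sketch using only Node's built-in `child_process` module; the helper name `runIsolated` and the placeholder source string are illustrative assumptions, not part of pulsar-client.

```javascript
const { spawnSync } = require("child_process");

// Run a snippet of JavaScript in a separate Node.js process and report how it
// ended. If the child is killed by a signal (for example SIGSEGV from a native
// addon), `signal` holds the signal name and `status` is null.
function runIsolated(source) {
  const result = spawnSync(process.execPath, ["-e", source], { encoding: "utf8" });
  return {
    status: result.status,
    signal: result.signal,
    stdout: result.stdout,
    stderr: result.stderr,
  };
}

// Hypothetical usage: replace the source string with the code under test,
// e.g. a script that requires "pulsar-client" and creates a producer.
const outcome = runIsolated('console.log("ok")');
console.log(
  outcome.signal
    ? `child was killed by ${outcome.signal}`
    : `child exited with status ${outcome.status}`
);
```

Running the check in a child process keeps the parent alive, so the result can be logged even when the addon brings down the process that loaded it.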

shibd commented

Hi @ericallam, you can add the following code and share the segmentation fault info:

const SegfaultHandler = require('segfault-handler');

SegfaultHandler.registerHandler('crash.log');

Here's the segfault log:

8:25:40 PM: Connecting to pulsar instance at pulsar+ssl://<cluster>.<orgId>.snio.cloud:6651...
8:25:40 PM: 📡 Connected to pulsar at pulsar+ssl://<cluster>.<orgId>.snio.cloud:6651
8:25:40 PM: [20:25:40.931] [trigger.dev publisher]  Initializing publisher with config {"topic":"persistent://triggerdotdev/workflows/run-command-responses"}
8:25:40 PM: PID 55 received SIGSEGV for address: 0x0
8:25:40 PM: /app/node_modules/.pnpm/segfault-handler@1.3.0/node_modules/segfault-handler/build/Release/segfault-handler.node(+0x372d)[0x7f06215af72d]
8:25:40 PM: /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f0628a4b520]
8:25:40 PM: /lib/libpulsar.so.2.10.3(SSL_get_peer_certificate+0x12)[0x7f061b8311e2]
8:25:40 PM: /lib/libpulsar.so.2.10.3(+0x6137af)[0x7f061b8137af]
8:25:40 PM: /lib/libpulsar.so.2.10.3(+0x61545b)[0x7f061b81545b]
8:25:40 PM: /lib/libpulsar.so.2.10.3(+0x5e96de)[0x7f061b7e96de]
8:25:40 PM: /lib/libpulsar.so.2.10.3(+0x5eebe2)[0x7f061b7eebe2]
8:25:40 PM: /lib/libpulsar.so.2.10.3(+0x5d8c8e)[0x7f061b7d8c8e]
8:25:40 PM: /lib/libpulsar.so.2.10.3(curl_multi_perform+0x93)[0x7f061b7d9c93]
8:25:40 PM: /lib/libpulsar.so.2.10.3(curl_easy_perform+0x107)[0x7f061b7d3dd7]
8:25:40 PM: /lib/libpulsar.so.2.10.3(+0x45e97f)[0x7f061b65e97f]
8:25:40 PM: /lib/x86_64-linux-gnu/libc.so.6(+0x99f68)[0x7f0628aa2f68]
8:25:40 PM: /lib/libpulsar.so.2.10.3(+0x45f24e)[0x7f061b65f24e]
8:25:40 PM: /lib/libpulsar.so.2.10.3(_ZN6pulsar10AuthOauth211getAuthDataERSt10shared_ptrINS_26AuthenticationDataProviderEE+0x33)[0x7f061b65d423]
8:25:40 PM: /lib/libpulsar.so.2.10.3(_ZN6pulsar16ClientConnectionC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_St10shared_ptrINS_15ExecutorServiceEERKNS_19ClientConfigurationERKS9_INS_14AuthenticationEE+0x11c7)[0x7f061b54c9d7]
8:25:40 PM: /lib/libpulsar.so.2.10.3(_ZN6pulsar14ConnectionPool18getConnectionAsyncERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_+0x94b)[0x7f061b59a24b]
8:25:40 PM: /lib/libpulsar.so.2.10.3(_ZN6pulsar24BinaryProtoLookupService25getPartitionMetadataAsyncERKSt10shared_ptrINS_9TopicNameEE+0x11e)[0x7f061b52f2fe]
8:25:40 PM: /lib/libpulsar.so.2.10.3(+0x378068)[0x7f061b578068]
8:25:40 PM: /lib/libpulsar.so.2.10.3(_ZN6pulsar6Client19createProducerAsyncERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_21ProducerConfigurationESt8functionIFvNS_6ResultENS_8ProducerEEE+0x45)[0x7f061b5380d5]
8:25:40 PM: /lib/libpulsar.so.2.10.3(pulsar_client_create_producer_async+0xcf)[0x7f061b65154f]
8:25:40 PM: /app/node_modules/.pnpm/pulsar-client@1.7.0/node_modules/pulsar-client/build/Release/Pulsar.node(_ZN8Producer11NewInstanceERKN4Napi12CallbackInfoESt10shared_ptrI14_pulsar_clientE+0x3c7)[0x7f062132a647]
8:25:40 PM: /app/node_modules/.pnpm/pulsar-client@1.7.0/node_modules/pulsar-client/build/Release/Pulsar.node(_ZN6Client14CreateProducerERKN4Napi12CallbackInfoE+0x50)[0x7f0621323090]
8:25:40 PM: /app/node_modules/.pnpm/pulsar-client@1.7.0/node_modules/pulsar-client/build/Release/Pulsar.node(_ZN4Napi12InstanceWrapI6ClientE29InstanceMethodCallbackWrapperEP10napi_env__P20napi_callback_info__+0x133)[0x7f0621325fe3]
8:25:40 PM: node[0xb1499d]
8:25:40 PM: node[0xda5fa0]
8:25:40 PM: node(_ZN2v88internal21Builtin_HandleApiCallEiPmPNS0_7IsolateE+0xaf)[0xda74df]
8:25:40 PM: node[0x16e9af9]
8:25:41 PM: undefined
8:25:41 PM: /app/apps/webapp:
8:25:41 PM:  ERR_PNPM_RECURSIVE_RUN_FIRST_FAIL  webapp@1.0.0 start: `cross-env NODE_ENV=production node --max-old-space-size=8192 ./build/server.js`
8:25:41 PM: Exit status 1

Interestingly, this looks related to another issue I ran into when deploying this project to AWS ECS in a Docker container, but with the Prisma library:

prisma/prisma#10649 (comment)

I actually managed to fix that issue by using sitespeedio/node:ubuntu-22.04-nodejs-18.12.1 as my Docker base image (changed from node:lts-bullseye-slim).

It seems there is an issue with Node.js 18+ because it bundles OpenSSL 3.0 and prisma was using the system OpenSSL, which is OpenSSL 1.1.x on node:lts-bullseye-slim but is OpenSSL 3.0 on Ubuntu 22.04.
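To see whether such a mismatch exists in a given image, the OpenSSL version bundled into Node.js can be compared with the one installed on the system. This is a rough sketch, assuming the `openssl` CLI is on the PATH (it returns null otherwise); the function names are illustrative.

```javascript
const { execFileSync } = require("child_process");

// OpenSSL version compiled into the running Node.js binary.
function bundledOpenSsl() {
  const version = process.versions.openssl;
  return { version, major: Number.parseInt(version.split(".")[0], 10) };
}

// Version string reported by the system `openssl` CLI, or null if the CLI is
// missing or fails to run.
function systemOpenSsl() {
  try {
    return execFileSync("openssl", ["version"], {
      encoding: "utf8",
      stdio: ["ignore", "pipe", "ignore"],
    }).trim();
  } catch (err) {
    return null;
  }
}

const bundled = bundledOpenSsl();
console.log(`Node.js ${process.version} bundles OpenSSL ${bundled.version}`);
console.log(`System OpenSSL: ${systemOpenSsl() ?? "not found"}`);
```

If the two major versions differ, any native library linked against the system OpenSSL is worth a closer look.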

Not sure if it'll help but here is the issue where nodejs discusses replacing OpenSSL 1.1 with 3.0: nodejs/node#40106

shibd commented

This issue still exists. It appears to happen on the Linux images for node:17, node:18, and node:19.

Using node:16 works.
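Until a fix is released, an application could warn at startup when it runs on one of the reported Node.js versions. The sketch below hard-codes the majors mentioned in this comment; that list is an assumption taken from this thread, not an official compatibility matrix, and the function names are illustrative.

```javascript
// Node.js majors reported in this thread as crashing on Linux.
// This is an assumption drawn from the report, not an official list.
const REPORTED_AFFECTED_MAJORS = [17, 18, 19];

// Extract the major version from a string such as "v18.12.1".
function nodeMajor(versionString) {
  return Number.parseInt(versionString.replace(/^v/, "").split(".")[0], 10);
}

// True when the given Node.js version and platform match the report.
function isReportedAffected(versionString, platform) {
  return (
    platform === "linux" &&
    REPORTED_AFFECTED_MAJORS.includes(nodeMajor(versionString))
  );
}

if (isReportedAffected(process.version, process.platform)) {
  console.warn(
    `Node.js ${process.version} on Linux was reported to segfault with ` +
      "pulsar-client; node:16 was reported to work."
  );
}
```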

> Not sure if it'll help but here is the issue where nodejs discusses replacing OpenSSL 1.1 with 3.0: nodejs/node#40106

It may indeed have something to do with this, and I will try to find the reason.

shibd commented

The root cause is an OpenSSL symbol conflict between Node.js and the C++ client.

#310 will fix it, and we will release v1.8.2 as soon as possible.

For now, if needed, you can temporarily use my release version: npm i shibaodi-pulsar-client@1.8.2-rc.1