aws/aws-pdk

[BUG] Multiple TypeSafeApiProject in same monorepo has blocking race condition

zsstiers opened this issue · 0 comments

Describe the bug

Placing multiple instances of TypeSafeApiProject in the same monorepo causes a race condition that intermittently fails the build. The cause appears to be overlapping attempts to install shared dependencies into the same directory.

Expected Behavior

Multiple instances of TypeSafeApiProject should not fail to build.

Current Behavior

apiB: package manager :: pnpm
apiB: parse-openapi-spec :: tmp_dir :: /tmp/parse-openapi-spec.MG3zxUCts
apiB: installing packages :: typescript@5.0.4 @types/node@20.1.5 ts-node@10.9.1 ts-command-line-args@2.4.2 @redocly/cli@1.0.0-beta.126 reregexp@1.6.1 @faker-js/faker@8.1.0 @openapitools/openapi-generator-cli@2.6.0 lodash@4.17.21 @types/lodash@4.14.197 @apidevtools/swagger-parser@10.1.0 openapi-types@12.1.0 projen@0.73.8
apiA: Wrote to /root/.pdk/0.22.44/type-safe-api/pnpm/package.json:
apiA: {
apiA:   "name": "pnpm",
apiA:   "version": "1.0.0",
apiA:   "description": "",
apiA:   "main": "index.js",
apiA:   "scripts": {
apiA:     "test": "echo \"Error: no test specified\" && exit 1"
apiA:   },
apiA:   "keywords": [],
apiA:   "author": "",
apiA:   "license": "ISC"
apiA: }
apiB: Wrote to /root/.pdk/0.22.44/type-safe-api/pnpm/package.json:
apiB: {
apiB:   "name": "pnpm",
apiB:   "version": "1.0.0",
apiB:   "main": "index.js",
apiB:   "scripts": {
apiB:     "test": "echo \"Error: no test specified\" && exit 1"
apiB:   },
apiB:   "keywords": [],
apiB:   "author": "",
apiB:   "license": "ISC",
apiB:   "description": ""
apiB: }

... bunch of removed lines of pnpm installation progress ...

apiA: node_modules/.pnpm/core-js@3.35.1/node_modules/core-js: Running postinstall script...
apiA: node_modules/.pnpm/@nestjs+core@9.3.11_@nestjs+common@9.3.11_reflect-metadata@0.1.13_rxjs@7.8.0/node_modules/@nestjs/core: Running postinstall script...
apiA: node_modules/.pnpm/core-js@3.35.1/node_modules/core-js: Running postinstall script, done in 148ms
apiA: WARN An error occurred while uploading /root/.pdk/0.22.44/type-safe-api/pnpm/node_modules/.pnpm/core-js@3.35.1/node_modules/core-js
apiA: node_modules/.pnpm/@nestjs+core@9.3.11_@nestjs+common@9.3.11_reflect-metadata@0.1.13_rxjs@7.8.0/node_modules/@nestjs/core: Running postinstall script, done in 532ms
apiA: node_modules/.pnpm/@openapitools+openapi-generator-cli@2.6.0/node_modules/@openapitools/openapi-generator-cli: Running postinstall script...
apiB: ERR_PNPM_LINKING_FAILED Error: ENOTEMPTY: directory not empty, rmdir '/root/.pdk/0.22.44/type-safe-api/pnpm/node_modules/.pnpm/core-js@3.35.1/node_modules/core-js'

Reproduction Steps

This does not reproduce consistently, but running a build for a workspace like the following can fail if the timing is unlucky.

import { javascript } from "projen";
import { monorepo } from "@aws/pdk";
import { Language, ModelLanguage, TypeSafeApiProject } from "@aws/pdk/type-safe-api";

const project = new monorepo.MonorepoTsProject({
  devDeps: ["@aws/pdk"],
  name: "foo",
  packageManager: javascript.NodePackageManager.PNPM,
  projenrcTs: true,
});

const apiOptions = {
  parent: project,
  model: {
    language: ModelLanguage.SMITHY,
    options: {
      smithy: {
        serviceName: {
          namespace: "fake",
          serviceName: "fake",
        }
      },
    },
  },
  infrastructure: {
    language: Language.TYPESCRIPT,
  },
};

new TypeSafeApiProject({
  ...apiOptions,
  name: "apiA",
  outdir: "packages/apiA",
});

new TypeSafeApiProject({
  ...apiOptions,
  name: "apiB",
  outdir: "packages/apiB",
});

project.synth();

Possible Solution

Based on the logs, I believe the reason for the issue is

not guarding against the potential for being run in parallel. What seems to be happening is that the package manager, in this case pnpm, is trying to perform two installs to the same location at the same time, which can produce errors about unexpected state.

I think this can be fixed by better guarding against the case where installations would otherwise run in parallel.
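
To illustrate the kind of guard I have in mind, here is a minimal sketch in TypeScript. It is not the PDK's actual implementation: `withInstallLock` and the lock path are hypothetical names, and the real install step may be better served by a different mechanism. The idea is to use an atomic directory creation as a cross-process lock, so that only one build at a time runs the install into the shared location while the others wait.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper (not part of the PDK): runs `action` while holding an
// exclusive lock directory. mkdirSync without `recursive` throws EEXIST if the
// directory already exists, so creating it acts as an atomic test-and-set
// that also works across processes.
function withInstallLock<T>(
  lockDir: string,
  action: () => T,
  timeoutMs: number = 10 * 60 * 1000,
  pollMs: number = 200,
): T {
  const deadline = Date.now() + timeoutMs;
  // Make sure the parent directory exists before trying to take the lock.
  fs.mkdirSync(path.dirname(lockDir), { recursive: true });

  for (;;) {
    try {
      fs.mkdirSync(lockDir); // succeeds for exactly one caller
      break;
    } catch (err) {
      if ((err as { code?: string }).code !== "EEXIST") {
        throw err;
      }
      if (Date.now() >= deadline) {
        throw new Error(`Timed out waiting for lock ${lockDir}`);
      }
      // Block this thread for pollMs without spinning the CPU.
      Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, pollMs);
    }
  }

  try {
    return action();
  } finally {
    // Release the lock even if the action throws.
    fs.rmdirSync(lockDir);
  }
}

// Usage sketch: the real install command would go inside the callback.
// The lock path here is an assumption chosen for the demo.
const demoLock = path.join(os.tmpdir(), `pdk-install-demo-${process.pid}.lock`);
const demoResult = withInstallLock(demoLock, () => "installed");
console.log(demoResult);
```

One caveat with this approach: if a process is killed while holding the lock, the directory is left behind and later builds wait until the timeout, so a real fix would likely also need stale-lock detection or an install into a per-process temporary directory followed by an atomic rename.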

Additional Information/Context

This issue was first observed in AWS CodeBuild. We have no reason to believe that CodeBuild itself is related to the problem; rather, developer machines are more likely to already have the files cached, so they almost never hit the non-cached code path, which is where the failure can occur.

PDK version used

0.22.44

What languages are you seeing this issue on?

TypeScript

Environment details (OS name and version, etc.)

aws/codebuild/standard:7.0