standard-things/esm

Empty import when using esm with sqreen

eole1712 opened this issue ยท 13 comments

Hi again,

Sorry about yesterday's issue, I was really exhausted from trying to isolate my problem and one wrong lead was enough for me to run into it.

I have about the same problem as yesterday, except this time I know that the bug comes from the combination of esm and sqreen.

I start by creating a file that exports anything, like a string :

text.js:

module.exports = "HELLO WORLD";

Then I import it in two different files, one with import :
a.js

import text from "./text";

export default text;

and the second one with the transpiled code form babeljs.io/repl :
b.js

"use strict";

Object.defineProperty(exports, "__esModule", {
  value: true
});

var _text = require("./text.js");

var _text2 = _interopRequireDefault(_text);

function _interopRequireDefault(obj) {
  return obj && obj.__esModule ? obj : { default: obj };
}

exports.default = _text2.default;

source

Then I import both of these files in my index.js (b.js then a.js):

index.js

import b from "./b";
import a from "./a";

console.log(a, b);

If I execute this code just with node -r esm , the output is : HELLO WORLD HELLO WORLD
But if I use node -r esm -r sqreen, the output will be HELLO WORLD {}

If I reverse the import order of the modules (a before b), it works as expected.

Like yesterday, I updated a rest with this isolated case to make it easier for you to reproduce the bug :
https://github.com/eole1712/esm-sqreen-require-bug

I know the problem can totally come from sqreen, but I thought you might have an idea where it could come from and as it's really blocking for me for the past few days I've been coming back to you.

It looks like b is being turned into the source of cycle (still digging)

Update:

The sqreen docs state it must be loaded first or odd things could happen. I'm guessing this may be one of those odd things.

@eole1712 Would you be up for creating a PR to add a note to the tips section of the README to load sqreen first?

@jdalton
Right now I have a crash the other way around too but I haven't been able to isolate the problem yet.
I'm going to investigate this weekend and will do the PR if everything is fine.

Thanks for digging

I'm going to investigate this weekend and will do the PR if everything is fine.

No worries, I'll add the note for now. I'm cc'ing in sqreen devs for visibility in case they have any additional insights or info @aviat, @Ningirsu, @pbetouin, @vdeturckheim, @zetaben.

Hey, thanks for the ping @jdalton . I am looking at this too. I have a few pointers already but nothing I can really share atm.

It seems that when using esm, method Module._load is not called anymore right? In that case it seems that APMs might have issues working properly when esm is present too.

e.g. https://github.com/newrelic/node-newrelic/blob/6b452832255f52b182082a73c03b9b1ef2f5ecbc/lib/shimmer.js#L322
https://github.com/elastic/apm-agent-nodejs/blob/e2a80c14c6bd2a748dbb3fa6b0e394c3adee1d70/lib/instrumentation/index.js#L52 (with https://github.com/opbeat/require-in-the-middle/blob/master/index.js#L24)

Hi @vdeturckheim!

Thanks for checking this out. With CJS interoperability enabled (the default) we still call those Module methods. The esm loader also has it's own versions of them which makes me think we're probably holding on to a pre-wrapped form of one of them whenever esm is loaded first and then sqreen is applied.

So, I cloned @eole1712 's repo and adder a runner.js file (to start the app withour using -rs).

// Set options as a parameter, environment variable, or rc file.
require = require("esm")(module/*, options*/);
const Module = require('module');
const load = Module._load;

Module._load = function (request, parent, isMain) {
    const filename = Module._resolveFilename(request, parent, isMain);

    const cachedModule = Module._cache[filename];
    if (cachedModule) {
        return cachedModule.exports;
    }
    return load.apply(this, arguments);
}

module.exports = require("./index.js")

(that's how I did in sqreen based on https://github.com/nodejs/node/blob/v8.0.0/lib/module.js#L432 which changed a bit since then).
This reproduces the issue:

$ node runner.js 
HELLO WORLD {}

Removing the lines corresponding to the _cache seems to fix the issue but at this point I'm not certain why yet.

@eole1712, regarding the issue that happens when sqreen is loaded first, I'm still not certain where it comes from but given the stacktrace you sent me, I should be able to dig from it :-)

side note @jdalton I'm not sure I fully got your last comment, overriding Module._load with

Module._load = function (request, parent, isMain) {
    console.log(request, parent.filename);
    return load.apply(this, arguments);
};

(added after esm is required) gives me the following log:

$ node runner.js 
./text.js /Users/vladimir/tmp/esm-sqreen-require-bug/b.js
HELLO WORLD HELLO WORLD

meaning that the import of './textfroma.js` is missed.

@eole1712 The esm loader has two ways to be enabled. One via node -r esm and the other inside a file via require("esm")(module).

When loaded via node -r esm the esm loader is more aggressive hooking into the Node module system. When used via require("esm")(module) we run through our own pipeline. This is so we can be used in production code (as a direct dependency on packages) without messing with global hooks.

Update:

Ok, I found the issue. I think it's an order of operations problem.

When sqreen/lib/instrumentation/moduleHijacker.js checks for the module cache entry it's finding it in its preloaded state and returning it as loaded which is why the exports is an empty object {}. I'll see if I can tweak things on our end before tweaking things on sqreen.

Update:

We pre-cache the initialized CJS module during the parse and validation phase of the ESM module. So when it enters the Module._load method that is hijacked it is already in the cache... In the usual Node scenario the module hasn't been created yet when calling Module._load so it's a fair assumption here. I'll see what we can do to meet the Node assumption here. I'll likely move the initialed module to our internal module cache and then expose during the appropriate phase.

@vdeturckheim I tried again yesterday, I can't isolate the problem when I import sqreen before esm. I'm still digging... ๐Ÿ•ณ

@jdalton Good news If you think you found the problem ! ๐ŸŽ‰
I still think it makes more sense to import sqreen after esm, just like newrelic for example.

@eole1712 No need to dig. I'm pretty sure I have it. I'm just working on a fix now.

Good news! I have a fix locally. I'll clean it up tonight and bump this evening or so.

Update:

Patch 11dc303; Test 4bf77c9

Awesome @jdalton . I'm thinking of how I could add esm to my tests to keep an eye on this in the future.

v3.0.19 is released ๐ŸŽ‰

@vdeturckheim

Let me know if you ever need any esm support work for sqreen.
I'm totally happy to dig in and workaround any issues you may have.

@jdalton Thanks a lot for your great work here. I really appreciate that!
I just DMed you on twitter with a follow up topic.