The new asyncMemoize function should memoize the original promise
Closed this issue · 4 comments
Modification Proposal
The new asyncMemoize
function introduced in v12.0.1 (#760) waits until the promise returned by the memoized function is resolved before "activating" the cache.
Relevant snippet:
result = await /** @type {function(): any} */ (fn)();
cache = true;
The problem is that if the function returned by asyncMemoize
is called again before the first function resolves (or rejects), the memoized function is executed a second time.
In my opinion, asyncMemoize
should do exactly what memoize
does when memoizing an async function: The returned function returns the promise that the memoized function returns.
I guess asyncMemoize
exists because of TypeScript? I don't use TypeScript, so I don't know what is being solved by having a dedicated function to memoize async functions instead of just using memoize
.
Expected Behavior / Situation
A function returned by asyncMemoize
should not execute the memoized function more than once. Subsequent executions of the returned function should return the value that the memoized function returned the first time it was executed.
Actual Behavior / Situation
A function returned by asyncMemoize
can execute the memoized function a second time (or more) if the returned function is executed a second time when the promise returned by the memoized function is still pending.
Please paste the results of npx webpack-cli info
here, and mention other relevant information
This is not a problem yet, but it could be, so better fix it before it becomes a problem.
Sorry, I think you're a little mistaken, look at code:
const getGlobby = asyncMemoize(async () => {
// @ts-ignore
const { globby } = await import("globby");
return globby;
});
/**
* @template T
* @param fn {(function(): any) | undefined}
* @returns {function(): Promise<T>}
*/
function asyncMemoize(fn) {
let cache = false;
/** @type {T} */
let result;
return async () => {
if (cache) {
return result;
}
result = await /** @type {function(): any} */ (fn)();
cache = true;
// Allow to clean up memory for fn
// and all dependent resources
// eslint-disable-next-line no-undefined, no-param-reassign
fn = undefined;
return result;
};
}
When I call getGlobby()
:
- We don't have
cache
, so we run memorized function (usingawait
), i.e. where we are havingconst { globby } = await import("globby");
- After primise was resolved we have
cache = globby
;, i.e. we cachedglobby
function - But because we have
return async () => { /* logic */ }
we are returningPromise
and we needawait
it
So we don't run import("globby")
multiple time, only once.
A function returned by asyncMemoize should not execute the memoized function more than once.
memoize
works the same, every call getSomething()
calls returned function and check cache, no cache - run memorized function.
Subsequent executions of the returned function should return the value that the memoized function returned the first time it was executed.
We already do it, but because we need to make an async operation we can't just return globby
, only Promise<globby>
Why not just memoize
? Because in such case we will return memorized function and every await
will run code with import(...)
and there's no point in doing this, you can just write await import("globby")
Anyway feel free to feedback
Ok, let's run the asyncMemoize
globby example step by step:
getBlobby
is assigned a memoized function:const getGlobby = asyncMemoize(async () => { // @ts-ignore const { globby } = await import("globby"); return globby; });
- On this step, the memoized function has
cache === false
andresult === undefined
, andfn
hasn't been called yet.
- On this step, the memoized function has
getBlobby
is called and returns a PromisegetBlobby
seescache === false
, so it callsfn
.- The Promise returned by
getBlobby
is not fulfilled until the promise returned byfn
is fulfilled. cache
is stillfalse
, asgetBlobby
is awating the Promise returned byfn
- If at this point
getBlobby
is called again,fn
will be called again, becausecache === false
. - When the Promise returned by the first call to
fn
resolves,cache
is set totrue
,result
is set to the resolved value and the Promise returned by the first call togetBlobby
is resolved withresult
. - When the Promise returned by second the call to
fn
resolves,cache
is set totrue
(no changes),result
is set to the new resolved value and the Promise returned by the second call togetBlobby
is resolved with the newresult
. - From this point onwards, calls to
getBlobby
don't callfn
.The returned Promise is a new Promise each time, and the resolved value is one corresponding to the second call tofn
.
Now, let's run the memoize
(not async) globby example step by step:
getBlobby
is assigned a memoized function:const getGlobby = memoize(async () => { // @ts-ignore const { globby } = await import("globby"); return globby; });
- On this step, the memoized function has
cache === false
andresult === undefined
, andfn
hasn't been called yet.
- On this step, the memoized function has
getBlobby
is called and returns a PromisegetBlobby
seescache === false
, so it callsfn
.- Because there is no
await
,getBlobby
immediately setscache = true
andresult
is assigned the Promise returned byfn
. - The Promise returned by
getBlobby
is not fulfilled until the promise returned byfn
is fulfilled, because they are the same Promise.
- If at this point
getBlobby
is called again,fn
will not be called again, becausecache === true
.- The Promise returned by this call to
getBlobby
is the same promise returned by the first call togetBlobby
, which is the same Promise thatfn
returned.
- The Promise returned by this call to
- When the Promise returned by
fn
resolves, the Promises returned by the first and second calls togetBlobby
are resolved with the same value, because these three Promises are the same. - From this point onwards, calls to
getBlobby
still return the same (resolved) Promise thatfn
returned.
I think that this should be fixed mostly because IMO asyncMemoize
is not working as intended.
Also, Node already caches modules, so maybe the memoizing part should be removed entirely and just the lazy loading part should be kept. E.g.
const getGlobby = async () => {
// @ts-ignore
const { globby } = await import("globby");
return globby;
};
The Promise returned by getBlobby is not fulfilled until the promise returned by fn is fulfilled.
cache is still false, as getBlobby is awating the Promise returned by fn
We have result = await /** @type {function(): any} */ (fn)();
, so result
is globby
and cache
is true
and on the second run we return cache
Also, Node already caches modules, so maybe the memoizing part should be removed entirely and just the lazy loading part should be kept. E.g.
Yes, Node.js already caches modules, but caching on Node.js side is not fast as we want, just try it.
I think that this should be fixed mostly because IMO asyncMemoize is not working as intended.
It works, you are saying about caching different things, that is why you are thinking it doesn't work 😄
Anyway I made benchmark:
/**
* @template T
* @param fn {(function(): any) | undefined}
* @returns {function(): T}
*/
function memoize(fn) {
let cache = false;
/** @type {T} */
let result;
return () => {
if (cache) {
return result;
}
result = /** @type {function(): any} */ (fn)();
cache = true;
// Allow to clean up memory for fn
// and all dependent resources
// eslint-disable-next-line no-undefined, no-param-reassign
fn = undefined;
return result;
};
}
/**
* @template T
* @param fn {(function(): any) | undefined}
* @returns {function(): Promise<T>}
*/
function asyncMemoize(fn) {
let cache = false;
/** @type {T} */
let result;
return async () => {
if (cache) {
return result;
}
result = await /** @type {function(): any} */ (fn)();
cache = true;
// Allow to clean up memory for fn
// and all dependent resources
// eslint-disable-next-line no-undefined, no-param-reassign
fn = undefined;
return result;
};
}
const getGlobby = memoize(async () => {
// @ts-ignore
const { globby } = await import("globby");
return globby;
});
const getGlobbyAsync = asyncMemoize(async () => {
// @ts-ignore
const { globby } = await import("globby");
return globby;
});
const { Benchmark } = require("benchmark");
const suite = new Benchmark.Suite();
function p(fn) {
return {
defer: true,
async fn(deferred) {
await fn();
deferred.resolve();
},
};
}
// Warn up
(async function warnup() {
await import("globby");
})();
// add tests
suite
.add(
'await import("globby")',
p(async () => {
await import("globby");
}),
)
.add(
"await getGlobby()",
p(async () => {
await getGlobby();
}),
)
.add(
"await getGlobbyAsync()",
p(async () => {
await getGlobbyAsync();
}),
)
// add listeners
.on("cycle", (event) => {
console.log(String(event.target));
})
.on("complete", function () {
console.log(`Fastest is ${this.filter("fastest").map("name")}`);
})
// run async
.run({ async: true });
And got:
await import("globby") x 162,695 ops/sec ±2.66% (84 runs sampled)
await getGlobby() x 5,478,279 ops/sec ±0.79% (86 runs sampled)
await getGlobbyAsync() x 4,613,087 ops/sec ±0.72% (88 runs sampled)
Fastest is await getGlobby()
So caching the original promises is faster, I will fix it
So caching the original promises is faster, I will fix it
Thanks!
So the bug won't happen now, but just to be clear, the problem with asyncMemoize
happens when the memoized function is called a second time before the Promise returned by the first call resolves.
E.g.
const promise1 = getGlobbyAsync() // No await
// cache is still false at this point
const promise2 = getGlobbyAsync() // This calls import a second time
const [globby1, globby2] = await Promise.all(promise1, promise2)
// From here onwards cache is true, so getGlobbyAsync won't call import again