Code taking too much time in a recursive loop
deryost opened this issue · 12 comments
I have a case where some of my code take a lot of time to execute and I can't figure what is happening.
Since I can't share the whole code, I made a standalone page that have the same problem :
import _ from "lodash";
import { memoize } from "proxy-memoize";
import * as React from "react";
interface INode {
id: string;
parentId: string;
}
interface IReduxState {
nodes: INode[];
}
// Create fake tree of nodes (781 nodes)
const mockedNodes = ((() => {
const rootNode: INode = { id: "root", parentId: null };
const createdNodes: INode[] = [rootNode];
const execute = (currentNodeId: string, level: number) => {
const childNodes = _.times(5, (n) => ({
id: currentNodeId + "_" + n,
parentId: currentNodeId,
} as INode));
createdNodes.push(...childNodes);
if (level < 3) _.forEach(childNodes, (n) => execute(n.id, level + 1));
};
execute(rootNode.id, 0);
return createdNodes;
})());
const getState = (): IReduxState => ({ nodes: mockedNodes });
const getNodes = (state: IReduxState): INode[] => state.nodes;
const getNodeTree = memoize(({ state, nodeId }: { state: IReduxState; nodeId: string; }) => {
const nodes = getNodes(state);
return findAllChildrenIds(nodes, nodeId, [nodeId]);
});
const findAllChildrenIds = (nodes: INode[], parentNodeId: string, parentBranchIds: string[]) => {
const execute = (currentNodeId: string) => {
const childrenIds = _.chain(nodes)
.filter(({ parentId }) => parentId == currentNodeId)
.map(({ id }) => id)
.value();
_.forEach(childrenIds, (childrenId) => childrenIds.push(...execute(childrenId)));
return childrenIds;
};
return [...parentBranchIds, ...execute(parentNodeId)];
};
// Test component to call the state
const Test = () => {
console.time("timer1");
getNodeTree({ state: getState(), nodeId: "root" }); // The first time it's called it take more than 1 second to execute
console.log("allNodes", _.size(getState().nodes));
console.timeEnd("timer1");
console.time("timer2");
getNodeTree({ state: getState(), nodeId: "root" });
console.timeEnd("timer2");
return <div>Test page</div>;
};
export default Test;
The first time the getNodeTree selector is called, it take more than one second, if I remove the memoize function, it take less than 10 milliseconds.
Can someone tell me why the memoize function is taking so much time to execute ?
I have the same issue!
@Aslemammad Would you be interested in looking into this?
I can work on this soon.
I looked into this issue. The reason is taking too long is that proxy-memoize is creating a proxy for each iterate, which is completely expected to keep track of things, and that's the purpose of proxy-memoize. Unfortunately, I couldn't find a way to improve the performance of this issue.
I found a way to 2x the performance by returning caches faster, but that solution was going to cause many bugs anyway and the current caching strategy is the proper one.
If anyone has any ideas, please don't hesitate to let us know.
@Aslemammad Thanks for your investigation! That sounds reasonable.
@deryost No solution, but a workaround for this case would be to reduce accessing the node objects:
Please try this:
const findAllChildrenIds = (
nodes: INode[],
parentNodeId: string,
parentBranchIds: string[]
) => {
const childrenMap = new Map<string, Set<string>>();
const getChildrenSet = (nodeId: string) =>
(childrenMap.get(nodeId) ||
childrenMap.set(nodeId, new Set()).get(nodeId))!;
nodes.forEach((node) => {
if (node.parentId) {
const childrenSet = getChildrenSet(node.parentId);
childrenSet.add(node.id);
}
});
const idSet = new Set<string>();
const execute = (currentNodeId: string) => {
getChildrenSet(currentNodeId).forEach((id) => {
idSet.add(id);
execute(id);
});
};
execute(parentNodeId);
return [...parentBranchIds, ...idSet];
};
@dai-shi and @Aslemammad, thank you for taking the time to assist me.
I have actually found an alternative solution to my issue. In my case, I have applied a proxy-memoize wrapper around the "getNodes" selector, which effectively solves the problem. However, I would like to address the fact that our codebase involves multiple developers, and such intricacies can be challenging to comprehend and troubleshoot. Consequently, I have been gradually moving away from using "reselect" for similar reasons. While this particular scenario appears to be an isolated case, I will not be overly concerned about it at present.
I am curious if there is a way to debug the "proxy-memoize" function to identify and resolve such issues more easily? Your insights on this matter would be greatly appreciated.
@dai-shi We may be able to create a debug uilitiy (debug: true
) that logs each time something happens (like when a proxy gets created, ...) and shows where's the path to that object exactly. log: a.b.c got proxied
something like that.
@Aslemammad We need to be careful to add debug functionality without adding bytes to production build. But, I'm open for suggestions. So far, proxy-compare exposes affectedToPathList for debugging, which isn't used in proxy-memoize.
if there is a way to debug the "proxy-memoize" function to identify and resolve such issues more easily?
@deryost I also wonder what debugging information would be good. Does a.b.c god proxied
help? Maybe it does, because @Aslemammad found it out probably with it.
I'll will try the "affectedToPathList" function as soon as possible and will come back with my results.
Well I really don't know how to use the "affectedToPathList". It need a "affected" parameters and I don't know what to pass. Can I pass it myself ?
Sorry for the confusion. It's an API from proxy-compare but that can be hidden in proxy-memoize.
What I mean was we need to fix/modify proxy-memoize to make it available.
If the slowness is because of creating proxies, it's the design limitation.
Feel free to investigate more and look forward to any findings.
Closing as stale.