dai-shi/proxy-memoize

React Redux example doesn't seem to make sense in a real-world context

lostfictions opened this issue · 19 comments

Hi there, I'm trying to understand this example:

import { useSelector } from 'react-redux';

const getScore = memoize(state => ({
  score: heavyComputation(state.a + state.b),
  createdAt: Date.now(),
}));

const Component = ({ id }) => {
  const { score, title } = useSelector(useCallback(memoize(state => ({
    score: getScore(state),
    title: state.titles[id],
  })), [id]));
  return <div>{score.score} {score.createdAt} {title}</div>;
};

I realize the example is more about how this library can interoperate with React Redux than about giving a practical scenario, but it doesn't really help me to understand how you would use proxy-memoize to create create a memoized calculation based on both the Redux store's state and the component's props. The way it's written above, useCallback will simply tear down and rebuild the memoized function whenever id changes, so there will be never any opportunity to hit the cache in this scenario.

A simpler and more efficient way of formulating the above is this:

const Component = ({ id }) => {
  const score = useSelector(getScore);
  const title = useSelector(state => state.titles[id]);
  return <div>{score.score} {score.createdAt} {title}</div>;
};

If you were trying to actually calculate expensive derived data based on both state and props with this library, how are you supposed you do it?

A simpler and more efficient way of formulating the above is this

You are absolutely right about this. Hmm, the example is too artificial..

If you were trying to actually calculate expensive derived data based on both state and props with this library, how are you supposed you do it?

That's exactly what I tried to show in the example.

const Component = ({ id }) => {
  const { score, title } = useSelector(useCallback(memoize(state => ({
    score: getScore(state),
    title: state.titles[id],
  })), [id]));
  return <div>{score.score} {score.createdAt} {title}</div>;
};

getScore is common part across components and the function in useCallback is specific to that component.

Does it make sense? sorry, if I misunderstand your point.

Thanks for the quick response! I guess one way to try to explain the thing I'm asking about is if we tried to do this:

const Component = ({ id }) => {
  const { score, title } = useSelector(useCallback(memoize(
    state => ({
      score: getScore(state),
      title: doExpensiveWork(state.titles[id]),
    },
    { size: 5 }
  )), [id]));
  return <div>{score.score} {score.createdAt} {title}</div>;
};

Now, even though I've increased the cache size, if I invoke <Component /> and pass in a prop that alternates between two values, I will never receive a memoized result for title. The "cardinality" of the memoization is only for the selector state, not the props. This is a problem that re-reselect attempts to resolve, for example. Maybe this is outside the scope of what this library attempts to do, but in practice I feel like it's not that rare to get a derived state from both redux/selector state and props.

Let me ask for clarification. doExpensiveWork(state.titles[id]) Do we require to invoke for every ids? With this style, we need to. If it can do the expensive work only on state without props, we could split the function. What would the function implementation look like? That would help us reach some useful conclusion.

You're right, id is not really a useful reference point here -- presumably it's something unique to each item rendered and relatively stable.

Maybe a more useful example would be something like this: we're rendering a markdown document to HTML (the expensive work). The markdown source lives in state, but we also have a piece of information that comes from the component props (say, a string to be highlighted in the HTML output by the markdown renderer). Let's say the source document is a relatively stable value, and the string to highlight typically only varies between a few different values. If both these pieces of information were sourced from the state, we could set the memoize cache size and benefit from it. But when the dependent values come from different sources, it's less clear to me how to proceed in this scenario. Does that make a bit more sense? With reselect/re-reselect, the selectors take an additional parameter which could be props; with proxy-memoize, only a single parameter seems to be accepted, so on a practical level it's not clear whether there's a way to express this (beyond the technique above of closing over the prop value(s) with useCallback or a React memo call, etc, which again mean they don't share a common memoization/caching mechanism).

Assuming highlighting applies to the HTML output, not markdown source. (because otherwise, the expensive work needs to be done every time hightlighting changes.)

const getHTML = memoize((markdown) => expensiveWork(markdown));

const getHighlightedHTML = memoize(([state, props]) => {
  const html = getHTML(state.markdown); // only re-evaluated when state.markdown changes
  return highlightHTML(html, props.highlightKeyword);
});

So, we solve this kind of use cases by nesting memoize. useCallback is just a way to follow this pattern.

Ah, this makes a lot more sense to me now! Thanks for taking the time to explain it.

While you are here, would you be interested in opening a PR to improve README? I don't have much background on the real use cases of reselect, and you could help on it.

Yes, I think I would be up for this! I may spend a bit of time trying proxy-memoize in my project first to make sure I better understand how to use it idiomatically. It's been a while since I've used reselect myself.

@dai-shi Thank you very much for the work done!
This is very valuable and can indeed replace the very outdated reselect approach!!

Hi @lostfictions ! Do you have a time to work with this?
I mean adding some helpful examples to compare with reselect / re-reselect?

For me, the example in the Readme looks very frightening. In the community, no one does react like that, so it can scare off new users of the library, since they will consider it as the only possible use case.

From the point of view of Reselect and Re-reselect - the most convenient to use with React and component properties - was the ability to move selectors logic away from component.

In the example in the Read.me, you described the selectors inside the component - it doesn't look very good and bloats the component files.
Therefore, it will be very cool to remove this example and show that we can pass not only one State, but also other values -id, etc. ​​(as you indicated - using an array) and describe the logic of the selectors outside the react components

@vlanemcev Thanks for coming.
Can you elaborate your concern a bit more? I think I almost get it, but it would be nice to hear more.
If you are talking about the "Usage with React Redux" example snippet,
I agree it doesn't represent real use cases. It rather shows edge case. Yeah, we should improve README.
But, if it depends on a prop, useCallback is handy. Otherwise, we'd need a Map and garbage collect by ourselves.
Would you be willing to provide a rough idea/code snippet? We can then polish or discuss based on it.

@dai-shi
Yeah, sure

Cache Size and Selector Instance Reuse

Heres example from react-redux useSelector() hook docs:

import React from 'react'
import { useSelector } from 'react-redux'
import { createSelector } from 'reselect'

const selectCompletedTodosCount = createSelector(
  (state) => state.todos,
  (_, completed) => completed,
  (todos, completed) =>
    todos.filter((todo) => todo.completed === completed).length
)

export const CompletedTodosCount = ({ completed }) => {
  const matchingCount = useSelector((state) =>
    selectCompletedTodosCount(state, completed)
  )

  return <div>{matchingCount}</div>
}

export const App = () => {
  return (
    <>
      <span>Number of done todos:</span>
      <CompletedTodosCount completed={true} />
    </>
  )
}

This example shows how easy we are to pass an additional argument for our selector, which is obtained from the properties of the component. In most cases, this is the most common use case for selectors - we need to get some data by ID or another additional property, that doesn’t present in our Redux Store.

In the READ.me of this project, a very heavy example is shown, which will seem very cumbersome to users of reselect.
This is because people are used to describing the logic of selectors next to the reducers and actions themselves:

export const getTeamPlayerSelector = createSelector(
  (_: RootState, id: ITeam['id'] | undefined) => id,
  getPlayersSelector, // will return array with all players
  (id, players) => players.filter((player) => player.profile.team_id === id),
);

And then in the Component we will use something like this (custom example):

const TeamPlayersList: TeamPlayersListComponentType = ({ t, navigation }) => {
  const { data, isLoading, isOnline, isRefreshing, refreshResources } = useReduxFetch(
    (state) => getTeamPlayersSelector(state, navigation.getParam('id')),   ---> this is selector function 
    fetchTeamPlayers,
    navigation.getParam('id'),
  );

....
}

In your example above https://kind./github.com/dai-shi/proxy-memoize/issues/14#issuecomment-783050477
I saw something similar to this - we passed props to our memo function.
This is not very obvious and there was no such example in the documentation, so I think it is worth adding an example of this

Composing Selectors

Another not valid example in real life:

image

In reselect we really love composing of selectors. It is because we want to re-use our selectors in any reasons.
We just want to use our selectors as reusable data getters. We want to create some kind of specified selectors that will depend on some basic ones in which the transformation is done.

Therefore, the example above is not very good because it seems that we cannot use composite selectors anymore and must write everything directly in the selector functions every time.

But as far as I understood from your message above, this is not a problem. We can use nested memoization.

Debugging Selector Recalculations / Relationships and nesting

What do you think - is there any chance of doing something like these packages?

I also wanted to ask - do you have any ideas how we can compare memoization using proxy-memoize and reselect / re-reselect on some data ?

Thanks for those clarifications.
Let me start with one.

In reselect we really love composing of selectors. It is because we want to re-use our selectors in any reasons.

If the reason is to organize selectors, it's totally fine.

const subtotalSelector = state => state.shop.items.reduce((acc, item) => acc + item.value, 0);

const taxSelector = state => {
  const subtotal = subtotalSelector(state);
  const taxPercent = state.shop.taxPercent;
  return subtotal * (taxPercent / 100),
};

export const totalSelector = memoize(state => {
  const subtotal = subtotalSelector(state);
  const tax = tasSelector(state)
  return { total: subtotal + tax };
);

This works.

You might ask this is duplicating calculation.
Yes, it is. But, the exported totalSelector is optimized. This is probably counter-intuitive to reselect users.

You could nest memoize.

const subtotalSelector = memoize(state => state.shop.items.reduce((acc, item) => acc + item.value, 0));

const taxSelector = memoize(state => {
  const subtotal = subtotalSelector(state);
  const taxPercent = state.shop.taxPercent;
  return subtotal * (taxPercent / 100),
});

export const totalSelector = memoize(state => {
  const subtotal = subtotalSelector(state);
  const tax = tasSelector(state)
  return { total: subtotal + tax };
);

Unless the internal selectors do heavy computation (in these cases, it depends on items size),
this is not recommended for performance. Again, this might be counter-intuitive.

So, what do you think?
Should we show the first example instead of the one in README, or the second one?

Do people care performance or organizing code?

This is the hard part of this library. It replaces reselect use cases in React, but the mental model has to be changed. (Or throw away the model of using memoization to get stable object references.)

@dai-shi Thank you very much for you explanations!

Should we show the first example instead of the one in README, or the second one?

I think we should show 2 examples that you described above with some explanation too.
This is because it will be developers responsibility to understand how to use selectors.

For example if we know that we will use this selector anywhere else

const subtotalSelector = memoize(state => state.shop.items.reduce((acc, item) => acc + item.value, 0));

we should wrap it in memoize() to have better performance and take care about our re-renders.

Can you please tell us what performance problems can be if we use nested memoize functions??

Can we disassemble the principle of operation of memoize () in detail, line by line, in order to understand how it will work “under the hood”? I mean - what happens if we use nested memoization - how will it work?

For example for this selectors:

const subtotalSelector = memoize(state => state.shop.items.reduce((acc, item) => acc + item.value, 0));

const taxSelector = memoize(state => {
  const subtotal = subtotalSelector(state);
  const taxPercent = state.shop.taxPercent;
  return subtotal * (taxPercent / 100),
});

export const totalSelector = memoize(state => {
  const subtotal = subtotalSelector(state);
  const tax = tasSelector(state)
  return { total: subtotal + tax };
);

@dai-shi Btw, I see a lot of potential in this library. It just needs to be promoted at a more formal level.
We need to go back to the Discussions post on the next development of re-select and ask Mark to add it to the official Redux documentation.

It just needs to be promoted at a more formal level.

Agreed. I appreciate for your help.

we should wrap it in memoize() to have better performance and take care about our re-renders.

This is the misconception if I understand what you are saying.
To just avoid re-renders in React, you don't need to wrap it in memoize().
If you do, it's an overhead.

Can we disassemble the principle of operation of memoize () in detail, line by line, in order to understand how it will work “under the hood”?

Let's start with simpler one.

const sumSelector = memoize((state) => {
  const x = { sum: state.a + state.b };
  const y = { sum: state.obj1.c + state.obj2.d };
  return { sum: x.sum + y.sum };
})

When you run sumSelector(...) for the first time.
It invokes the function, records tracking info and returns the result. It saves the input state and the result too.
In this case, the tracking info is like .a, .b, .obj1.c and .obj2.d.
Then, you run sumSelector(...) again.
Now, it compares two states, the previous one and the current one.
But it compares only the paths in the tracking info.
If it finds at least one change in a path, it will re-invoke the function and do the same.
Otherwise, it returns the previous result.

Please note that in this scenario, we didn't discuss object ref identity at all.
It doesn't matter how many new objects are created inside the function.

If this is clear enough, I can explain the nested scenario. But, I'd rather stop here and wait for questions.

@dai-shi Thanks for explanation!

Yeah, the example above is enough for understanding, could you please continue to describe nested example and explain why will be performance issues with nested memoize?

Okay, so let's see the example without nested memoize.

const subtotalSelector = state => state.shop.items.reduce((acc, item) => acc + item.value, 0);

const taxSelector = state => {
  const subtotal = subtotalSelector(state);
  const taxPercent = state.shop.taxPercent;
  return subtotal * (taxPercent / 100),
};

export const totalSelector = memoize(state => {
  const subtotal = subtotalSelector(state);
  const tax = taxSelector(state)
  return { total: subtotal + tax };
);

If you run totalSelector({ shop: { items: [{ id: 1, value: 100 }, { id: 2, value: 200 }], taxPercent: 10 } }),
the tracking info will be .shop.items.length, .shop.items[0].value, .shop.items[1].value, .shop.taxPercent. (it may include some more for array.prototype.)
The behavior is just the same as the previous example.
So, whether totalSelector re-evaluates the function does not depend on if totalSelector is composed from smaller pieces or just a big one. It's totally up to developers.

Now, if we have nested memoize, we need something tricky.

const subtotalSelector = memoize(...);

const taxSelector = memoize(...);

export const totalSelector = memoize(state => {
  const subtotal = subtotalSelector(state);
  const tax = taxSelector(state)
  return { total: subtotal + tax };
);

Now, what you pass to subtotalSelector is a proxy object, because that's how it collects tracking info.
So, if you have nested memoize, you will wrap proxies over and over, like proxy(proxy(state))).
This can't be avoided because we need to collect tracking info for subtotalSelector, taxSelector and totalSelector separately.
What's more difficult is when you re-evaluate.
Suppose we secondly invoke totalSelector({ shop: { items: [{ id: 1, value: 100 }, { id: 2, value: 200 }], taxPercent: 11 } }).
We only change the taxPercent.
From totalSelector point of view, because taxPercent is changed, it will re-evaluate the function.
taxSelector will also re-evaluate, so not a big deal.
subtotalSelector will not re-evaluate the function, because nothing in the tracking info is changed.
So, it will return a cached result.
Now, because subtotalSelector doesn't run the function, from totalSelector, it is just NO-OP.
This means, without any treatment, the tracking info after the second run becomes .shop.taxPercent only.
This is incorrect and if we run a new state with different shop.items, subtotalSelector doesn't run the function because the previous run was NO-OP.
To avoid this, we have a special treatment, which is if the inner memoize returns a cached result, it will also propagate the previous tracking info to the outer memoize, all recursively.

That said, this tracking info propagation and nested proxies should be very lightweight.
And, this library does support nested memoize scenario. But, it doesn't come with zero cost.
Okay, I think this isn't going to be noticeable cost.
So, I'm okay to change README to add nested memoize use case for reselect users.

Please feel free to ask questions for clarification.


btw, i don't know whether the first example without nesting or the second one with nesting is more performant. it's running array.reduce twice vs. wrapping proxies twice. if items are large the second one wins, if the items are just a few, the first one may win.

I assume this issue is not very active now.
While the issue is not solved, let's continue with new issues. Just created two:

Please feel free to open new issues for questions and suggestions.