SebKrantz/collapse

Replace left table values by right table values in left-join on match

statquant opened this issue · 3 comments

Hello and happy new year.
In kdb a very useful feature of lj is that when a the right table keys are matching the left table the values of common columns are taken from the right table.
This feels quite 'natural' you essentially update.
While this is doable right now by keeping the 'match' column and overriding values (by the way it might be better to use TRUE/FALSE instead of character?) it implies some overhead.
Would you consider adding the feature?
Kind regards

Hi, can you provide an R example of this and indicate why it is useful. Currently all joins but the right join preserve the left tables id columns. Coercions to the largest common type if join columns have different types are done internally in fmatch(). I believe this is a quite natural setup....

D3SL commented

What statquant is describing is closer to an "upsert" than a true left join. The main use case is when you have multiple tables of overlapping data and want to overlay them in order to produce a single final product. It's an extremely useful operation when dealing with real world data that's come from multiple sources or was inconsistently maintained. In a way it's like a reverse fcoalesce that keeps the last rather than first value, and operates at a table level.

There are ways to approximate this in data.table, dplyr, and other libraries it's not exactly this and isn't quite clean or parsimonious.

Thanks. I think this would be implemented in a different function then. For the moment this is not a priority, so I can keep it open but don't think an implementation will follow in the near future.