data-forge/data-forge-ts

merge() results in concat if index is a Date object

syldman opened this issue · 3 comments

Merging dataframes indexed by Date object results in concat, since comparing two Date objects requires getTime()
(yet, pivot() on the Date object column works well)

const {DataFrame} = require('data-forge');

const dfA = new DataFrame({
    columnNames: ["Col1", "Col2", "date", "dateString"],
    rows: [
        [1, 'hello', new Date(...('2023-06-01'.split('-'))), '2023-06-01'],
        [5, 'computer', new Date(...('2023-06-02'.split('-'))), '2023-06-02'],
        [10, 'good day', new Date(...('2023-06-03'.split('-'))), '2023-06-03'],
    ]
});

const dfB = new DataFrame({
    columnNames: ["Col1", "Col2", "date", "dateString"],
    rows: [
        [1, 'hello', new Date(...('2023-06-01'.split('-'))), '2023-06-01'],
        [5, 'computer', new Date(...('2023-06-02'.split('-'))), '2023-06-02'],
        [10, 'good day', new Date(...('2023-06-03'.split('-'))), '2023-06-03'],
    ]
});

const dfMergedDate = dfA.setIndex('date').merge(dfB.setIndex('date'))

console.log('dfMergedDate', dfMergedDate.toPairs().length)
// 6; comparing two Date objects

const dfMergedDateString = dfA.setIndex('dateString').merge(dfB.setIndex('dateString'))

console.log('dfMergedDateString', dfMergedDateString.toPairs().length)
// 3; comparing two strings

Hi @syldman, thanks for your feedback and the detailed example code.

I realised that there was actually no test case for the example you describe!

So I added that and you can see it here:
b2b4fd5

I was trying to figure out why my test works and your example doesn't.

Then I realised that if I create separate Date objects (at the same date/time) they don't compare!

Wow, all these years I just have never seen this problem before in Data-Forge.

I have a failing test now though so hopefully I'll have this fixed soon.

I got a fix published. It's in Data-Forge 1.10.1.

I ended up calling toString on each index value used in the merge function, that allows Date objects to be compared!

JavaScript is pretty weird sometime.

You can see the fix in this commit: 7a061a8

Please reopen this issue if the fix doesn't work for you.

Hi @ashleydavis. Thanks for the really quick fix!
Actually I've ended up using date string for the index. Do you see a real need to allow non-string, non-number indexes? Since indexes are only used for the lookup, not allowing non-string/non-number indexes might actually speed up things, as you could drop toString from those operations.