misoproject/dataset

Types and Strings

Closed this issue · 4 comments

There is a lot of confusion surrounding Miso.Types. For instance:

"1" is a Number and not a String. If the intention here is to allow Numbers and Booleans also be Strings, that's not going to work as expected. For instance the following code was brought up by pmjs1 in #misoproject:

var ds = new Miso.Dataset({
  data: [
    { one : 1, two : 4, three : 7 },
    { one : 2, two : 5, three : 8 }
  ]
});

ds.fetch.then(function() {
  // Create update attrs object
  var a = { fours: "" };

  // Add a new column which is of type string
  this.addColumn({
    type: 'string',
    name: 'fours'
  });

  this.update(function(row) {
    a.fours = row.one.toString();
    return true;
  }, a);
});

The error message that is reported:

Uncaught incorrect value '1' of type number passed to column with type string

is triggered, because "1" is recognized as a number in your Miso.typeOf check, which resides inside the update function.

My patch to him was to change the test function for number to:

test: function(v) {
  return typeof v === 'number';
}

I humbly suggest changing both boolean and number types to respect their native JavaScript types. This will prevent the above kind of issues where it's impossible to insert a String value if it can be regex tested to a Number.

I'm also super confused about the test functions for each type. I see the following (specifically string here):

test: function(v) {
  return (v === null || typeof v === "undefined" || typeof v === 'string');
}

When you test v === null || ... if the value is null it's going to return true for that type. Is this expected behavior?

iros commented

Interesting problem guys. So a few thoughts:

  1. The test methods are used to detect the implicit type of something that is most likely coming in as a string (from a csv, for example.) This is why we are using regular expressions as well as type checking.
  2. The reason we allow nulls through is because we want to be able to allow you to have empty values in a column while still maintaining its type.
  3. By restricting one's ability to insert the wrong data type, we are hopefully maintaining the consistency of the data. Why would you want to insert a "1" string into a string column? If all you're inserting are numbers, than it should be a "number" type.
  4. This does raise the question of whether newProperties on update should support it being a function.
    @alexgraul? thoughts?
iros commented

Promoted ticket #105.

I'm wondering if there is another way of determining null checks without doing it inside the actual type check. Basically doing the null check outside of the test function where its necessary (for maintaining null column values).

I think the issue is more than "why would you want to", versus what could easily happen based off input coming from a third-party location. What happens if you get a tweet text where the only value is "true" or "555555" this shouldn't break the column type String.