joequery/Stupid-Table-Plugin

weird behavior when td have empty values

Closed this issue · 13 comments

est commented

Hi,

I have a table where some tr have empty td, when sorting as int or float, can we put a default value of 0 instead of NaN in Javascript?

Change this

"float": function(a, b) {
  return parseFloat(a) - parseFloat(b);
},

into this

"float": function(a, b) {
  return (parseFloat(a)||0) - (parseFloat(b)||0);
},

Empty td will fuck up normal number td sorting. Because NaN - 1 ==== NaN in Javascript. But in table sorting it's better if we think 1 is greater than NaN

@est, zero doesn't have the same semantic meaning as null / NaN / blank values, except perhaps in your particular application.

I think using the data-sort-value attribute would be better in your situation than altering the sorting code behavior. If something is blank, and if in your application it makes sense to treat that as zero, then you can use data-sort-value="0" and data-sort="int" without alterning the plugin code. The customized sort functionality is already implemented, and should address this situation.

The data-sort-value also allows you to leave your cells blank (visually) if you want, while still making them consistently sortable.

@jefflunt While that is true, I would think treating blanks as zero (for numerical sorting) or the empty string (for string sorting) is the most common expected behaviour for developers.

In other words, the developers who don't want it can use custom sort functions.

@svivian TLDR: values with no integer equivalent (a blank string, for example) have undefined sort order for a reason: undefined is undefined, and in programming undefined behavior has a specific meaning in itself. Treating a blank as zero may sound plausible as a default, but consider the rest (below).

The rest:

Arguing that most programmers will want to treat undefined values as having the same sort order value as 0 is unfounded. It sounds plausible on the surface, but falls apart upon further examination.

For example, sort the following list in order of their mathematical value:

[Infinity, -Infinity, NaN, null, undefined, parseInt(''), -0, 0, 3, 2, 5]

Assuming we're sorting in ascending order, while you might be able to argue that -Infinity should be first, Infinity should be last, and 0 perhaps somewhere in between, where exactly should it be in between? Since Infinity isn't actually a numerical value it's not perfectly straightforward, and that's just the start of things. If we remove +/-Infinity from the question, and treat the other ambiguous values [null, undefined, NaN, parseInt('')] as zero, then under that interpretation the sort value of each item in this list [NaN, 0, undefined, null, parseInt('')] are all the same. This is incorrect, because, among other reasons, saying that NaN == 0 for sorting purposes also implies that the following is true:

NaN < 1
NaN > -1

...which would imply the following sorting for NaN and numbers between -2 and 2 (inclusive):

[-2, -1, NaN, 0, NaN, -0, NaN, 1, 2]
         ^ ??? the  same ??? ^

...yet, logically this doesn't make any sense. It would only makes sense in the context of a specific application implementation where the programmer either doesn't care one way or the other, or when this behavior happens to be compatible with the application that's being written. Neither of these scenarios (the programmer doesn't care, or they're lucky that it's compatible with their existing thinking) is a good place to be.

This is the crux of why I think there should be no default sorting of non-integer values, when treated as integers: the consumer of the plugin must explicitly state a sorting preference, or else none should be assumed. The same arguments can be applied to floating points as well as integers, and in all cases treating them as 0 by default is incorrect.


To go a little further and show how things are sorted in a particular implementation of JavaScript: in my demonstration I'm using the Chrome console:

> [Infinity, -Infinity, NaN, null, undefined, parseInt(''), -0, 0, 3, 2, 5].sort()
  [-Infinity, -0, 0, 2, 3, 5, Infinity, NaN, NaN, null, undefined]

...but, if I swap the order of -0 and 0, as well as the order of Infinity and -Infinity in the input list:

> [-Infinity, Infinity, NaN, null, undefined, parseInt(''), 0, -0, 3, 2, 5].sort()
  [-Infinity, 0, -0, 2, 3, 5, Infinity, NaN, NaN, null, undefined]

...see that Infinity and -Infinity retain the same relative ordering to one another, but 0 and -0 do not: 0 and -0 stay in whatever order they were given in the input list. The reason, I suppose, is revealed in this test:

> 0 == -0
  true

Finally, note that while the relative ordering of NaN vs null, and Infinity vs NaN happen to be consistent in the examples above, the following comparison tests show that perhaps they're not strictly sortable, or at the very least that the sort values are undefined:

> NaN < null
  false
> NaN > null
  false
> NaN == null
  false
> Infinity > NaN
  false
> Infinity < NaN
  false
> Infinity == NaN
  false
> -Infinity > NaN
  false
> -Infinity < NaN
  false
> -Infinity == NaN
  false

And because it's fun to make a pun of this plugin's name: I'd prefer Stupid-Table-Plugin to stay simply, beautifully, stupid than try to get clever in this scenario.

Thanks for the reply.

Arguing that most programmers will want to treat undefined values as having the same sort order value as 0 is unfounded.

What is your foundation that most programmers don't want that? By the way, I wasn't claiming to know which one is more expected, just that I think parsing as zero would be.

"The rest" part of your comment suggests that most people using ST will be including nulls, infinity, etc in their tables which I am sure is not the case. In fact if I am not mistaken that can't be the case since an empty cell can only be parsed as one of those (in other words, one empty cell can't be parsed as NaN and the other as -Infinity).

The fact is, having empty cells causes strange/undefined behaviour. I think the number of developers who want that undefined behaviour in their tables is zero, or close to it. Developers that do want undefined behaviour can easily use a custom sort function to introduce that.

Thank you @svivian, also, for the reply and the conversation. The idea of there being any default at all is what really got me thinking, and as I examined the various things that could be a default, including 0, I came to the conclusion that having no default was actually correct.

What is your foundation that most programmers don't want that?

I didn't actually make any claim as to what most programmers want or don't want. I was simply making the argument that any default was incorrect, whether or not programmers (myself included) want or expect a default. If anything, I've made the argument that the current, undefined behavior of the plugin is universally correct, on the basis that undefined sort order as input should produce undefined sort order as output. Expectations, if they could even be fairly measured across programmers, have nothing to with this.

I would think treating blanks as zero (for numerical sorting) or the empty string (for string sorting) is the most common expected behaviour for developers.

First of all, I could have been clearer in my response to this, so I'll own that. When I commented that this was unfounded I was simply pointing out that you had made an assertion (what programmers might expect), but that assertion wasn't backed up in your comment. To put my question another way, "What experience/evidence makes you think this? If you think my logic is incorrect, no problem, let's hear your point of view." But I don't think that you've explained your thinking here.

Second of all, sorting a blank string is completely different since it already has a defined, and unambiguous lexicographical sort order. Not that it's a simple sort order depending on your locale, but it's not undefined, so that's a different case from sorting numeric and non-numeric values together, which is the scenario that the OP's suggested patch handles.

"The rest" part of your comment suggests that most people using ST will be including nulls, infinity, etc in their tables which I am sure is not the case.

I don't think I suggested anything about the data going into the table; that's the point. I tried to stay away from assuming what might or might not be in a table in the first place.

However, had I made that suggestion, this would be my reply: In regards to the commonality of blanks being in tables, I see three possible scenarios:

The blanks are intentional

If the blanks are intentional then either (a) the plugin consumer must not care how they are sorted, or (b) they know precisely how they'd like them sorted.

In the case of (a) why should the plugin then apply its own sorting if the plugin consumer doesn't care?

In the case of (b) the plugin consumer ought to use the data-sort-value to make it clear what their sorting needs are.

Neither case requires the plugin to change its current behavior.

The blanks are unintentional

If it's unintentional, then the consumer of the plugin has a bug, and they should fix it. The plugin shouldn't try to think about theoretical bugs and then provide defensive behavior in order to get consistency in sorting. The bug is the concern of the plugin consumer, not the plugin itself.

No one ever puts blanks in a table

If no one ever put blanks in a table then this entire issue would be moot. There would be no non-numeric values to argue about.

However, this github issue wouldn't exist unless sometimes blanks do appear in tables. So, if they do, then it's either intentional (the consumer of the plugin should handle it by making sorting explicit), or unintentional (the consumer of the plugin has a bug and needs to fix it).

(in other words, one empty cell can't be parsed as NaN and the other as -Infinity)

No argument here. There's no scenario I can think of where you'd want blanks to be parsed into different non-numeric values, especially if those non-numeric results still aren't strictly sortable.

I think the number of developers who want that undefined behaviour in their tables is zero, or close to it.

Again, no argument here, but you're still not laying out your argument for how you came to this conclusion. It's just a bald assertion.

I'm not saying that anyone wants unreliable sorting. I'm saying it's either (a) their job to do something about it via the data-sort-value attribute, or (b) do nothing and simply accept that if you put non-sortable items into a sorting function that the output is inherently going to be undefined.

I have absolutely no problem with the scenario that someone, in their application, wants to treat non-numerics as some default value that makes sense for them. But it's their application's responsibility to choose a default appropriate to their application, not the plugin's responsibility to choose a default for all theoretical applications.

est commented

The reason I want this bug fixed not because engineering problems, but because user experience.

Clicking the table column header is weird when having empty td. Let's use pure javascript for demostration. Note I swap function(a, b) to function(b, a) because the user wanna sort/reverse when clicking the table column header

> var a =[66, 95, 21, 80, 62, "", "", 29, 57, NaN, undefined, '', 28, 69, 7, '',  82, 79, NaN, undefined, NaN, 15, 20, 47, 98, 74, '', '', 71, 74, 11]
undefined
> a.sort(function(a, b) {return parseFloat(a) - parseFloat(b)})
[7, 11, 15, 20, 21, 28, 29, 47, 57, 62, "", "", "", "", "", "", 66, NaN, NaN, NaN, 69, 71, 74, 74, 79, 80, 82, 95, 98, undefined, undefined]
> $_.sort(function(b, a) {return parseFloat(a) - parseFloat(b)})
[98, 95, 82, 80, 79, 74, 74, 71, 69, 66, 62, 57, 47, 29, 28, 21, 20, 15, 11, NaN, "", "", "", 7, "", "", NaN, NaN, "", undefined, undefined]
> $_.sort(function(a, b) {return parseFloat(a) - parseFloat(b)})
[7, 11, 15, 20, 21, 28, 29, 47, 57, 62, 66, 69, 71, 74, 74, 79, 80, 82, 95, NaN, "", "", "", 98, "", "", NaN, NaN, "", undefined, undefined]
> $_.sort(function(b, a) {return parseFloat(a) - parseFloat(b)})
[98, 95, 82, 80, 79, 74, 74, 71, 69, 66, 62, 57, 47, 29, 28, 21, 20, 15, 11, NaN, "", "", "", 7, "", "", NaN, NaN, "", undefined, undefined]
> $_.sort(function(a, b) {return parseFloat(a) - parseFloat(b)})
[7, 11, 15, 20, 21, 28, 29, 47, 57, 62, 66, 69, 71, 74, 74, 79, 80, 82, 95, NaN, "", "", "", 98, "", "", NaN, NaN, "", undefined, undefined]

You see, the row with data 98 or 7 appears out of empty row above and below is simply weird.

@est - This is not a bug. To get your table to sort the way that you desire when you submitted this issue, simply specify a data-sort-value attribute of 0 for every NaN in your table and you're done.

The problem isn't that the implementation of the sorting function is wrong. The problems is that you're passing non-sortable values into a sorting function, and complaining that the output doesn't make sense. Of course it doesn't make sense - you cannot sort non-sortable values. But globally substituting all non-sortable values with 0 is a hack that doesn't address the actual issue - that you should change your input to the function, not the function implementation.

Your above example of sorting "weirdness" supports this point. If you remove the non-sortable stuff from the array (all "", NaN and undefined values), you'll see that the integers that remain are, in fact, in the correct order already.

Also I add for me is a good thing that there is no deafult value for empty cells.

some time is handle empty cells as -1000000 other times as zero, depending what does it mean if the cell is empty... in some case, an empty value is A LOT LESS than zero., other times is exactly zero.

So I use, and you could use, the data-sort-value as told before.

Please don't change this behaviour or a lot of my websites will start to do some strange things !!!

I appreciate everyone's input into this matter. I believe both sides of the debate have merit, and everyone presented their ideas extremely well. The essence of the issue boils down to technical correctness vs user expectations of things "just working".

I imagine a large portion of stupid table users are not familiar with the idea of undefined behavior, as it's not something particularly stressed in web development libraries/frameworks. Anyone who's worked with entry-level C programming, for example, has experienced dealing with functions explicitly defined over a particular set of values (i.e. the non-negative integers). These functions rarely stop you from passing in invalid values, but the moment you do so, the program as a whole exhibits undefined behavior. Those who are strictly webdevs have possibly never had to deal with such responsibility, so I can see why things not "just working" can seem like a bug.

I agree 100% that "empty" does not semantically equal 0. Although 0 is the most likely numeric candidate for an equivalence with "empty", it's an assertion I'm not comfortable making in the source.

And @realtebo's note on breaking backwards compatibility is an entire other issue I'd like to avoid.

So thanks to everyone for taking the time to construct their opinions in a clear and informative manner. I'm choosing to close the issue, but perhaps some clarification in the documentation is warranted.

tagr commented

@joequery pointed me from #124 to this thread. I know this issue is now closed, but I would only like to posit the idea of enabling this behaviour as a config option, which could be passed in, and not break backwards compatibility. There seem to be use cases for this, and could add value to the plugin.

    optionSortNullsAsZeroes: false;

In my opinion, the sorting functions accounting for null values is the configuration. I do think the docs need some cleanup, and they should probably mention ways to handle invalid sort data.