deitch/searchjs

Just flat objects?

janober opened this issue · 19 comments

Hello,

I just found this library and it really looks great. I would love to use it. However, all the examples use only flat objects. What if I have multiple levels? Or multiple levels where one of them is an array?

So something like this:

[
  {
    name: 'Frank', 
    address: {
      city: 'munich'
    },
    cars: [
      {
        brand: 'bmw'
      }
    ]
  }
];

Is there a way to filter for "address.city = munich" and "cars.brand = bmw"?

Something else entirely: even if the above is possible, I would still not be able to use it, because the project does not seem to have a license. Not sure if that is on purpose, but if you want other people to start using it you should probably add one.

Thanks!

Hi @janober thanks!

Hmm, so we want to do deep searches. Instead of just {name: "Frank"}, we want to search that "address.city = 'munich'" (object match) or "cars.brand = 'bmw'" (match object within an array).

Interesting. First blush, I like the idea. We have to make sure that:

  1. It doesn't break any of the other semantics or functionality
  2. The search description works in and of itself.

For example, {"address.city": "munich"} would work, but what if the structure is:

{
  name: 'Frank',
  "address.city": 'munich'
}

Admittedly, that is less likely, but it still is something to think about. Is there a different syntax that would cover it?
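
To make the ambiguity concrete, here is a small sketch in plain JavaScript. It does not use searchjs at all, and the names `nested` and `flat` are only illustrative. It shows the two record shapes that the same search key would have to tell apart:

```javascript
// Two records that a search for {"address.city": "munich"} could plausibly target.
var nested = { name: 'Frank', address: { city: 'munich' } };
var flat = { name: 'Frank', 'address.city': 'munich' };

// A literal key lookup only finds the flat record...
console.log(flat['address.city']);   // 'munich'
console.log(nested['address.city']); // undefined

// ...while walking the path only finds the nested record.
console.log(nested.address.city);               // 'munich'
console.log(flat.address && flat.address.city); // undefined
```

Whichever interpretation the library picks, one of the two records becomes unreachable unless the syntax offers a way to choose.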

As for the license, just an oversight. I thought the MIT license was there, but apparently not.

Wow thanks for the fast answer!

So it sounds like it's currently not possible ;-)

Ah yes, it is true that a dot in the key would cause problems. I did not mean that it has to work that way; that was just an example of how I would have expected it to work (and what I unsuccessfully tried). Dot notation is already out there, widely used, and I think it is immediately obvious what is intended. It would also be possible to use something else like "->" or, if you want to make sure there is no danger of collision at all, a syntax like this would probably work fine: {address: {city: 'munich'}}

Ah great to hear about the license! Thanks!

Uh, I just looked a little more at the syntax. The deep-object example would cause problems when fields are named "from", "to", "_not", ... However, something in that direction would still be possible, for example changing the operation keys to start with "$". But that could again collide with the data ;-) So I guess no matter what, there could always be problems.

I think "not possible" might be a bit premature. I like the idea, and would like to think it through here to see if we can make it possible.

The original assumptions were the following (some explicit, some implicit):

  • Keys in search object are the keys in the matching object
  • Values in search object are the values to match in the matching object
  • The Values can be a string (value match), array ("one of" match), range (object match)

In order to get deep object matching, we need to do one of the two things you suggested:

  • change the key to reflect a deep key, e.g. "address.city", and then understand that the string "address.city" really means to match the key "address" which should be an object and have a key in that object of "city".
  • change the value of the match, so that we match "address" if the value is {city: "munich"}

The first option messes up potential dot-notated single-layer (flat) keys; the second clashes with the "from", "to" and "_not" fields, as you said. The current flat implementation has no problem with those, since they only appear inside the search Values object, so it knows, "this is an object, not a data primitive (string or number), so don't try to compare it."
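
A rough sketch of the second conflict, assuming the range syntax has the shape {field: {from: x, to: y}} as this thread implies. `looksLikeRange` is a hypothetical helper written for illustration, not a searchjs function:

```javascript
// Hypothetical check: does a search value look like a range object?
// Assumed shape of the existing range syntax: {field: {from: x, to: y}}.
function looksLikeRange(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value) &&
    ('from' in value || 'to' in value);
}

var rangeSearch = { age: { from: 30, to: 40 } };    // range on a flat field
var deepSearch = { address: { city: 'munich' } };   // proposed nested match
var clash = { flight: { from: 'munich' } };         // range, or nested field "from"?

console.log(looksLikeRange(rangeSearch.age));    // true
console.log(looksLikeRange(deepSearch.address)); // false
console.log(looksLikeRange(clash.flight));       // true, although flight.from may be what was meant
```

The last case is the problem: nothing in the search object itself says which reading was intended.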

Still, all in all, either format (dotted string notation key or deep object value) would work, if we can find a way to avoid a conflict. Personally, I like the deep object, but not for any reason beyond, just because.

The one thing I cannot do is change the existing syntax, because it might break existing projects that depend upon it.

Yes, sounds good. Let's see if we can find a proper solution. However, I would not be too afraid of breaking existing projects, as long as 1. the major version changes, 2. the change is quite simple, and 3. it is well documented.

I also hate it when stuff suddenly breaks. However, in my opinion, having a clear and obvious syntax is in the long term far more important than avoiding a 5-minute fix on a major version upgrade. So if you would, for example, go the deep-object way and decide that it is not "from" anymore but "$from", "$$from" or whatever, the change would be quite simple and really worth it. Sometimes backwards compatibility is important; I am not sure it is that important in this case.

OK, so much for that. I thought it would be a good idea to look at what other, similar syntaxes have resorted to. It is surely not the first time this has come up. So I had a look at MongoDB (http://docs.mongodb.org/manual/tutorial/query-documents/). They seem to allow both dot notation and deep-object style. However, they prepend all operators with "$", so that is maybe really the best thing to do.
If not that, the dot notation would probably be the easiest. With it, no change to the current syntax would be needed. And to be honest, using a dot in a key somehow simply feels wrong and I personally would never do it. However, for people who do, an additional option could be provided to matchObject & matchArray, which could either:
  • deactivate dot notation totally (then it works exactly like right now), or
  • allow them to overwrite the separator with whatever they want. So they keep their dotted key names and can set the separator to "->" or "**sub_level_starts_now**" or whatever they feel like.

I am less hesitant to break stuff with major version changes if we are talking about a product with an API. A library though leaves me more wary. I am willing to put in more effort to avoid it.

Looking at Mongo was a good idea. Despite some of my misgivings with the product itself (mostly based on major issues with the earlier versions, as well as my dislike of any overhyped product, which causes using it in the wrong circumstances, which is no fault of Mongo's), it is quite popular, so it is a decent indication of what people are and are not willing to work with.

So Mongo basically punts on the "key has a dot in it" problem. They say, "we do not support it". So if I want to match on the record {'a.b': 1, 'c': 2} on the 'a.b' field, Mongo says, "sorry, you simply cannot, because {'a.b':1} will look for {'a':{'b':1}}"

Of course, building objects with fields like that is not common practice, even if 100% legitimate.

We could go full-bore and adopt the Mongo notation or a subset of it. Not sure how I feel about it.

Clearly, Mongo's architects thought through the query language rather thoroughly.

Just looking through the Mongo docs. They actually simply forbid the usage.

http://docs.mongodb.org/manual/reference/limits/#Restrictions-on-Field-Names

Restrictions on Field Names
Field names cannot contain dots (i.e. .) or null characters, and they must not start with a dollar sign (i.e. $). See Dollar Sign Operator Escaping for an alternate approach.

But surely that would be different in your case, because in the Mongo case people normally design their data with it in mind, while searchjs is meant to simply support everything. So really think a simple option to overwrite the default separator "." would be the easiest and would work for everybody.

Good catch. Yeah, if you restrict the names, it is really easy afterwards to use the restricted characters as key characters!

> So really think a simple option to overwrite the default separator "." would be the easiest and would work for everybody.

What does that mean?

What I wrote two posts ago: simply choose "." as the default separator, but give people the choice to overwrite it by passing a different one as an option. So for example:

var list = [
  {
    name: 'Frank', 
    address: {
      city: 'munich'
    },
    cars: [
      {
        brand: 'bmw'
      }
    ]
  }
];
var matches = s.matchArray(list, {"address.city": "munich"}); // Uses "." as the separator because that is the default

or

var list = [
  {
    name: 'Frank', 
    address: {
      'city.name': 'munich' // a key containing a dot must be quoted
    },
    cars: [
      {
        brand: 'bmw'
      }
    ]
  }
];
var matches = s.matchArray(list, {"address->city.name": "munich"}, {"separator": "->"}); // Separator overwritten to "->" so people can use dots in their keys

Oh, I get it. For 95%+ of cases (probably 99%+), it will work just fine. For the few cases where someone actually used a '.' in the key, they can change it. Sort of like how awk -F: separates fields on ":".

I think that makes sense, and is a good starting point.

Need to set up tests for it, then implement.

Yes, exactly. I think this is not just easy syntax-wise; it also means that nothing else in the jsql syntax has to change. It should probably work just fine for everyone who already uses it, and if not, it is very easy to fix by simply changing the separator.

OK, tests were easy. But the logic of telling JS to match "a.b" to an object of {a: {b: "foo"}} is a little harder. Ideas?
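
One possible approach, as a minimal sketch only: split the key on the separator and walk the record one level at a time. `getDeep` is a hypothetical helper name, and this is not necessarily how searchjs ended up implementing it:

```javascript
// Hypothetical helper: follow a separated key path into a nested object.
// Returns undefined as soon as the path leaves the object structure.
function getDeep(record, path, separator) {
  var parts = path.split(separator || '.');
  var current = record;
  for (var i = 0; i < parts.length; i++) {
    if (current === null || typeof current !== 'object') {
      return undefined;
    }
    current = current[parts[i]];
  }
  return current;
}

var record = { a: { b: 'foo' }, address: { 'city.name': 'munich' } };
console.log(getDeep(record, 'a.b'));                      // 'foo'
console.log(getDeep(record, 'a.x.y'));                    // undefined
console.log(getDeep(record, 'address->city.name', '->')); // 'munich'
```

The resolved value can then be fed into the existing flat comparison, so the rest of the matching logic would not need to change.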

Ha! It works!

Done. 0.3.7 pushed out and published to npm

Sorry did not see your first message. Was sleeping.

Great thanks! Will give it a try later!

I tried it but realized that my example "cars.brand = bmw" from the original question still did not work. So I fixed that and created pull request #4.
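
For illustration, here is a sketch of how a path could be matched through an array. `deepMatch` is a hypothetical function written for this comment and is not the code from the pull request:

```javascript
// Hypothetical matcher: true if any value reached by the path equals `expected`.
// Arrays met along the way are searched element by element.
function deepMatch(value, parts, expected) {
  if (Array.isArray(value)) {
    return value.some(function (item) {
      return deepMatch(item, parts, expected);
    });
  }
  if (parts.length === 0) {
    return value === expected;
  }
  if (value === null || typeof value !== 'object') {
    return false;
  }
  return deepMatch(value[parts[0]], parts.slice(1), expected);
}

var frank = { name: 'Frank', address: { city: 'munich' }, cars: [{ brand: 'bmw' }] };
console.log(deepMatch(frank, 'cars.brand'.split('.'), 'bmw'));      // true
console.log(deepMatch(frank, 'cars.brand'.split('.'), 'audi'));     // false
console.log(deepMatch(frank, 'address.city'.split('.'), 'munich')); // true
```

The design choice here is "any element matches": a record with several cars matches as soon as one of them has the requested brand.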

Ah, I just saw the version number you updated to. You bumped only the bug-fix (patch) version. That could be dangerous. This is the part that most people (including me) allow to be auto-updated. That means that when a new version is released and only the bug-fix version has changed, npm or bower will install that version on "update". Since, as you said, this change "could" break something (even if only in 1-5% of cases), it would now break somebody's project if they have dots in their keys.
For that reason, normally at least the minor version should change. That part is for new features and should be fully backward compatible.
Therefore one could even argue that the major version should change, because the release is not fully backward compatible. However, I think in this case a minor bump should probably be OK: the version is < 1.0.0, so it is considered alpha or beta, and there breaking changes in minor versions are normal.
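
To show what "auto-updated" means here, a simplified sketch of npm's tilde range ("~x.y.z" allows patch-level updates only). `satisfiesTilde` is a hypothetical stand-in for a real semver library, it ignores pre-release tags, and the dependency "~0.3.6" is an assumed example:

```javascript
// Simplified sketch of a tilde range for plain x.y.z versions:
// "~0.3.6" allows >=0.3.6 and <0.4.0.
function parse(version) {
  return version.split('.').map(Number);
}

function satisfiesTilde(version, base) {
  var v = parse(version);
  var b = parse(base);
  return v[0] === b[0] && v[1] === b[1] && v[2] >= b[2];
}

// Assumed dependency declaration: "searchjs": "~0.3.6"
console.log(satisfiesTilde('0.3.7', '0.3.6')); // true: installed automatically on update
console.log(satisfiesTilde('0.4.0', '0.3.6')); // false: needs an explicit change to the range
```

So a behavior change shipped as 0.3.7 reaches such projects silently, while the same change shipped as 0.4.0 does not.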

Doh! Just sort of fixed, bumped version and pushed it out.

Truth is, this should have gone 1.0.0 a while back. I am going to make this 0.4.0, fix the tests (they are framework-less now, want to move them to mocha for better running and reporting), then move to 1.0.0

Done for 0.4.0