ljharb/qs

Parse booleans & numbers

xpepermint opened this issue Β· 58 comments

// now
Qs.parse('a=true');    // { a: 'true' }
Qs.parse('a=21.31');    // { a: '21.31' }
// but I would expect boolean
var obj = Qs.parse('a=true');    // { a: true }
Qs.parse('a=21.31');    // { a: 21.31 }

Currently I use express-query-boolean package which parses true and false strings into boolean but it would be great to have this feature included in qs.

nlf commented

My current feeling is that this belongs elsewhere. There are so many conversion patterns that could take place that I don't really want to open the can of worms of implementing them all here. Unless there's a large demand for this, I'm going to say no.

Leaving open for discussion, however.

OK, I made a plugin for now https://github.com/xpepermint/query-types.

I think the @xpepermint request is valid. I was disappointed to see that this isn't a feature. That parse assumes all numbers are strings makes it not so helpful.

This is actually a problem I am facing using the very popular https://github.com/request/request library. What I am running into is that my REST api is outputting booleans as true and false. However, when a PUT request is made, request.js uses this library to convert it to a string, which this then turns all of my booleans into strings breaking the app. I am curious why this library is converting booleans and numbers to strings. Its a pretty easy fix, and I am willing to submit a pull request.

Part of me wants this too, but automatic type coersion here is troublesome. What if you wanted "true" as a string? Or "21.33" as a string? I often deal with currency amounts formatted as strings to avoid floating point precision loss.

In particular if you're allowing user input to show up in the query string, all someone has to do is type in true where you were expecting a string and good luck avoiding a crash there.

At least the current behavior is predictable and very well defined.

Thanks for the information. I am no longer blocked by this, so I think I can get by without this change. I was actually able to get around this using https://github.com/visionmedia/superagent instead of https://github.com/request/request (which uses querystring for posts). I can only assume that superagent does not use query string method for posting information, but I have not dug deep enough yet to determine what is actually different. All I know is my JSON object is not altered from a GET to a PUT with superagent instead of request.js altering the JSON object by converting all numerics and booleans to strings. This nugget may prove helpful for anyone else running into this same issue with request.js.

The plugin created by @xpepermint worked for me and will likely work for others who need it. I agree with the fact that I don't think qs should do this for you and people should opt in to this type coercion if they need it. For example, I have some string fields which are actually going to hold values such as "1456" some of the time. I know that the plugin will convert this to a number, but I also know that my data store will save it as a string - so this works for me. If my data store didn't do this, I wouldn't mind calling toString() on a few fields before saving.

I'm happy to consider a way for users to opt in to automatic coercion - but I wouldn't want that by default.

I'm closing this for now, but am happy to reopen it if it warrants further discussion or work.

Could there not be some way of indicating coercion on a case-by-case basis, using a special character maybe? For example, we could have http://example.com/api/todos/?completed@=true&id@=1, and that would indicate that we want 'completed' and 'id' to be coerced, so we'd get { completed: true, id: 1 }... but if we just did completed=true&id@=1, then only id would be coerced, so we'd get { completed: 'true', id: 1 }.

We could have the special character either in the property name, or in the value itself, and we could put it either at the beginning or the end (although I like it right up next to the = sign).

[ update ]

I've got the above implementation working here, at least for parsing... we might want to add some enforcement for the character(s) that can be chose to indicate coercion, which I currently have defaulting to @;

That seems very inelegant - you'd need to support that special character in your server code as well as your client code.

In other words, there's no standard for using some kind of special character, so I think it'd be a very bad idea to codify that into qs.

Can't say I agree... is the []= syntax standard? It's commonly used, but I don't think it's any kind of standard, as far as I can see, yet qs supports it... plus it lets you pick a non-standard delimiter... which you'd need to support in your server code as well as your client code.

Besides, choosing a character isn't a huge ask, especially when the default is provided. People can just construct their urls in the client using @=, and then the server would coerce just those values. If they want to always coerce, they can easily set the coercion character to an empty string.

Anyways, it's your tool. I might go ahead and spin something off.

is the []= syntax standard? It's commonly used, but I don't think it's any kind of standard, as far as I can see, yet qs supports it.

It is specified in http://www.w3.org/TR/html-json-forms/ , yes. It was working being done to create a standard, through that work has been stopped now.

I was here with the same issue and I just solved it like:

const myqs = {
  stringify: obj=>encodeURI(JSON.stringify(obj)),
  parse: json=>JSON.parse(decodeURI(json||'{}'))
}

Adding my +1 to this feature request and linking it to the previous discussion in the original package that qs is based on.

There's definitely a lot of support and use cases for the feature, and I think making it opt-in would be great with a coerce option in qs.parse.

@mmcgahan You can already provide your own encoder and decoder, deciding for yourself when "true" means a boolean versus a string - I'm not sure how that decision could ever be codified inside qs or node-querystring in a way that makes sense.

ah, decoder is perfect for me. I didn't see the docs in the stringify section - thanks!

Hi, I'm actually disappointed that this issue is closed. I see so many issues raising the same boolean problem.

Heres my suggestion: its ok to not have this behavior by default, we can just have { allowBooleans: true } options to qs.parse() just like many other existing options out there..

@izelnakri again, you can already do this yourself: #91 (comment)

The issue remains that "true" might be the string "true" or a boolean true; there's no way to know for sure, so since that knowledge is highly specific to your application and your codebase, the only solution that makes any sense is for that knowledge to live in your application and your codebase - via a custom encoder/decoder.

@ljharb Seems like this issue still gets the occasional +1 because there is demand in the community for a basic level of decoding built into this package on an opt-in basis.

The "true" string case comes up so often, and in most cases is so obviously something that you'd want coerced, that people want a common package that will do such basic decoding. I'd bet that 99% of the time, coercing "true" to a boolean would be preferable. I'd make a similar bet about coercing "123" to an integer. Given that these are such common use cases, I think it makes at least some sense that there might be an opt-in option in the common querystring builder that we all use. If everyone (or 99% of everyone) is implementing these basic coercion cases in their own application-relevant decoders, it would save a lot of lines of code in the long run.

I'm a little more educated now re: standards, and I understand the reluctance to set a de facto standard by implementing it in this library. Maybe a separate package that provides a basic extensible decoder is a better solution (I haven't looked for one recently). However, I still think that including it as an opt-in option (i.e. a non-standard option) in this package as @izelnakri suggests, avoids this problem, because it's not the default functionality: It's just a very lightweight enhancement that allows basic encoding/decoding if you want to use it.

That's my two cents, but obviously I'm way past this issue by now.

It's pretty trivial to please most use cases by using the decoder option. Though I appreciate this is not a robust solution. There would need to be some special syntax to mark non-string type values in the querystring itself.

qs.parse(data, {
    decoder(value) {
        if (/^(\d+|\d*\.\d+)$/.test(value)) {
            return parseFloat(value);
        }

        let keywords = {
            true: true,
            false: false,
            null: null,
            undefined: undefined,
        };
        if (value in keywords) {
            return keywords[value];
        }

        return value;
    }
});

Also with our decoder code, by using in, a value of β€œtoString” will return a function. Use Object.prototype.hasOwnProperty.call instead.

qs.parse(data, {
decoder(value) {
if (/^(\d+|\d*.\d+)$/.test(value)) {
return parseFloat(value);
}

    let keywords = {
        true: true,
        false: false,
        null: null,
        undefined: undefined,
    };
    if (value in keywords) {
        return keywords[value];
    }

    return value;
}

});

this doesn't seem to work for me.

Why not delimit strings like JSON does?

EDIT: Removed the "just" - sorry! I hate when users do that to me.

@waynebloss because HTTP query strings aren't encoded like JSON is, so that would break a bunch of serverside implementations.

Thanks @ljharb, I was thinking that anyone using this library would be doing so for its ability to send a deep graph instead of a flat structure and therefore, you'd have to use the same parser protocol on the server anyway.

meet the same problem.
why, don't you just encode and decode quotation mark " ?
if 'a=true', decode as { a: true }
if 'a="true"' (url encoded version will be 'a=%22true%22'), decode as {a:"true"}
this is a standard url encoding

@nursultan156 in no way is that a "standard", that's the problem. It's certainly one reasonable solution, but the actual standard is that a=true means a string true, and a="true" means a string "true", and I'm quite sure there are plenty of websites that rely on this.

@ljharb what if, while encoding, you wold just wrap all values (except numbers and booleans) in extra quotation marks, then while decoding unwrap them, but those which doesn't have extra quotation marks try to parse as int or boolean.

@nursultan156 there's no amount of quotes that would work, because a billion quotes would still be parsed as normal parts of the string.

@ljharb as an example:
foo: { bar: 'baz' } will be encoded as foo[bar]="baz",
foo: { bar: 123 } will be encoded as foo[bar]=123,
foo: { bar: '123' } will be encoded as foo[bar]="123",
foo: { bar: true } will be encoded as foo[bar]=true,
foo: { bar: 'true' } will be encoded as foo[bar]="true",
foo: { bar: '"baz"' } will be encoded as foo[bar]=""baz"",
foo: { bar: '""billions of marks""' } will be encoded as foo[bar]="""billions of marks"""
give example that wouldn't work

@nursultan156 i understand what you're suggesting, but that's not what "every webserver" would expect, and that's not a standard.

@ljharb it would be good adding it as option

@nursultan156 it already exists; use a custom encoder.

Building query strings with deeply nested values delimited by square brackets is also a completely non-standard output that all web servers won't expect, so I'm not sure why the insistence on limiting developers to existing standards.

What's a standard webserver going to do with a key like "[a][b][c][d]" anyway? For that key to be useful, you're going to use something that understands that key on the webserver too so standards have quite gone out the window by this point!

The suggested option would parse fine by all webservers everywhere.

Furthermore, when I'm building apps they're just for my server and not all web servers everywhere. So if some other middle/caching/proxy/load-balancing server parsed my query value as "hello" instead of hello, everything would still work and it wouldn't be the end of the world.

Not being able to parse things correctly is much more annoying than being "non-standard" IMO.

In the end, it's up to whatever @ljharb wants to do since he's the sheriff around these parts.

You can always fork and advertise the advantages of your lib!

@waynebloss It's definitely standard, and something all Rails and PHP and node express servers would expect, which is a significant portion of the entire internet.

I totally agree that as long as you're pairing your server with your client config then everything is fine! You can already do everything asked for in this proposal by providing a custom encoder, so I'm not sure why you'd need to fork :-)

@ljharb I would prefer a simple portable, configurable option over a chunk of code that I have to copy to every project. The only other reason I say to fork is because that's a common refrain in conversations such as these - the option is always there.

Perhaps wrapping qs in another package (or putting your custom encoder in another package) would be a more agreeable compromise for most in this case.

It's pretty funny though, that the original poster here says that they use express-query-boolean which is already a package for this kind of logic that they feel is too burdensome to include in their projects.

I can't say that I disagree very much. The package overload is real.

found the solution that helped me, instead of passing this:
foo: { bar: 1000 } will be encoded as foo[bar]=1000
i pass this:
foo: JSON.stringify({ bar: 1000 }) will be encoded as foo="{"bar":1000}"

Generally agree with @ljharb here, maybe a 1 line note in the main readme.md to summarise how types are cast would help. Also accept the point that pasting the decoder function around is a pain and not that pretty. I would export a locally augmented version of qs in your project in this case rather than a new wrapper npm package (if that's what was meant by @waynebloss).

SOLUTION

Extending the default qs.decoder() method with logic to test for Int and Bool:

qs.parse(request.querystring, {
      decoder(str, decoder, charset) {
            const strWithoutPlus = str.replace(/\+/g, ' ');
            if (charset === 'iso-8859-1') {
              // unescape never throws, no try...catch needed:
              return strWithoutPlus.replace(/%[0-9a-f]{2}/gi, unescape);
            }

            if (/^(\d+|\d*\.\d+)$/.test(str)) {
              return parseFloat(str)
            }

            const keywords = {
              true: true,
              false: false,
              null: null,
              undefined,
            }
            if (str in keywords) {
              return keywords[str]
            }

            // utf-8
            try {
              return decodeURIComponent(strWithoutPlus);
            } catch (e) {
              return strWithoutPlus;
            }
          }
})

With @crobinson42 solution what would you suggest to do if you really want to have a param encoded as a string even if it could be decoded as a number?

For example you have an input field where a user can search for something. If the user wants to search for just a number (e.g. an ID) it will be parsed as a number but your input and filter logic is expecting a string. It will be really painful to handle this edge cases everywhere.

with all this conversation above I still have no idea how to achieve this.
It would be nice to be able to do something like
let search = qs.parse(window.location.search, { arrayFormat: "bracket", integer: true })

@AndreiMotinga based off of #91 (comment):

qs.parse(window.location.search, { ignoreQueryPrefix: true, arrayFormat: 'bracket', decoder(str, decoder, charset) {
  const strWithoutPlus = str.replace(/\+/g, ' ');
  if (charset === 'iso-8859-1') {
  // unescape never throws, no try...catch needed:
    return strWithoutPlus.replace(/%[0-9a-f]{2}/gi, unescape);
  }

  if (/^(\d+|\d*\.\d+)$/.test(str)) {
    return parseFloat(str)
  }

  const keywords = {
    true: true,
    false: false,
    null: null,
    undefined,
  };
  if (str in keywords) {
    return keywords[str]
  }

  // utf-8
  try {
    return decodeURIComponent(strWithoutPlus);
  } catch (e) {
    return strWithoutPlus;
  }
} });

Here is something a bit more configurable with additional options such as ignoreEmptyString

function newDecoder (parseNumbers = true, parseBool = true, ignoreNull = true, ignoreEmptyString = true) {
  return function decoder (str, decoder, charset) {
    const strWithoutPlus = str.replace(/\+/g, ' ')
    if (charset === 'iso-8859-1') {
      // unescape never throws, no try...catch needed:
      return strWithoutPlus.replace(/%[0-9a-f]{2}/gi, unescape)
    }

    if (parseNumbers && /^(\d+|\d*\.\d+)$/.test(str)) {
      return parseFloat(str)
    }

    if (ignoreEmptyString && str.length === 0) {
      return
    }

    const keywords = {
      null: ignoreNull ? undefined : null,
      undefined
    }

    if (str in keywords) {
      return keywords[str]
    }

    const boolKeywords = {
      true: true,
      false: false
    }

    if (parseBool && str in boolKeywords) {
      return boolKeywords[str]
    }

    try {
      return decodeURIComponent(strWithoutPlus)
    } catch (e) {
      return strWithoutPlus
    }
  }
}

When i use your solution with comma: true, it won't work.

TypeError: val.indexOf is not a function 
at parseQueryStringValues (node_modules/qs/lib/parse.js:86:41)

When i edit this line and add typeof val !== 'boolean' as validation then it works.

@sfaujour a custom decoder is required to only return a string.

isn't it good to have an option as parseBool and parseNumber just like query-string have?

That package's docs for parseBool say "if it's a boolean", and "if it's a number", but don't explicitly describe how they determine that. In some server frameworks, a boolean is yes/no/true/TRUE/false/FALSE/on/off/1/0. In others, it may just be true/false. In some, it might be cierto/falso, perhaps.

In other words, having these options that arbitrarily encode one person's opinion about "how to determine them" is flawed at best, and I'd prefer not to have them at all. You can, however, use a custom decoder to decide for yourself what "it's a boolean" means, and then qs wouldn't run the likely risk of having nonsensical options.

@ljharb
on/off/1/0/yes/no are basically custom type of values which represent boolean and not converting to boolean is reasonable. And who ever is using then is most likely is checking their validity based on their type, which is not Boolean. So no argue on converting on/off/1/0/yes/no. But what would be the use of true/TRUE/false/FALSE as string other than representing a boolean value?

The thing is, this type of "auto-coercion" code must be written to iterate over data and interpret certain fields as scalar/primitive values (numbers, booleans, etc). That code can be written in two places:

  1. The qs library can implement it - this would be a breaking change with guaranteed adverse affects. You have to worry about "how do I turn this off", which results in this library having more API, which means more maintenance/documentation, which leads to more "why is this happening" types of bugs, and a whole host of other potential issues.
  2. You can implement it yourself on a per-project basis - this give you full control over which values should be coerced, the qs library stays leaner and more predictable, and everybody is happy - even the developer who had to spend 15 minutes googling and copying code into their project.

Option 2 is a clear winner hear. I propose that the docs should link to a code tried-and-true coercion snippet for easy copy/paste and lay this issue to rest.

Happy to accept PRs to improve the docs.

@AndreiMotinga based off of #91 (comment):

qs.parse(window.location.search, { ignoreQueryPrefix: true, arrayFormat: 'bracket', decoder(str, decoder, charset) {
  const strWithoutPlus = str.replace(/\+/g, ' ');
  if (charset === 'iso-8859-1') {
  // unescape never throws, no try...catch needed:
    return strWithoutPlus.replace(/%[0-9a-f]{2}/gi, unescape);
  }

  if (/^(\d+|\d*\.\d+)$/.test(str)) {
    return parseFloat(str)
  }

  const keywords = {
    true: true,
    false: false,
    null: null,
    undefined,
  };
  if (str in keywords) {
    return keywords[str]
  }

  // utf-8
  try {
    return decodeURIComponent(strWithoutPlus);
  } catch (e) {
    return strWithoutPlus;
  }
} });

Note that this check for floats (/^(\d+|\d*\.\d+)$/.test(str)) misses negative values. I'm working with lat/long and it bit me. I think this regex will work: /^[+-]?\d+(\.\d+)?$/

Oh! I thank it's very helpful~ πŸ‘

I didn't realize this super simple and native solution, thanks!

I was here with the same issue and I just solved it like:

const myqs = {
  stringify: obj=>encodeURI(JSON.stringify(obj)),
  parse: json=>JSON.parse(decodeURI(json||'{}'))
}

It's work for me

qs.parse(request.querystring, {
      decoder(str: string, defaultDecoder: qs.defaultDecoder, charset: string, type: 'key' | 'value') {
	if (type === 'value' && /^(?:-(?:[1-9](?:\d{0,2}(?:,\d{3})+|\d*))|(?:0|(?:[1-9](?:\d{0,2}(?:,\d{3})+|\d*))))(?:.\d+|)$/.test(str)) {
		return parseFloat(str);
	}

	const keywords = {
		true: true,
		false: false,
		null: null,
		undefined: undefined,
	};
	if (type === 'value' && str in keywords) {
		return keywords[str];
	}

	return defaultDecoder(str, defaultDecoder, charset);
    }, 
})