huin/goupnp

Making charset dependency optional

Closed this issue · 6 comments

Hey, thanks for the work!

I was looking to use this client in a memory-constrained environment, and was wondering whether it'd be possible to make the charset dependency optional by assuming a UTF-8 encoding. If so, I can do the work to make a PR for it.

huin commented

That's a good question... The fiddly bit is that the way I wrote the API (so many things I'd change in hindsight!) means that we'd potentially be breaking compatibility for existing users.

Do you have any thoughts on how we could either make this change without breaking existing users, or, failing that, how we might minimise the impact and the steps needed for them to fix it for themselves?

Hm, well if Go's packages had good versioning, I would say that cutting a new version without the dependency would leave it still working for existing users, but I'm not sure if the versioning is good enough for that.

I think one option for minimizing impact would be exposing some variable for the charset reader function, and allowing users to inject their own package at Init, which would allow support for old behavior, but default to new behavior. It wasn't clear to me what other charsets UPnP uses besides UTF-8 from briefly glancing at the spec, so what old behavior would be broken?

huin commented

https://github.com/huin/goupnp/blob/master/goupnp.go#L144 is where the charset package is used. If we do as you suggest, then I think the implication would be that any XML describing the devices/services present could fail to decode properly if it specifies a character set/encoding other than UTF-8 (perhaps ASCII would work as well).

It's an unknown. I suspect that this would typically be okay, but I don't have data for which encodings are typically used. UPnP servers can often vary quite widely in properties. Existing users of this library might find that their code suddenly breaks when accessing a UPnP server with an incompatible encoding.
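To make the failure mode concrete: with the standard library alone, encoding/xml refuses a document that declares a non-UTF-8 encoding when Decoder.CharsetReader is nil. The following is a minimal sketch of that behaviour. The root struct and the two sample documents are made up for illustration and are not goupnp's actual types.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// root is a hypothetical, cut-down device description used only for this sketch.
type root struct {
	FriendlyName string `xml:"device>friendlyName"`
}

// Sample documents (illustrative only). They differ just in the declared encoding.
const (
	utf8Doc   = `<?xml version="1.0" encoding="utf-8"?><root><device><friendlyName>Router</friendlyName></device></root>`
	latin1Doc = `<?xml version="1.0" encoding="ISO-8859-1"?><root><device><friendlyName>Router</friendlyName></device></root>`
)

// decodeStrict decodes doc with a decoder whose CharsetReader is left nil,
// which is what dropping the charset dependency would amount to by default.
func decodeStrict(doc string) (root, error) {
	var r root
	d := xml.NewDecoder(strings.NewReader(doc))
	err := d.Decode(&r)
	return r, err
}

func main() {
	for _, doc := range []string{utf8Doc, latin1Doc} {
		r, err := decodeStrict(doc)
		if err != nil {
			// Reached for the ISO-8859-1 document: no CharsetReader is installed.
			fmt.Println("decode failed:", err)
			continue
		}
		fmt.Println("decoded:", r.FriendlyName)
	}
}
```

So a server that labels its XML as anything other than UTF-8 would turn into a hard decode error, even when the payload itself is plain ASCII.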

I think I'm leaning towards agreeing to your suggestion, but would require the following (to be good citizens to other users of the library):

  • An Init as you suggest, to install a charset provider. As the code stands, this would provide the default value for decoder.CharsetReader (see https://golang.org/pkg/encoding/xml/#Decoder). To give us more space to maneuver the API in future (in case we need to inject other things), I suggest the following as how it would be called by people wanting to retain the existing behaviour:

    import "golang.org/x/net/html/charset"
    // ...
    goupnp.Init(goupnp.CharsetReader(charset.NewReaderLabel))
  • Migration instructions in the README.md for people encountering XML parsing errors. This would tell people to basically do the above either in a func init(), or otherwise in the startup initialization of their process.
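
One way the proposed Init could be shaped is with functional options, sketched below under several assumptions. Only the names Init and CharsetReader come from the call shown above. The Option type, the charsetReaderDefault variable, and the newDecoder and decodeFriendlyName helpers are hypothetical. latin1Reader is a stdlib-only stand-in for charset.NewReaderLabel so that the sketch runs without extra dependencies.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// CharsetReaderFunc has the same signature as xml.Decoder.CharsetReader.
type CharsetReaderFunc func(charset string, input io.Reader) (io.Reader, error)

// Option is a hypothetical hook that leaves room for injecting other things later.
type Option func()

// charsetReaderDefault is the (assumed) package-level default; nil means UTF-8 only.
var charsetReaderDefault CharsetReaderFunc

// CharsetReader returns an Option that installs f as the default charset reader.
func CharsetReader(f CharsetReaderFunc) Option {
	return func() { charsetReaderDefault = f }
}

// Init applies the given options; calling it with none keeps the new default.
func Init(opts ...Option) {
	for _, opt := range opts {
		opt()
	}
}

// newDecoder builds an XML decoder that uses whatever Init installed.
func newDecoder(r io.Reader) *xml.Decoder {
	d := xml.NewDecoder(r)
	d.CharsetReader = charsetReaderDefault
	return d
}

// latin1Reader stands in for charset.NewReaderLabel in this sketch. It handles
// only ISO-8859-1, where each byte maps directly to the same Unicode code point.
func latin1Reader(label string, input io.Reader) (io.Reader, error) {
	if !strings.EqualFold(label, "iso-8859-1") {
		return nil, fmt.Errorf("unsupported charset %q", label)
	}
	raw, err := io.ReadAll(input)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	for _, b := range raw {
		buf.WriteRune(rune(b))
	}
	return &buf, nil
}

// latin1Doc is an illustrative document; \xe9 is "é" encoded as ISO-8859-1.
const latin1Doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" +
	"<root><device><friendlyName>Caf\xe9</friendlyName></device></root>"

// decodeFriendlyName decodes a made-up, cut-down device description.
func decodeFriendlyName(doc string) (string, error) {
	var r struct {
		FriendlyName string `xml:"device>friendlyName"`
	}
	err := newDecoder(strings.NewReader(doc)).Decode(&r)
	return r.FriendlyName, err
}

func main() {
	// New default: no charset reader installed, so the Latin-1 document fails.
	_, err := decodeFriendlyName(latin1Doc)
	fmt.Println("failed before Init:", err != nil)

	// Opting back in to charset support, as the migration instructions would suggest.
	Init(CharsetReader(latin1Reader))
	name, err := decodeFriendlyName(latin1Doc)
	fmt.Println("after Init:", name, err)
}
```

With the real dependency, the opt-in call would be the one shown in the bullet above, since charset.NewReaderLabel has the same signature as xml.Decoder.CharsetReader.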

I agree that certainly we wouldn't want to break anyone accidentally. I think that approach sounds good, should I implement it and submit a PR?

huin commented

Yep, sounds good. I think there's a possibility of breaking people's clients here, and I don't think that's avoidable without forking the library. But hopefully the chances are low (i.e. hopefully most UPnP servers use only ASCII or UTF-8), and we can provide fix instructions for their code to retain that compatibility.

Thanks! This is now resolved, I'll close it but if this breaks anything feel free to reopen it.