A multipart GraphQL request is sensitive to the default charset of the JVM
hibnico opened this issue · 0 comments
Describe the bug
- a GraphQL multipart request contains non-ASCII characters, encoded in UTF-8 for instance
- the request is handled by a server whose JVM default charset is "US-ASCII"
- the data received in the business logic is then badly decoded.
For instance, "testééé" is received as "test������"
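The symptom can be reproduced outside of any servlet container. The following is a minimal sketch, not the library's code: the class and method names are hypothetical, and it only assumes that the bytes are UTF-8 on the wire and get decoded with US-ASCII.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of graphql-java-servlet.
final class DefaultCharsetDemo {

    /** Encodes the text as UTF-8, then decodes the bytes with the given charset. */
    static String decodeWith(String text, Charset charset) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, charset);
    }

    public static void main(String[] args) {
        // "\u00e9" is 'é'; escapes keep the demo independent of the source file encoding.
        String original = "test\u00e9\u00e9\u00e9";
        // Each 'é' is two UTF-8 bytes, and each of them is invalid in US-ASCII.
        System.out.println(decodeWith(original, StandardCharsets.US_ASCII));
        // Decoding with the charset that was used for encoding round-trips correctly.
        System.out.println(decodeWith(original, StandardCharsets.UTF_8));
    }
}
```

Each "é" takes two bytes in UTF-8, which is why three accented characters turn into six replacement characters.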
Expected behavior
The encoding of the multipart request should be respected.
Additional context
I have found the culprit:
https://github.com/graphql-java-kickstart/graphql-java-servlet/blob/master/graphql-java-servlet/src/main/java/graphql/kickstart/servlet/GraphQLMultipartInvocationInputParser.java#L169
The read function of GraphQLMultipartInvocationInputParser is sensitive to the system default charset.
So what is the correct charset to use? I have searched hard for an answer for multipart requests and haven't found a clear one in the specifications. As far as I can tell, the charset of the request should be used, and if there is none, the default HTTP charset, ISO-8859-1. This is consistent with what I have found in the Tomcat code:
- the parsing of the multipart: https://github.com/apache/tomcat/blob/main/java/org/apache/catalina/connector/Request.java#L2891
- the charset of the request: https://github.com/apache/tomcat/blob/main/java/org/apache/catalina/connector/Request.java#L974
- the default charset: https://github.com/apache/tomcat/blob/main/java/org/apache/coyote/Constants.java#L30
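The resolution rule described above could be sketched as follows. This is only an illustration under assumptions: the class name is hypothetical, the encoding name is assumed to come from something like HttpServletRequest.getCharacterEncoding() (which returns null when the request declares no charset), and falling back to ISO-8859-1 for an unknown charset name is my own choice, not something taken from Tomcat.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of graphql-java-servlet or Tomcat.
final class RequestCharsets {

    /**
     * Returns the charset declared by the request, or ISO-8859-1 when the
     * request declares none. An unknown or illegal name also falls back to
     * ISO-8859-1 (an assumption of this sketch).
     */
    static Charset resolve(String requestEncoding) {
        if (requestEncoding == null || requestEncoding.trim().isEmpty()) {
            return StandardCharsets.ISO_8859_1;
        }
        try {
            return Charset.forName(requestEncoding.trim());
        } catch (IllegalArgumentException e) {
            // Covers both IllegalCharsetNameException and UnsupportedCharsetException.
            return StandardCharsets.ISO_8859_1;
        }
    }
}
```

The JVM default charset never appears in this rule, so the result no longer depends on how the server is started.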
It also tells me that the request stream shouldn't be passed directly as an argument to the various Jackson parsers in the code of GraphQLMultipartInvocationInputParser. The JSON specification expects only UTF-* encodings, so a parser given raw bytes will not honor any other request charset. The stream should be decoded into a string with the charset from the request, then sent to the JSON parser.
cf: https://github.com/FasterXML/jackson-core/blob/2.13/src/main/java/com/fasterxml/jackson/core/JsonFactory.java#L1070
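A rough sketch of the decoding step suggested above, using only the JDK. The class and method names are hypothetical, and this is not a patch against GraphQLMultipartInvocationInputParser; the charset argument is assumed to have been resolved from the request beforehand.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper, not part of graphql-java-servlet.
final class PartReader {

    /**
     * Reads the whole stream and decodes it with an explicit charset,
     * so the JVM default charset is never consulted.
     * Closing the reader also closes the underlying stream.
     */
    static String readAsString(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

The resulting String could then be handed to a String-based Jackson entry point such as ObjectMapper.readValue(String, Class) instead of the InputStream overload, so that Jackson no longer has to guess the encoding from the raw bytes.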