Two examples - shouldn't it sanitize this?

Question

wisechoice opened this issue 10 years ago · 2 comments

Running the two examples below through JsonSanitizer gives the results shown. I was under the impression that the sanitizer should eliminate these types of XSS attacks. Or do you presume that the JSON is broken down into key/value pairs, with input validation and output encoding applied field by field? That would not scale well for performance, so I was hoping JsonSanitizer would handle it.

public void testSVGAttack() throws Exception {
    // Inner quotes are escaped so that this compiles as a Java string literal.
    String json = "{\"test\": \"<svg/onload=alert(/XSS Owned/)>\"}";
    String clean = JsonSanitizer.sanitize(json);
    System.out.println(clean);
}

input: {"test": "<svg/onload=alert(/XSS Owned/)>"}

output: {"test": "<svg/onload=alert(/XSS Owned/)>"}

And then there is this:

String json = "{\"test\": \"MDM%3c%73%43%72%49%70%54%20%74%59%70%45%3d%74%45%78%54%2f"
        + "%76%42%73%43%72%49%70%54%3e%4d%73%67%42%6f%78%28%31%32%38%31%35%29%3c%2f"
        + "%73%43%72%49%70%54%3e\"}";

Again, the output is identical to the input. URL-decoded, the payload is:

{"test": "MDM<sCrIpT tYpE=tExT/vBsCrIpT>MsgBox(12815)</sCrIpT>"}

Answer 1 · 2015-07-03T15:15:20.000Z

https://github.com/OWASP/json-sanitizer#security says

This library only ensures that the JSON string → Javascript object
phase has no side effects and resolves no free variables, and cannot
control how other client side code later interprets the resulting
Javascript object. So if client-side code takes a part of the parsed
data that is controlled by an attacker and passes it back through a
powerful interpreter like eval or innerHTML then that client-side
code might suffer unintended side-effects.

var myValue = eval(sanitizedJsonString);  // safe
var myEmbeddedValue = eval(myValue.foo);  // possibly unsafe

Additionally, sanitizing JSON cannot protect an application from
Confused Deputy attacks

var myValue = JSON.parse(sanitizedJsonString);
addToAdminstratorsGroup(myValue.propertyFromUntrustedSource);
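
As a rough illustration of the point in the quoted section, the sketch below treats a parsed field as untrusted text and HTML-escapes it at the point where it is written into markup. `escapeHtml` is a hypothetical helper written for this example, not part of json-sanitizer. In a browser, assigning the value to `textContent` instead of `innerHTML` has a similar effect.

```javascript
// Hypothetical helper (not part of json-sanitizer): HTML-escape a value
// so that it is rendered as text rather than parsed as markup.
function escapeHtml(text) {
  const replacements = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

// The sanitized JSON from the question parses without side effects...
const sanitizedJsonString = '{"test": "<svg/onload=alert(/XSS Owned/)>"}';
const myValue = JSON.parse(sanitizedJsonString);

// ...but the string still has to be encoded for the context it is used in.
const safeFragment = "<p>" + escapeHtml(myValue.test) + "</p>";
console.log(safeFragment);
```

The encoding step belongs to whatever code renders the value, because only that code knows whether the string is headed for HTML, a URL, a SQL query, or somewhere else.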

Answer 2 · 2015-07-03T15:19:11.000Z

Without schema information about the meaning and provenance of embedded strings, there is no way to filter out embedded payloads without severely restricting where the sanitizer can be used, so we don't even try. For example, an application might generate HTML from a trusted server-side template and send it to the client inside a JSON string; filtering markup out of string values would break that legitimate use.

Extending the sanitizer to take into account schema info could be worthwhile but I haven't done that yet.
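
To make the idea concrete, here is a minimal sketch of what schema-aware handling could look like on the consuming side. Nothing here exists in json-sanitizer: the `schema` object, the `"text"`/`"html"` labels, the field names, and the `encodeForHtml` and `escapeHtml` functions are all assumptions made up for this example.

```javascript
// Hypothetical schema: maps each top-level field to how its value is used.
//   "text" -> untrusted text, HTML-escaped before rendering
//   "html" -> markup produced by a trusted server-side template, left as-is
const schema = { comment: "text", banner: "html" };

// Hypothetical helper: HTML-escape a value so it renders as plain text.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Encode each field according to the schema. Fields the schema does not
// mention are dropped rather than guessed at.
function encodeForHtml(record, fieldKinds) {
  const out = {};
  for (const [name, kind] of Object.entries(fieldKinds)) {
    if (!Object.prototype.hasOwnProperty.call(record, name)) continue;
    out[name] = kind === "html" ? record[name] : escapeHtml(record[name]);
  }
  return out;
}

const record = JSON.parse(
  '{"comment": "<svg/onload=alert(1)>", "banner": "<b>Welcome</b>", "extra": "x"}'
);
const encoded = encodeForHtml(record, schema);
console.log(encoded);
```

The same two strings are both "HTML-looking" at the JSON level; only the schema says that one is attacker-controlled text and the other is trusted markup, which is why a schema-free sanitizer cannot tell them apart.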