Allow for fuzzy patching
goldsam opened this issue · 15 comments
In some applications it is useful to support "fuzzy" patching where it is acceptable for some patch operations to fail. Ideally, such an operation would be able to report all failed patch operations.
@goldsam, can you please provide an example? From what I can tell, the RFCs stipulate that erroneous operations SHOULD terminate processing (see RFC 6902 section 5), whereas the HTTP JSON Patch (see RFC 5789 section 2.2) recommends emitting 409 Conflict responses for similar situations.
While these indicate that we can support what you refer to, there's the other consideration which is: patch ops are applied in order, and are considered atomic (i.e. a failure in any of them should not modify the target document). Under what situations would "ignoring" (or just logging) errors be appropriate or desirable?
My specific use case is an implementation of Neil Fraser's Differential Synchronization algorithm which aims to synchronize a document among a number of remote clients. In short, the idea is take frequent diffs between an actively edited document and a recent "shadow" copy of a shared/synchronized document. Synchronization is achieved through the exchange of versioned diffs. The original work was focused on plain text, but the idea is easy to extend to JSON documents. This algorithm is reliant on a fuzzy patch algorithm.
An implementation is free to discard documents which do not "validate" after applying the fuzzy patch; that is, it is not the patch algorithm's responsibility to ensure correctness or consistency in this application. The idea is that an "fuzzy" patch would make a "best effort" at patching, where the meaning of "best effort" is up for debate.
Implementing support would required extending the core API. I think this might be achievable while maintaining backwards API compatibility. Would you be open to such changes?
Picking this up, this would essentially be providing a "fuzzy processor" instead of the default one, which suggests this could easily be done outside of the mainline codebase (which I feel should accurately reflect the RFC). That being said, any refactoring or convenience that can make this easier is definitely on the table...
@goldsam Still interested? Would appreciate a followup, otherwise I believe we can close this @vishwakarma
A custom processor would be a solution, but this currently can't be done outside of the main codebase because the only static method in JsonPatch
that accepts a JsonPatchProcessor
is private.
Ideally, it would be nice to implement this by leveraging the current behavior of class InPlaceApplyProcessor
. Its seems like I simply need to overriding the void error(Operation forOp, String message)
method. This would imply making the InPlaceApplyProcessor
class a part of the public API of the library.
What do you think about the following proposed changes (in decreasing order of need/desire)?:
- Change class
JsonPatch
by making the methodstatic void process(JsonNode patch, JsonPatchProcessor processor, EnumSet<CompatibilityFlags> flags)
public. - Change class
InPlaceApplyProcessor
to be public, or lift core behavior to abstract base class that is public. - Make the method
void error(Operation forOp, String message)
of this class (or lifted abstract base) public.
I would be happy to make a PR if you are open to such changes.
Eugh. You're absolutely correct, my only concern is that once these interfaces are made public it'll be hard to modify them (as I did in #95), and they'll need some serious documentation.
For my money, the "right" way to go is the first two out of your three wishlist items, and if you want to take a stab at it, I'm all for it!
Without exposing the void error(Operation forOp, String message)
method publicly, there is not a great deal of benefit (that I can see) in exposing class InPlaceApplyProcessor
as a part of the public API. At least, I can't think of any other use cases for adding this class to the public API.
That being said, I think it makes more sense to provide a simple extension point for customizing error handling beyond the default RFC-compliant behavior. To this end, I am proposing a simple strategy interface in #96. This maintains a smaller API surface area as compared to exposing class InPlaceApplyProcessor
and the static void process(JsonNode patch, JsonPatchProcessor processor, EnumSet<CompatibilityFlags> flags)
method of class JsonPatch
.
I'm going back on what I had previously suggested, but I think this is more reasonable. What are your thoughts?
It's a possibility, but I'm not sure it's actually the right strategy. To start with, the API is somewhat limited in what it can offer (e.g. it's possible to write a "ignore test-op failures" strategy, but harder to only do so on specific nodes; it needs more context for error handling). Second, I'm hoping #95 gets merged, in which case I expect a lot of merge conflicts. I actually forked that branch locally and started making things public and adding JavaDocs, and it's very similar to your PR :-)
Overall exposing error handling semantics seems to necessitate even more work than previously. I'd actually prefer to just expose the processor API publicly, especially given the much cleaner API with the new JSON Pointer implementation in #95. @vishwakarma @LouizFC you have any thoughts on this?
Sorry for the delay @holograph .
The changes presented by @goldsam looks good, but I also think that it is not the right way to handle it.
I am also a bit reluctant to expose the Processor API, because it is the "core" of the library. While it provides users with much more freedom and flexibility, it limit us on the kind of change we can make (for example, if exposed earlier, #95 would not be possible without some internal delegation).
I will think about it a bit more and see if I can provide a more useful answer
Another idea - The set of JSON Patch operations is finite and small. You could introduce an JsonPatchErrorHandler
API with a method for each operation where all arguments would be elements of the current public API. This would be flexible while not exposing any existing elements of the current internal implementation. Maybe something like this:
public interface JsonPatchErrorHandler {
void handleRemoveError(List<String> path, String errorMessage);
void handleReplaceError(List<String> path, JsonNode value, String errorMessage);
void handlAddError(List<String> path, JsonNode value, String errorMessage);
void handleMoveError(List<String> fromPath, List<String> toPath, String errorMessage);
void handleCopyError(List<String> fromPath, List<String> toPath, String errorMessage);
void handleTestError(List<String> path, JsonNode value, String errorMessage);
}
@goldsam this introduces some boilerplate to users that want to implement it (but again, this would be a "power user" feature), but seems usable.
@holograph what do you think? I think the best way to handle this would be:
- Wait for #95 to be merged.
- Make JsonPatch instantiable (and immutable, maybe with a builder or similar mechanism), so we do not need to polute JsonPatch#apply
Another idea to the pile:
- Maintain the first API (only one method for error handling)
- Rename Diff.java to
Patch
, make it public, but maintaining everything else internal / package-private. - Bundle the patch information into the
Patch
object (only when a error occurs), and pass it to the handler, making the signature be something likehandlePatchError(Operation, Patch, Message)
- Make a read-only public API that does not expose the new
JsonPointer
(just yet), onlyString
s.
I've got one more wrench to throw into the mix:
When building an application that modifies a document explicitly (such as mine does), such modifications could be expressed as patch operations. This use case would reduce the complexity of building a patch to constant time without the need for searching or comparing (keep in mind my use case is a real-time collaborative editor). However, it would require API support for directly expressing those modifications. This would be an argument for exposing JsonPatchProcessor
as part of the public API.
Any new thoughts on this?