Content: - What does XmlDiff do - How to run XmlDiff - XmlDiff's algorithm - How to interpret the results XmlDiff's output file xmlDiff.xml - xmlDiff.xml example - mandatoryTags.xml example ________________________________________________________________________________ What does XmlDiff do: Output: xmlDiff.xml Input: old.xml new.xml mandatoryTags.xml - mandatoryTags.xml is optional, and specifies the tags that are mandatory and therefor should be kept in the output xml file even if not changed. - xmlDiff.xml contains the difference between the first two input xml files and the mandatory tags specified in the third input xml file (eg: mandatoryTags.xml) if given. - the xml files must contain at least one empty tag(the root tag) to pass the org.w3c.dom validations used by XmlDiff. Also attributes must have unique names within the same tag etc. on short they must be valid xml files. ________________________________________________________________________________ How to run XmlDiff: - Give your user execution right over the jar file chmod u+x xmlDiff.jar - Run the jar in release or debug mode (mandatoryTags.xml is optional): Release mode: java -jar xmlDiff.jar old.xml new.xml mandatoryTags.xml Debug mode: java -DxmlDiff.isDebugBuild=true -jar xmlDiff.jar old.xml new.xml mandatoryTags.xml ________________________________________________________________________________ XmlDiff's algorithm: Compare each tag and its attributes from the first xml file with each tag with the same name, and its attributes, from the second xml file (exception: the first/root tags, the contents of which are compared even if their names have changed). In this comparison is build the list with the matched tags(mandatory, changed and not changed) in a descending order of their content's matching percentage. The descending order is used to ensure that the highest matching percentages are matched/chosen first. Then half matches are marked as new or deleted, and the remaining tags will be deleted. A tag is kept in the output xml file only if: - it or its child tags got changed, and will contain only the changes and their parents - it is mandatory, in which case all its child tags will be kept even if they did not changed nor are mandatory - it contains mandatory child tags, in which case only its mandatory child tags and their parents will be kept even if no other change was done - is new/deleted, in which case all its child tags are kept as well, but nothing is marked as mandatory even if they are, as anyway its whole content will be put in xmlDiff.xml ________________________________________________________________________________ How to interpret the results of XmlDiff's output file xmlDiff.xml: Changes are marked with the following values: - S = same/no change (only in debug mode) - C = changed - N = new/added - D = deleted Other marks used in xmlDiff.xml for specifying why the tag was kept in the output file: - mod = says the type of MODification received by the tag or its content - mod_attrName = says how the attribute called "attrName" got changed - mand = says if the current tag(NOT its kids) is MANDatory - mod_val = says if the value of a tag got replaced or replaces the previous child tags of its tag. When it doesn't replaces nor is replaced by other tags, it's change is reflected in the "mod" mark described above. Valid values are D(deleted) and N(new). ________________________________________________________________________________ xmlDiff.xml example: <?xml version="1.0" encoding="UTF-8"?> <book2 mod="C" myBook="yess" mod_myBook="C"> <person mod="C" h="5" mod_h="C"> <first mod="C" mand="y" ghhh="hey" mod_ghhh="D" hi="6" mod_hi="D" id="789" mod_id="D"> Bill </first> <last mod="C"> Paii </last> <first mod="N" mand="y"> William </first> </person> <animal mod="D" hi="4"> <first j="fg"> Fifiaaaaa </first> </animal> <person mod="C"> <first mand="y"> Bill </first> <last mod="D"> Gates </last> <age mod="C" h="6" mod_h="N"> 22 </age> </person> <animal mod="D"> <name> Haha </name> </animal> <person mod="C"> Heidi (mod_val="N") <first mod="D" mand="y"> Steve </first> <last mod="D"> Jobs </last> <age mod="D"> 40 </age> </person> <person mod="C"> Elena (mod_val="D") <first mod="N" mand="y"> Elena </first> <last mod="N"> Florea </last> </person> <person mod="C"> <first mand="y"> Alina </first> <first mod="N" mand="y"> Ioana </first> <age mod="N"> 28 </age> <kid mod="C"> <name mod="C" o="89" mod_o="D" u="hihi" mod_u="D"> Alin </name> <expertise mod="N"> robotics </expertise> <age mod="D"> 2 </age> </kid> </person> <person mod="D"> </person> <animals mod="N" h="5"> <animal h="4"> <name> Fifi </name> </animal> <animal> <name> Fifi2 </name> </animal> </animals> <animals mod="N"> <animal> <name> Fifi2 </name> </animal> </animals> <animals mod="N"></animals> <lina mod="C"> </lina> </book2> ________________________________________________________________________________ mandatoryTags.xml examples: - Example 1: Here there is only one mandatory tag named "first". The root tag is ignored so it can be named anything you wish. <mandatoryTags> <first> </first> </mandatoryTags> - Example 2: Here there are no mandatory tags only the root tag which is ignored. At least a root tag must be specified for org.w3c.dom to be able to parse a valid xml file. If there are no mandatory tags then just don't pass a third argument to xmlDiff.jar, as mandatoryTags.xml is an optional argument! <mandatoryTags> </mandatoryTags> OR <?xml version="1.0" encoding="UTF-8"?> <book2> </book2>