Ditto

Generic Metamorphic/Substitution Engine

Primary Language: C++

Not even in Alpha

This project is still under development, and not all of the features described may work.
No binaries will be provided until it reaches a stable-ish point where you can safely assume that any resulting binary reproduces the exact behavior of the original, unless unsafe options are explicitly used.
The design and features can still change at any given time.

What it is, what it does

It takes an executable file as input, tries to rewrite and reorder it at a higher level, then outputs a file that produces the same observable behavior but whose content might look completely different in, for example, a hex editor.
Ditto is a standalone executable and does not need to be integrated into a project to work. If embedded in a self-replicating program, however, it can effectively produce a different 'generation' on every iteration.

(Planned) features

Simple substitutions, adding or removing no-ops, semi-randomization of the metadata, and reordering of small blocks of instructions are the fastest and simplest operations.
Reordering of the jump/call flow, register shuffling, ROP substitutions, and merging, reordering and obfuscation of data are all more complex: they require a complete analysis of the code, assisted by relocations, because any error in the analysis could completely break the output.
It is also possible to encrypt sections and generate a polymorphic decryption routine that replaces the entry point, but this is optional and not recommended.
The combination of those options should be able to 'randomize' most parts of a binary.
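
To make the simplest of these operations concrete, here is a minimal sketch of removing and re-adding no-ops. It works on a toy instruction list rather than real x86 bytes, and the names `Insn`, `rerollPadding` and `effective` are hypothetical and not taken from Ditto's source.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Hypothetical, simplified instruction record (not Ditto's real type).
struct Insn {
    std::string mnemonic;  // e.g. "mov", "add", "nop"
    bool is_padding;       // true when the instruction has no observable effect
};

// Drop existing padding, then insert a no-op after each remaining
// instruction with the given probability. The generator is seeded so a
// run can be repeated with the same seed.
std::vector<Insn> rerollPadding(const std::vector<Insn>& in,
                                std::uint32_t seed, double probability) {
    assert(probability >= 0.0 && probability <= 1.0);
    std::mt19937 rng(seed);
    std::bernoulli_distribution insert(probability);
    std::vector<Insn> out;
    for (const Insn& insn : in) {
        if (insn.is_padding) continue;  // remove old no-ops
        out.push_back(insn);
        if (insert(rng)) out.push_back(Insn{"nop", true});  // maybe add a new one
    }
    return out;
}

// The mnemonics that actually matter; a padding pass must leave these unchanged.
std::vector<std::string> effective(const std::vector<Insn>& code) {
    std::vector<std::string> names;
    for (const Insn& insn : code)
        if (!insn.is_padding) names.push_back(insn.mnemonic);
    return names;
}
```

Comparing `effective(input)` with `effective(output)` is a cheap sanity check for this kind of pass. In a real binary, padding also shifts every following address, which is why even this simple transform depends on branches and references being tracked correctly.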

How does it work?

Ditto will make use of relocations to assist the analysis, and will even require a .reloc section for the more advanced options.
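
As a rough illustration of why relocations are useful here: in a PE file, the base relocation table lists the locations that hold absolute addresses, so a tool knows exactly which values must change when code or data moves. The sketch below assumes 32-bit little-endian addresses and uses raw buffer offsets instead of parsing a real .reloc section; `Bytes`, `readLe32`, `writeLe32` and `applyDelta` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Read a 32-bit little-endian value at `off`.
std::uint32_t readLe32(const Bytes& image, std::size_t off) {
    assert(image.size() >= 4 && off <= image.size() - 4);
    return static_cast<std::uint32_t>(image[off]) |
           static_cast<std::uint32_t>(image[off + 1]) << 8 |
           static_cast<std::uint32_t>(image[off + 2]) << 16 |
           static_cast<std::uint32_t>(image[off + 3]) << 24;
}

// Write a 32-bit little-endian value at `off`.
void writeLe32(Bytes& image, std::size_t off, std::uint32_t value) {
    assert(image.size() >= 4 && off <= image.size() - 4);
    for (std::size_t i = 0; i < 4; ++i)
        image[off + i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Add `delta` to the absolute address stored at each relocation offset.
// Returns false, leaving the image untouched, if any offset is out of bounds.
bool applyDelta(Bytes& image, const std::vector<std::size_t>& relocOffsets,
                std::uint32_t delta) {
    for (std::size_t off : relocOffsets)
        if (image.size() < 4 || off > image.size() - 4) return false;
    for (std::size_t off : relocOffsets)
        writeLe32(image, off, readLe32(image, off) + delta);
    return true;
}
```

Without the list of offsets, a 4-byte value that happens to look like an address cannot be told apart from plain data, which is presumably why the more advanced options insist on a .reloc section.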
A built-in disassembler with support for most of the x86 instruction set, including x87 and the various extensions (SSE, SSE2, MMX, etc.), is first used to separate code into instructions and mark the known references to data.
Optional analysis passes then build data structures containing the map of branches with their destinations, cross-references, jump flow, higher-level data structures, and known independent blocks of code.
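
One common way to find independent blocks of code is the textbook basic-block "leader" rule, sketched below on a toy instruction list. This is an assumption about the general technique, not a description of Ditto's actual passes, and `Insn` and `findBlockLeaders` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical, simplified instruction record (not Ditto's real type).
struct Insn {
    std::size_t address;  // where the instruction starts
    bool is_branch;       // jump or conditional jump
    std::size_t target;   // destination address, meaningful only for branches
};

// Returns the addresses that start a block: the first instruction, every
// branch destination, and every instruction that directly follows a branch.
std::set<std::size_t> findBlockLeaders(const std::vector<Insn>& code) {
    std::set<std::size_t> leaders;
    if (code.empty()) return leaders;

    // Collect known instruction starts so bad targets are caught early.
    std::set<std::size_t> starts;
    for (const Insn& insn : code) starts.insert(insn.address);

    leaders.insert(code.front().address);
    for (std::size_t i = 0; i < code.size(); ++i) {
        if (!code[i].is_branch) continue;
        // Assumption: every branch lands on a known instruction start.
        assert(starts.count(code[i].target) == 1);
        leaders.insert(code[i].target);
        if (i + 1 < code.size()) leaders.insert(code[i + 1].address);
    }
    return leaders;
}
```

Each leader begins a run of instructions that is only ever entered at its first instruction, which is what makes such a run a candidate for being moved as a unit.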
The optional transforms then work successively on the higher-level representations of the binary.
Finally, the virtual image and then the raw image are rebuilt from the instructions and metadata, and the result is written to the output file.
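
The rebuild step can be pictured with a small sketch: if branches refer to their targets symbolically (here by a stable id) instead of by address, addresses can simply be recomputed after any transform. The types and names below (`Insn`, `layout`) are hypothetical and only illustrate the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical, simplified instruction record (not Ditto's real type).
struct Insn {
    int id;            // stable identifier that survives reordering
    std::size_t size;  // encoded length in bytes
    int branch_to;     // id of the target instruction, or -1 if not a branch
};

// Lay the instructions out back to back starting at `base` and return the
// address assigned to each id.
std::map<int, std::size_t> layout(const std::vector<Insn>& code,
                                  std::size_t base) {
    std::map<int, std::size_t> address;
    std::size_t cursor = base;
    for (const Insn& insn : code) {
        assert(address.count(insn.id) == 0);  // ids must be unique
        address[insn.id] = cursor;
        cursor += insn.size;
    }
    return address;
}
```

After inserting or moving instructions, calling `layout` again and looking up `address.at(insn.branch_to)` gives the new destination of each branch. The sketch ignores one real x86 complication: a branch encoding can change size when its distance changes, so an actual rebuild may have to repeat the layout until the sizes settle.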

But why?

I wrote this mostly to learn more; self-modifying code and metamorphism are also interesting concepts in their own right.