dolphinsmalltalk/DolphinVM

Replace DolphinSureCrytoLib SHA1 implementation with calls to native Windows BCrypt (CNG) API

Closed this issue · 11 comments

Dolphin now has it's own SHA1 algorithm, implemented in DolphinSureCryptoLib.dll.
But SHA is also available standard in Windows, since Vista with the CNG Crypto API.
When moving to CNG, Dolphin's code size will be reduced.
It will also pave the way for migrating to SHA256, since SHA1 is not deemed secure anymore.

I've created a C++ wrapper class around the CNG API for SHA1 use by Dolphin only,
and routed the CryptoLib exported functions to it. It works! :)

But... I've found that Dolphin's current SHA1 hashed byte ordering is different from the BCrypt output.
Dolphin orders bytes in reverse per 4 bytes (per DWORD).
I checked a sample on the site www.sha1-online.com, and the BCrypt ordering seems to be right.
So this will be a fix too.

--- Move it to the VM?
Ideally, I would like to remove the whole DolphinSureCrypto.dll from the solution and move the small wrapper class to the VM. That would reduce complexity even more and also increase startup performance since the DLL is always loaded on startup.

But then, how to call the functionality?
There is still some room in the bytecodes to make new primitives for this.

As an alternative, I've created a singe dispatching primitive called "VMCall",
that dispatches calls to C(++) functions in the VM, using the first parameter as a function index.
The function index uses a lookup table with the compiled function addresses, so performance is OK.
With this solution an unlimited number of functions can be added to the VM,
without using up more precious bytecode space, to be used for more often called primitives.

--
So, @blairmcg, whaddaya think?
ZLib could be next...

I think replacing DolphinSureCrypto by making use of the various Windows crypto APIs makes good sense. It's day has passed. However, I don't see the need for a C++ wrapper class nor to expose primitives to invoke those APIs. One of Dolphin's original design objectives was that it should provide very complete and performant FFI capabilities, and it was also a design principle not to put anything in the VM other than machinery for executing Smalltalk code and general purpose support for external interfacing. It specifically isn't trying to be an abstraction layer away from the OS or its APIs. All of that is intended to be up in the Smalltalk code. I can't imagine there is anything about the bcrypt API that means it cannot be called as an ExternalLibrary .

As for the general mechanism to call functions in the VM, that isn't really needed either. Dolphin's own FFI calls can be used to invoke services in the VM that are exported from it as normal function exports. This should be used sparingly, since its rarely needed, but if you look at the .def files you'll see that there are a few examples in the normal VM DLL, and in the composite form "to go" stubs these statically bind the CRT and then export some of the CRT functions so that they can be called through the FFI in the same way as in the development system.

So in summary: Yes, removing the need for DolphinSureCrypto is a good idea. Doing this by implementing new functionality in the VM probably isn't the right way to do it. Sorry.

You're right that the wrapper class isn't strictly needed. It was kind of convenient because the CNG API is fairly wide, and the DolphinSure lib interface is very small. I'll go try and make a CNG external library in Dolphin. Didn't do that excercise yet... :)

(PS The VMCall option was introduced for other restructuring reasons, but more on that later.)

I'm more reacting to the suggestion of putting the crypto wrapper in the VM. The change to DolphinSureCrypto to use the Windows implementation seems worthwhile by itself, even if the longer term aim is to get rid of it altogether. Why not send a PR for that as a first increment?

Implementing large external APIs can be time consuming and error prone. You can either just add small parts by hand (which isn't always possible), or I generally create a type library using the header files and some editor macros and manual editing to convert them into well-annotated IDL. Then build it into a typelib, and import that into Dolphin with the Active-X component wizard. It will generate good wrappers for modules of functions, and associated structures, especially if the original IDL is rich enough. I have some example projects such as blairmcg/WinHttpServerTlb. Although there is a bit of start up cost in creating the project and adding the boilerplate IDL, once that is done you can add the functions and structures you need one by one, and iteratively add/enrich and regenerate.

Okay, I will create a PR for the current change in DolphinSureCrypto.
That will me my first real code addition to Dolphin then. Also a good excercise. :)
First, I need to create some regression tests for it.
Note that the byte order of generated hashes will be different (corrected),
conforming to BCrypt and f.e. this site: www.sha1-online.com.
Could that create a compatibility issue?

I am also underway in making it work from Smalltalk itself.
So with a new ExternalLibrary, without the C++ wrapper class.
I'll do that first, just because it's fun to my first work with the FFI. :)
(I do find that debugging options are limited with Windows system code being executed directly form the interpreter written in assembly).

The BCrypt API has 30 functions.
For SHA1 use by Dolphin, only 6 are needed, and they are called with limited options (only SHA1).
I'm happy to add them all by hand, which seems well possible.
After a quick browse I see only handles, bytes and integers, so data types already known to the FFI.
But I'm not planning to create tests for the full BCrypt functionality.
So, can we live with untested API calls?
Or, alternatively, live with a partialy implemented API, fit for current use?

Hazaa!
I'm happy to report that the ExternalLibrary version of BCrypt based SHA1 is also working.
So with just the functions implemented that are needed for Dolphin SHA1 usage.
It's not yet used by the class SecureHashAlgorithm, but that's just plumbing.

I'll go and make a PR for the C++ version first then.
But the rest of this Sunday is needed for the weektime job, alas... .
Had a good time anyways :).

Okay, I've updated the title of this issue to replace the custem SHA1 with Windows BCrypt, within the DolphinSureCryptoLib (not replacing yet).

For my commit in a few moments:
I found out that the DolphinSure DLL has a nonstandard hash output on purpose (reversed per 4 bytes). This was done to allow the Smalltalk side to convert the hash to a large integer per DWORD in stead of per byte. Slightly faster.

For clarity and compatibility, and since the method is not called often, I implemented the correct hash output in the DolphinSure DLL (unaltered BCrypt output) and changed the method SecureHashAlgorith>>finalHash to expect this when creating the large hash integer from the hash bytes.

For the solution to work, the VM and the Smalltalk side have to be updated together.
.
(That reminds me: An intergated repository would be nice indeed. Ill play with it).

Umm, okay, version 2 of the change is now ready for release now.
.
This implements SHA1 using Windows BCrypt in Dolphin in a new ExternalLibrary class: BCryptLibrary, replacing DolphinSureCryptoLib, and slightly modifying the class SecureHashAlgorithm.
In the VM, the DolphinSureCrypto DLL subproject can simply be removed from the solution because its no longer needed and nothing else depends on it.
This has been tested with a rebuild of the DPRO image, and running the regression tests.
.
Committing the VM change will probably break the CI because the crypto DLL is no longer part of the build. Some config changes need to be made, that I cannot do. (Also the Dolphin Windows setup package needs one file less..)
.
How to proceed now?
Cancel the existing pull requests, or still do them first?
Commit via "submodule" procedure, or simply do 2 separate commits?
(The Dolphin commit can be done first, without breaking anything.
The crypto DLL will still exist but will simply not be used.)

Its usually a good idea to move forward incrementally, and the existing one is ready to go if you just update it to the commit I made, so you may as well complete the existing one.

I don't think there are any config changes you cannot make, since the build is described by the AppVeyor.yml file now (a change I made to support the different build configurations for 7.0 and 7.1). I don't think it will be hard to update the setup to remove something, but that can be done independently.

Any path that preserves green CI builds is acceptable.

Thank 2x. I'll follow this path and that of your other comment.
I'll need to build up my Git knowledge first in a day or 2...
But indeed best to have that behind me. ;-)

Just reporting that I've been busy at work and haven't gotten around to this.
But per June 1st (last Friday) my IT department has been split into two, per my own request,
with me heading only one of the parts from now on. The part with the most devs of course. :)
So having some more private dev time during the week and in weekends should be possible now!

This issue was moved to dolphinsmalltalk/Dolphin#537