imazen/imageflow

C# Bindings/API

Closed this issue ยท 120 comments

I wanted to start the conversation on the C# Bindings/API

Primarily at this point that means designing the API surface.

Hi!

Two things:

  1. All flow_graph_* and flow_node_* API surfaces will be replaced by a single JSON api. This will cut the API surface down drastically.
  2. There's a basic ruby wrapper here: https://github.com/imazen/imageflow/tree/master/wrappers/ruby/lib

Here's the subset of the API that would be used by the C# bindings. (in ruby FFI syntax). Check imageflow.h for param names.

  • attach_function :flow_context_create, [], :pointer
  • attach_function :flow_context_destroy, [:pointer], :void
  • attach_function :flow_context_error_and_stacktrace, [:pointer, :pointer, :uint, :bool], :int64
  • attach_function :flow_context_has_error, [:pointer], :bool
  • attach_function :flow_context_error_reason, [:pointer], :int
  • attach_function :flow_context_clear_error, [:pointer], :void
  • attach_function :flow_io_create_for_file, [:pointer, :flow_io_mode, :string, :pointer], :pointer
  • attach_function :flow_io_create_from_memory, [:pointer, :flow_io_mode, :pointer, :uint64, :pointer, :pointer], :pointer
  • attach_function :flow_io_create_for_output_buffer, [:pointer, :pointer], :pointer
  • attach_function :flow_io_get_output_buffer, [:pointer, :pointer, :pointer, :pointer], :bool
  • attach_function :flow_job_add_io, [:pointer, :pointer, :pointer, :int32, :flow_direction], :bool
  • attach_function :flow_job_create, [:pointer], :pointer
  • attach_function :flow_job_destroy, [:pointer, :pointer], :bool
  • attach_function :flow_job_configure_recording, [:pointer, :pointer, :bool, :bool, :bool, :bool, :bool], :bool
  • attach_function :flow_job_execute, [:pointer, :pointer, :pointer], :bool, blocking: true
  • attach_function :flow_job_get_decoder_info, [:pointer, :pointer, :int32, :pointer], :pointer, blocking: true

Note that that the last two API functions will be refactored to use JSON instead of structures. This refactoring hasn't taken place yet.

Hopefully this helps provide a sense of the (limited) scope of work required. We have no documentation right now about the JSON schema that will be used, but hope to publish that soon.

FFI API docs are now online. The interface should be pretty stable, but there is a slight chance that memory ownership semantics will be forced to change as more of the core is ported to rust.

These are uploaded as part of CI, and should be updated daily.

https://s3-us-west-1.amazonaws.com/imageflow-nightlies/master/doc/imageflow_core/abi/index.html

Ideally we won't see FFI breakage, but rather spend more time refining the JSON message API, which should be much easier to debug.

@nathanaeljones trying to view the index gives me a 403

I fixed the link, sorry! Did some reorganization. There are also windows binaries of libimageflowrs.dll available, although the JSON message API only responds to "teapot" at the moment.

https://ci.appveyor.com/project/imazen/imageflow/build/1.0.307/job/6edn741gtjnouyk9/artifacts

Let me know if I can improve the clarity of the API docs. There's not much to document on imageflow_send_json yet, but those schemas should be coming this week.

@SamuelEnglard, will #81 be problematic for the C# API? I don't know that you'll need to use the error reporting API until we get to implementing custom I/O wrappers in C# for, say, Stream classes, at which point catch{} and serialize will be the pattern for all function delegates/pointers involved. I/O error reporting isn't great in general, so it kinda depends on where you want to set that bar.

#81 will be an issue in that there's just no way to not transfer the C string memory to the GC's control. The best we could do would be to have C# allocate the error message memory and we fill it.

New docs, small API changes, but much higher confidence level: https://s3-us-west-1.amazonaws.com/imageflow-nightlies/master/doc/imageflowrs/index.html

Ruby bindings have been updated and are passing all tests again. Over 60% of bugs were discrepancies between camelCasing and snake_casing.

I'd try to target as low as I can, so hopefully 1.3.

mms- commented

If you can use help testing the c# wrapper let me know!

BTW, I'm happy to write the P/Invokes if that would help. It's reasonably trivial for me - the non-trivial bits are tooling/packaging/CI/release/scripts due to the cognitive space and context switching required.

If you want, the low-level bindings (and perhaps JSON (de)serialization schema/classes) could even go in this repo, under bindings/dotnet, so that they can be tested against each release. (Can the .NET Core tooling fit in 100MB? I could install it on the docker containers.)

My thinking/hoping is to have the P/Invoke and JSON (de)serialization code generated by T4 or Scripty. That way as you make changes they are updated with no issue.

The Core tooling should, really depends on the size of our dependancies

P/Invoke is easier by hand. I've invested incredible amounts of time into trying to automate that stuff - CppSharp, SWIG, and at least six others. There's just too much loss of information to make it work.

As far as JSON serialization, the easiest path would probably be to find something that works similar to Serde.rs and duplicate the types. Serde can't generate schemas or Swagger or anything yet, so my instinct would be to set up roundtrip tests between C# and Rust to ensure nothing is lost.

I only plan to automate the functions themselves. The structs and such I'd write by hand.

I plan to use JSON.NET under the hood, but for the best performance I don't want to use the reflection API.

I only plan to automate the functions themselves. The structs and such I'd write by hand.

Do you need me to make a C header file? What's the expected input and tool?

I plan to use JSON.NET under the hood, but for the best performance I don't want to use the reflection API.

Fair enough. I don't think reflection can handle discriminated unions in C# anyway (just F#) - so code generation might come in handy there.

I was overthinking this. C# needs to do very little parsing. Primarily, it needs to serialize - which is easier.

I could expose a 'mirror' API in libimageflow that just parses and re-serializes. If we disable whitespace and have the same rules for empty field omission, then it should be easy to compare the strings and verify that Rust is actually seeing everything that C# is sending. Thoughts?

Do you need me to make a C header file? What's the expected input and tool?

A C or even C++ header would be easiest to work with. The easiest tool to work with would be the P/Onvoke Interop Assistant. It's not perfect in its generation but put its output is a good starting point.

Fair enough. I don't think reflection can handle discriminated unions in C# anyway (just F#) - so code generation might come in handy there.

Not sure myself so a good point.

Here's a C header: https://gist.github.com/nathanaeljones/74f58b2d3aee3831eb3efdd316db77e0

All 4 structs are opaque. Keeping track of 4 kinds of pointers isn't hard, but I discovered today that P/Invoke supports typed pointers via unsafe struct. http://www.codeproject.com/Articles/339290/PInvoke-pointer-safety-Replacing-IntPtr-with-unsaf

Also, good reference on GC/Pinvoke race conditions: https://blogs.msdn.microsoft.com/bclteam/2005/03/15/safehandle-a-reliability-case-study-brian-grunkemeyer/ I like GC.KeepAlive() for it's explicit handling of the issue.

I don't have the interop assistant handy - can you let me know what that header file produces?

I personally create structs that wrap the pointers usually, since it removes the need for unsafe and can have helper code.

I had to modify the header file slightly for the tool to be happy. Two issues where that I didn't have the imports for the int types (so I just stubbed them) and that just doing empty struct declarations is not enough for it (so I just typedef-ed them as void*). The modified header and output are here

It looks like we're getting "ref IntPtr". Any idea why? Typedef? I can fully erase the types if you want.

I made the structs void instead of void* and that got fixed. Updated the gist

I've added automatic header generation as part of the build process. Here's one with types erased:

https://gist.github.com/nathanaeljones/023fe6f3e271d6e72cf0b8c3de5013b2

As I'm going through the C header I have some questions.

Why do you use uint8_t in some locations for strings and in others char, what's the logic here? Trying to figure out how to marshal and how to deal with on the C# side.

Another question: What will be the name of the native library? The file.

imageflow.dll on windows, libimageflow.so and libimageflow.dylib
elsewhere. Dllimport "imageflow" should work.

The latest AppVeyor build should have a working imageflow.dll

Thanks. And my other question?

It should only be relevant for C users; treat all strings and buffers as
unsigned byte arrays.

Ok. And strings are encoded as?

UTF-8/ASCII - everywhere except for imageflow_io_create_for_file, where filename needs to be the operating system code page used for fopen.

I'm writing some clarifications on strings. i8/u8/libc::c_uint8_t/libc::c_int8_t/libc::c_char are all signature compatible, but I didn't offer enough detail.

Unless there's a corresponding length, assume null-termination for strings in either direction. imageflow_context_error_and_stacktrace is the only function that both returns the length and writes a null-character/

Also, Happy Thanksgiving!

Also, Happy Thanksgiving!

Thanks, you too!

New ABI docs are uploading. c_char should now be used more consistently for null-terminated strings, u8 for length-only strings. UTF-8 except for the OS filename in imageflow_io_create_for_file.

Thanks. Makes that much easier to understand. Doing length only will be a pain in C# but can be done

Are all jobs blocking?

Ok, simple enough.

JobIO is a struct based on the Header but docs say it's an enum?

The name is currently overloaded; it's an opaque pointer.

That works

(Also, Rust enums are unions, not c enums)

As the Header has them as enums I can treat them as such?

Or at least document "correction"

Much clearer! Thanks.

May I suggest removing the C#/.NET references for how things work? While great for me, developers writing bindings for other languages might not get them and be confused.

I will definitely do that when we get other languages going.

I've just finished getting imageflow_tool.exe into a 'playable' state. Latest artifacts at https://ci.appveyor.com/project/imazen/imageflow

imageflow_tool examples --export-build-examples=all will create an examples folder with some scripts to run.

Would you mind giving that a look? I've written the last 4kloc on < 4hrs sleep, and would like another set of eyes on the tool before I publish the update.

I'll give it a try ASAP

Usability feedback also good. I.e, I should probably change examples --export-build-examples=all to examples --generate, eh?

Yeah

I'm setting up testing of bindings (where sufficiently quick) under the main CI. I'd like to figure out if C# would work in that env. Do you have a stub repo/empty project demoing the tools you'd need?

I'm going to put the latest ubuntu package - dotnet-dev-1.0.0-preview2.1-003177 in the docker instances; they're a pain to change, and it should work. It's about 150mb...

Been busy so I haven't had the time to work on this the past week. I should push my branch this weekend, though it's mostly pure interop code right now.

Is there (or will there be) a JSON schema for the JSON requests/responses?

I'm noticing things moving from imageflow to imageflow_core...?

Right now I don't have an automated way to generate a schema; the generated examples should always be correct, though.

What do you mean re: imageflow -> imageflow_core? I'm still porting C from imageflow_c to imageflow_core, but this shouldn't change anything ABI-wise - the ABI is already Rust.

Right now I don't have an automated way to generate a schema; the generated examples should always be correct, though.

I'll generate from the examples then for now.

What do you mean re: imageflow -> imageflow_core? I'm still porting C from imageflow_c to imageflow_core, but this shouldn't change anything ABI-wise - the ABI is already Rust.

https://s3-us-west-1.amazonaws.com/imageflow-nightlies/master/doc/imageflow/struct.Job.html moved to https://s3-us-west-1.amazonaws.com/imageflow-nightlies/master/doc/imageflow_core/struct.Job.html

Yeah, Job became a native Rust object (no longer a C thing). Still an opaque pointer. That's actually not why the docs changed; that was because I forgot something when I did that refactor (all re-exported types from a crate should show up in docs; they are duplicated between crates. Fixing..

Ah, ok. Just trying to make sure I'm replicating the API correctly :)

Thanks! I've merged it to master. (the dotnet branch is 100+ commits diverged).

I'd like to contribute some unit and integration tests. Do you like xUnit?

Apologies that I didn't have .gitattributes and .editorconfig set up for .xproj and .json files; I've fixed that now. You may have to delete and re-clone (or temporarily delete bindings/dotnet) from your working directory to avoid bugs in older versions of Git.

After that there shouldn't be any more line-ending pain. If you have the EditorConfig plugin for your text editor(s)/ides, there shouldn't be any churn whatsoever in your local diffs, either.

I have no issues with xUnit, though unit testing right now will be hard since there is very little functional code. It's mostly just pure P/Invoke code.

And no issue. I'll install the plugin.

How do we want to handle .NET documentation?

I've used Doxygen, Docu, and worked on a few different tools to try to blend generated API docs and prose well. Nothing really came to fruition. If you have a favorite tool, let's use that.

My only requirement would be that we can generate and upload from CI - so it would need a command-line tool.

It looks like Microsoft has moved to using DocFX: https://dotnet.github.io/docfx/ for ASP.NET Core, at least: https://github.com/aspnet/Docs/blob/master/CONTRIBUTING.md

I haven't used it, but it seems less painful that others I've worked with.

Could be bias on my part but SandCastle seems way easier

It's officially abandonware, but if you're willing to maintain it on CI, then I'm okay with it. Can we keep the run times under 2 minutes? Running it on ImageResizer was about 45 mins/build, for about 20kloc (at the time). Roughly 60 seconds per class. I'd hope it's faster now?

From Microsoft yes, but the community has picked it up https://github.com/EWSoftware/SHFB

Pushed to the dotnet branch an update that should build documentation using SandCastle

Could you rebase against master? dotnet is 175 commits behind. I've merged previous commits into master, but I didn't want to overwrite your branch.

Will do when I get home from work!

Done, though good lord did git fight me!

I did pull but git kept seg faulting!

ah

I have some info on packaging. It looks like Nuget 3.5 offers runtimes support.

I imagine we would set up a separate repository like https://github.com/natemcmaster/libsqlite3-package to fetch and combine the binaries.

Related:
http://www.natemcmaster.com/blog/2016/05/19/nuget3-rid-graph/
aspnet/dnx#402

If the SQLlite package works smoothly on .NET Core and .NET 4.6 in our testing (VS 2015.3 and command line), then I'd say we can just do what they're doing.

Caveat: Debug symbols (10-22mb+), but necessary for stack traces. On Windows, these are separate (.pdb), and we could probably publish a separate package for them, but on linux/osx, they're currently combined.

I'm thinking of removing the dotnet branch and just using branches for each change?

Will do!

Pushed how I'm thinking of parsing IoEnum. Thoughts?

It looks good to me. I really couldn't offer you any advice as far as performance goes without running profiling on it, and I'm not familiar with the latest JSON serialization techniques. You're definitely more qualified here.

I'll keep an eye on what you push and let you know if I see anything that looks problematic given current or future API plans.

Was more worried about my understand of the JSON format than performance.

Yeah, I saw. Two people brought us up in the comments

VS17 RTM and .Net Core Tools 1.0 are already released so maybe its time to migrate wrapper to MSBuild and Csproj?

VS17 RTM and .Net Core Tools 1.0 are already released so maybe its time to migrate wrapper to MSBuild and Csproj?

That's my weekend plan!

Woot! This is still under development: https://github.com/jaredpar/pinvoke

Also, the integration server is now running RTM 1.0.

@nathanaeljones That's awesome!

I'd like to spike a partial wrapper for Imageflow that doesn't do any abstractions at the P/Invoke layer, but just uses IntPtr. Do you have the raw output from a P/Invoke generation tool handy?

Not home tonight but tomorrow night I should be able to get that to you ๐Ÿ‘

About two months late but... https://gist.github.com/SamuelEnglard/557c7b549a2a30a245a1b8a21fd9d2a7
Includes C# and the C header used

Thanks @SamuelEnglard . I was curious on the project and checked in yesterday and saw there wasn't much movement lately. I'm interested in these C# bindings you've been working on. I wish I had time to test or help out. But I'm happy to see your commit.

@RossLieberman Yeah... Massive shake ups at work have me falling behind on lots of stuff. Finally trying to get back into it

Thanks for work on .Net bind @SamuelEnglard
When do u plan publish on nuget?

@CShepartd planning on publishing to NuGet once their is a public API. The current code is all internal. Can't really give a time line on that.