dart-lang/sdk

dart:ffi generate Dart bindings from C headers

dcharkes opened this issue · 22 comments

Maintaining bindings by hand is error-prone, especially across different versions. We should have a tool that can generate Dart bindings from C Headers.

Update: see https://pub.dev/packages/ffigen.

Would be great to investigate support for SWIG

Hi, do we have a rough estimate of when we can expect the first preview of this? Within a month, a quarter, or a year?

Unfortunately we aren't planning to ship this feature in the first release, but it's possible to implement this functionality as a tool outside the VM. For example, you could bind to libclang and use source_gen to generate the bindings.

Thanks, your suggested solution sounds like something interesting for the community to take on 😄

@devkabiir

The Dart-only approach I suggested tries to solve the boilerplate problem for hand-written Dart bindings. It would generate code that does things like automatically converting a Dart String to Pointer<Utf8>, or similar, depending on the annotation details.

This is the kind of boilerplate code that would be generated:

class _$Coordinate extends Coordinate {
  factory _$Coordinate(double latitude, double longitude) {
    // Look up the native symbol and convert it to a callable Dart function.
    final createCoordinatePointer = dylib
        .lookup<ffi.NativeFunction<create_coordinate_func>>('create_coordinate');
    final createCoordinate =
        createCoordinatePointer.asFunction<CreateCoordinate>();
    // Call into native code, then load the struct behind the returned pointer.
    final coordinatePointer = createCoordinate(latitude, longitude);
    return coordinatePointer.load();
  }
}

The lookup and asFunction should happen at top level; there is no need to re-execute them on every factory call.

A user would only have to do Coordinate(1.0, 2.0) to get an instance.
It should be possible to infer all the necessary information from dart-code and annotations to generate such code from the snippet I wrote in the earlier comment.

I think it should be even possible from the .h-file.
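To make that concrete, here is a minimal C sketch of what the native side of the Coordinate example might look like. Only the names create_coordinate, latitude, and longitude come from the Dart snippet above; the struct layout and the malloc-based body are assumptions for illustration. The declarations alone carry the field types and the function signature, which is the information a generator would need to read.

```c
#include <stdlib.h>

/* Hypothetical header contents: the part a binding generator would parse. */
struct Coordinate {
  double latitude;
  double longitude;
};

/* Allocates a Coordinate on the heap and returns NULL if allocation fails.
   The caller owns the result and releases it with free(). */
struct Coordinate *create_coordinate(double latitude, double longitude);

/* Hypothetical implementation; a generator would not need to see this. */
struct Coordinate *create_coordinate(double latitude, double longitude) {
  struct Coordinate *c = malloc(sizeof *c);
  if (c == NULL) {
    return NULL;
  }
  c->latitude = latitude;
  c->longitude = longitude;
  return c;
}
```

Nothing in these declarations says whether the Dart API should be Coordinate(1.0, 2.0) or something else, so the naming of the Dart-side constructor would still be a generator policy or a configuration choice.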

It will also be in line with the already existing builders and the build_runner package.

Yes, that's nice. (As mentioned before, I think building the generator is the first step; the second step would be to hook it into package:build_runner.)

(Builders (package:build) can actually take any arbitrary file and produce output for it, not just Dart files.
The only restriction is that a builder's outputs should be known beforehand.
So a builder can register its inputs as *.h and its outputs as *.dart.)

In this approach where inputs are hand-written bindings .dart files and outputs are .g.dart files that contain code similar to above code, generator/builder would need to be able to:

  1. Require hand-written dart code with annotations
  2. parse, validate, understand dart code (which is already possible)
  3. have a mapping of how certain things in dart like class Coordinate {} can be bound to c code like struct Coordinate {}. This is sparsely documented but it is already possible based on the samples

In the approach where .h files are converted to .dart files, a generator/builder would need to be able to:

  1. parse .h files
  2. validate the parsed input

We should reuse clang for this. And require users of your package to have clang installed when they want to generate the dart bindings.

  3. construct a meaningful intermediary representation (e.g. AST)

This is extra work, I agree.

  4. Have a mapping of how certain things from .h are to be represented/converted to .dart syntax, and also write the bindings in Dart. So struct Coordinate is to be written in Dart like class Coordinate {} along with the necessary code to bind those two together

Yes.

If we opt for a dart->dart generator, the users of your package would have to do this step manually over and over again.

A user of these bindings would prefer a fluent API like new Coordinate(1, .556) instead of new Coordinate.allocate(1, .556) or createCoordinate(1, .556) or new Coordinate.createCoordinate(1, .556). This cannot be done without some human intervention, unless there is a spec that the header file follows (which is hardly ever the case).

I think it makes perfect sense to prefer a fluent API. :) We can solicit input here from potential users on what they would like to see as an API, but as a package developer your preference is very welcome :)

A generator that parses .h and produces .dart files might not be able to produce such a fluent API. It would just be a dumb generator (even though it's smart enough to convert .h to sound Dart code).

I'm not sure why it should not be. Can you elaborate or give an example?

Thoughts?

P.S. I read my own comment and it looks like I favor one approach over the other. This is not the case, I'm still very much in the discussion phase.

See my comments inline.

I think I'd prefer generating from .h files, because that would require package users to do nothing besides pointing the builder to the .h-files. Unless I'm missing something, and the .h-files miss information that has to be specified in a .dart file.

Inspiration from Kotlin:

Kotlin/Native comes with the cinterop tool; the tool generates bindings between the C language and Kotlin. It uses a .def file to specify a C library to import. More details on this are in the Interop with C Libraries tutorial. The quickest way to try out C API mapping is to have all C declarations in the interop.def file, without creating any .h or .c files at all. Then place the C declarations in an interop.def file after the special --- separator line:

headers = lib.h
---

void pass_string(char* str) {
}

char* return_string() {
  return "C string";
}

int copy_string(char* str, int size) {
  *str++ = 'C';
  *str++ = ' ';
  *str++ = 'K';
  *str++ = '/';
  *str++ = 'N';
  *str++ = 0;
  return 0;
}

Source: https://kotlinlang.org/docs/tutorials/native/mapping-strings-from-c.html

The reason for .def files: several options for adjusting the generated bindings.

  • excludedFunctions property value specifies a space-separated list of the names of functions that should be ignored. This may be required because a function declared in the C header is not generally guaranteed to be really callable, and it is often hard or impossible to figure this out automatically. This option can also be used to workaround a bug in the interop itself.

  • strictEnums and nonStrictEnums properties values are space-separated lists of the enums that should be generated as a Kotlin enum or as integral values correspondingly. If the enum is not included into any of these lists, then it is generated according to the heuristics.

  • noStringConversion property value is a space-separated list of the functions whose const char* parameters shall not be auto-converted to Kotlin strings.

Source: https://github.com/JetBrains/kotlin-native/blob/master/INTEROP.md#definition-file-hints

For people interested in generating bindings for Objective-C libraries, it might be worth mentioning that cupertino_ffi has the necessary helpers, but it still requires the Dart SDK to implement #38578.

Project Panama (Java's new FFI) is also heavily invested in generating bindings from header files.

Sources:
https://www.youtube.com/watch?v=M-FPNBFAoSo
https://www.jcp.org/aboutJava/communityprocess/ec-public/materials/2019-03-12/Project_Panama_Status_Update_March_2019.pdf

@dcharkes
Is Jextract equivalent to Swig? Is implementing Swig integration for Dart a better alternative than the approach mentioned by @sjindel-google?

Jextract is part of Project Panama, and Swig is a standalone project, so they are definitely not equivalent. Jextract takes in an unmodified .h file (see these samples). Swig takes in a Swig interface file, but that can sometimes be just a .h file:

SWIG for the truly lazy
As it turns out, it is not always necessary to write a special interface file. If you have a header file, you can often just include it directly in the SWIG interface.

http://www.swig.org/tutorial.html

I think both approaches are fine.

However, both approaches will be very different to implement:

  • Binding to libclang means writing a Dart package, so it is all Dart code (including some bindings to be able to invoke libclang from Dart).
  • Extending Swig with Dart support, which is a lot of C++ programming with the Swig internal data structures (see the documentation for adding a new language to Swig).

I saw a few examples of using Jextract and SWIG (python) and both involved generating bindings from headers. So that made me think that they are doing the same thing. Well, now the difference is clear.

I think both approaches are fine.

However, both approaches will be very different to implement:

  • Binding to libclang means writing a Dart package, so it is all Dart code (including some bindings to be able to invoke libclang from Dart).

I'd love to do that. Is somebody already working on this?

  • Extending Swig with Dart support, which is a lot of C++ programming with the Swig internal data structures (see the documentation for adding a new language to Swig).

Thanks for the insights @dcharkes .

Hi, I'm working on the bindings to libclang.
However, according to #41062 and #36730, we still cannot pass structs or return them by value. Many of the necessary libclang APIs take or return structs by value.
I will work on the APIs that are not affected by this issue first. Otherwise we would have to write a C wrapper for libclang, which does not sound like a good approach.

Hi @ctrysbita ,
I was in the same situation and @dcharkes suggested writing the C wrappers. Also, you can take a look at this repository by PixelToast which currently generates a JSON intermediate representation and uses it to generate the bindings to libclang. The project has some nice plans :D

@bitbeast18 That means users need to compile the C wrapper manually. Maybe it is better to wait for official support of pass-by-value XD. By the way, I think pub may need functions to handle native code building like node-gyp?

I think pub may need functions to handle native code building like node-gyp?

See #36712 for the current "solutions" for shipping native code. Yes, we definitely want a better solution for pub!

Or we have to write a C wrapper for libclang.

That is the way to go for the time being.

Maybe wait for official support of pass-by-value.

I'm working on it, but it will take some time.

I've temporarily solved it by writing a wrapper for the functions that return or take a struct by value. Now these necessary libclang functions work well, and we can parse C/C++ code and traverse the AST from the Dart side.
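For readers wondering what such a wrapper looks like, here is a minimal C sketch of the general pattern: a by-value struct parameter and return value are replaced with pointers that dart:ffi can handle. The Cursor type and all function names below are hypothetical stand-ins, not the real libclang API, so that the sketch compiles on its own.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a library struct that is passed by value. */
typedef struct {
  int kind;
  int line;
  int column;
} Cursor;

/* Hypothetical stand-in for a library function that takes and returns
   a struct by value, which cannot be bound directly. */
Cursor lib_next_cursor(Cursor current) {
  Cursor next = current;
  next.line += 1;
  next.column = 1;
  return next;
}

/* Wrapper: takes the struct by pointer and returns a heap-allocated copy
   of the result. Returns NULL on a NULL argument or allocation failure. */
Cursor *wrap_next_cursor(const Cursor *current) {
  if (current == NULL) {
    return NULL;
  }
  Cursor *out = malloc(sizeof *out);
  if (out == NULL) {
    return NULL;
  }
  *out = lib_next_cursor(*current);
  return out;
}

/* Matching release function, so the caller never has to guess
   which allocator produced the pointer. */
void wrap_free_cursor(Cursor *cursor) {
  free(cursor);
}
```

The cost of this pattern is one extra allocation per call plus a small native library that has to be compiled and shipped alongside the Dart package.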

However, shipping the native code together with the package means every user has to build the native library manually after pub get.
Does pub allow custom scripts after pub get so that we can build native code automatically?
Or do we have to use pub run or build_runner to build them?

Yeah, it's either letting the users build the native code or committing the built binaries in the package for a range of OS/hardware combinations. See the discussion in #36712 (comment).

I'm working on native bindings, but they depend on dart_native, which is a bridge between Dart and iOS/Android using dart:ffi.

@mannprerak2 will work on this as a Google Summer of Code project! 🎉

Stay tuned for more!

gokr commented

As a user of Nim also, the c2nim tool is pretty neat and can also serve as inspiration. It takes a C header (or C file actually) and produces a Nim file from it. You can modify the header files using #ifdef C2NIM etc to add conditional logic. It parses a large subset of C, and has specific features to deal with C macros. I have used it recently to generate a big wrapper for the ORX game engine.
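A minimal sketch of the #ifdef C2NIM idea, in C: the guarded block is meant to be seen only by c2nim, while a C compiler skips it because C2NIM is not defined during a normal build. The function add_scores is hypothetical, and the single #cdecl directive is included only as an example of a c2nim-specific instruction; check the c2nim manual for the directives your version supports.

```c
/* The block below is intended for c2nim only. A C compiler never looks
   inside it, because C2NIM is not defined in a normal C build. */
#ifdef C2NIM
#  cdecl
#endif

/* Ordinary declaration, visible to both the C compiler and the generator. */
int add_scores(int a, int b);

/* Hypothetical implementation so the sketch is a complete translation unit. */
int add_scores(int a, int b) {
  return a + b;
}
```

The same trick, a macro that only the generator defines, is one possible way for a header to carry generator hints without affecting the C build.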

An experimental FFI binding generator is now available: https://pub.dev/packages/ffigen! 🎉

Please check it out and leave your feedback and suggestions in the issue tracker: https://github.com/dart-lang/ffigen/issues.

We have released package:ffigen 1.0. I'll close this issue; please file any issues you have with ffigen on that repository.