Generate R function wrappers for Trident.

Primary language: C#. License: Apache-2.0.

R-trident-wrapper

Some basic code to demonstrate the generation of R function wrappers for Trident.

I wrote this code for a project back in 2010 and have not looked at it since. It was written as a prototype that I was hoping would be picked up by the developers who were responsible for providing access to R from Trident, but they showed no interest in doing that since '.NET is the only language for doing anything', and it did not happen (the whole project was canned in the end due to the clunky support they wound up providing for R and Python, the clients' languages of choice).

I thought I'd dump it here on the off chance that it might be of use to someone, but I will not vouch for its correctness and, having not kept up with where Trident is at, am not even sure whether it will work anymore.

Files

  • func_analyser_cs.R This generates the C# code for an activity. A more detailed description is in the code, though if you understand the S3 class mechanism it is straightforward.

  • func_analyser_xml.R This generates an XML document describing the arguments and return types for a function (logic is the same as func_analyser_cs.R, just the output changes). It may be easier to understand the logic from this code rather than func_analyser_cs.R.

  • Program.cs This shows how to call Python and R from .NET, and how Python and R types are converted to .NET types. The classes in this code should remove the need to serialise intermediate outputs between activities, and expose them directly as .NET objects. This is achieved through 'rpy2' and 'Python for .NET'. The class 'Program' gives the examples, and the classes 'rnorm' and 'lm' are 'wrapped' R functions. 'ManagedObjectExtractor' and 'R' are the workhorses.

  • func.cs Example code generated by func_analyser.R. I have not been able to compile this code as I only have Visual C# Express, not Professional. As such, I imagine that it may have a few issues.
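The XML produced by func_analyser_xml.R is not reproduced here. As a rough illustration of the idea only, the following Python sketch inspects a function's signature and emits an XML document listing its arguments and default values. The helper `describe_function` and the element and attribute names (`function`, `argument`, `type`, `default`) are made up for this example and are not the schema used by the R script. The stand-in `rnorm` only borrows the argument names of the R function. Types are guessed from the default values, and arguments without a default fall back to a generic `object`.

```python
import inspect
import xml.etree.ElementTree as ET


def describe_function(func):
    """Build an XML element describing func's arguments (hypothetical schema)."""
    root = ET.Element("function", name=func.__name__)
    for name, param in inspect.signature(func).parameters.items():
        arg = ET.SubElement(root, "argument", name=name)
        if param.default is inspect.Parameter.empty:
            # No default value: the type cannot be guessed, so use a generic object.
            arg.set("type", "object")
        else:
            # Guess the type from the default value and record the default itself.
            arg.set("type", type(param.default).__name__)
            arg.set("default", repr(param.default))
    return root


def rnorm(n, mean=0.0, sd=1.0):
    """Stand-in with the argument names of R's rnorm; the body is irrelevant here."""
    raise NotImplementedError


if __name__ == "__main__":
    print(ET.tostring(describe_function(rnorm), encoding="unicode"))
```

A second tool could then read such a document and emit the *.cs file, which is what would let one generator serve both R and Python functions.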

Dependencies

  • "Python for .NET" needs to be installed. See pythonnet.sourceforge.net

  • rpy2 needs to be installed. See rpy.sourceforge.net/rpy2.html

Notes

  • The XML generation needs improvement. Once it is improved, a tool to generate *.cs files from the XML could be used (meaning the same tool could generate R and Python wrappers).

  • Since R is dynamically typed, I cannot determine the types of the arguments being passed to a function (an argument to an R function can - and very often does - accept many types). Hence I pass them as .NET objects in the wrapper. I think there is an easy way of doing the conversion on demand, but have not implemented/tested it... more fun for you.

  • Because of the argument type problem noted above, information about the type of arguments should be included in the XML schema, along with documentation and default values (though I can process default values, even of type 'function' and 'language'... but telling you how to do this would take all the fun out of it for you). However, one could insist that default values be provided for all arguments, which would allow for the automatic generation of a more specialised XML schema or C# code.

    In either case, dynamically typed languages offer far more flexibility than statically typed languages, and hence we need to restrict R in order to make it work in Trident.

  • In the code provided I treat each column of a dataframe as a separate result. This was a poor choice, since the number of columns will vary in many cases. Any R object can be passed between activities in 'raw form' - i.e. the PyObject wrappers are transparent to R and type conversions etc. are automatic. Another (better) solution would be to create .NET classes like DataFrame, Matrix etc. that provide access to the underlying objects (by aggregating the PyObject wrapper and accessing it through methods like GetColumn, GetRow, GetAtIndex etc.).

  • Note that the .NET class R (in Program.cs) is a Singleton. The same approach can be used for a GeoProcessing object (or any embedded 'interpreter', utility etc.), avoiding the need to pass the geoprocessor between activities. This would make for cleaner workflows.

  • I'm not sure why properties are saved in base Activity classes (perhaps it has something to do with provenance, generating the interface, ...), or whether this is required. Sensible idioms for storage need to be adopted.
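Two of the notes above, the aggregating DataFrame wrapper and the singleton interpreter, can be sketched briefly. This is a minimal Python illustration under stated assumptions, not the code in Program.cs: the class `R` here is a hypothetical stand-in that only counts calls instead of talking to R, and `DataFrame` wraps a plain dict of columns where the real thing would wrap a PyObject. The method names `get_column`, `get_row` and `get_at_index` mirror the GetColumn, GetRow and GetAtIndex suggested above.

```python
class R:
    """Hypothetical stand-in for an embedded interpreter held as a singleton."""

    _instance = None

    def __new__(cls):
        # Create the interpreter once; every later call returns the same object.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.calls = 0
        return cls._instance

    def evaluate(self, expression):
        # A real implementation would hand the expression to R; this only counts calls.
        self.calls += 1
        return expression


class DataFrame:
    """Aggregates an underlying object (assumed here: a dict of equal-length columns)."""

    def __init__(self, columns):
        self._columns = columns

    def get_column(self, name):
        # Return a copy so callers cannot modify the wrapped object by accident.
        return list(self._columns[name])

    def get_row(self, index):
        return {name: values[index] for name, values in self._columns.items()}

    def get_at_index(self, name, index):
        return self._columns[name][index]


if __name__ == "__main__":
    R().evaluate("x <- 1")  # first "activity"
    R().evaluate("y <- 2")  # second "activity" reuses the same interpreter
    print(R().calls)
    frame = DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})
    print(frame.get_row(1))
```

Because the wrapper exposes the whole frame as one result, an activity no longer needs one output per column, and because `R()` always yields the same object, nothing has to be passed between activities to share the interpreter.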