Open-EO/openeo-r-client

integrate R-UDFs into R-Client

Closed this issue · 6 comments

Adapt parameters of run_udf:
Here are the possibilities from the process spec, and the current back-end support for each:

  • A) Text -> should work
  • B) R-File/py.script/txt -> no location for storing R-Files on backends
  • C) URL -> should work

flahn commented

C): I think ultimately this would be the argument type URL, which simply needs to be defined as an "allowed" parameter type for the UDF code. Otherwise I cannot differentiate the two.

flahn commented

The process definition at openEO processes states three types of objects that are allowed for the parameter udf:

  1. URL
  2. File-Path on the server
  3. UDF-Code as String that contains line breaks (@m-mohr character encoding is not stated explicitly)

For me this means that I need to load a local file into a string to cover B). The URL is already covered, because it is a separate argument in the package. As there is no file upload, a direct file path on the server does not make sense from a user perspective. However, the argument should be implemented in R regardless.

  • load local file to string
  • implement FilePath argument

Line break here means: (\r\n|\r|\n) (Windows/Mac/Linux)
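
To make the line-break rule concrete, here is a minimal sketch in base R. The helper names `normalize_line_breaks` and `has_line_break` are made up for illustration and are not part of the openeo package; the only assumption taken from the thread is that a string containing a line break is treated as UDF code.

```r
# Hypothetical helpers, not part of the openeo R client.

# Convert Windows (\r\n) and classic Mac (\r) line breaks to \n.
# "\r\n" is listed first so a Windows line break becomes one \n, not two.
normalize_line_breaks <- function(code) {
  gsub("\r\n|\r", "\n", code)
}

# TRUE if the string contains any kind of line break, which is the
# criterion the process spec uses to recognise inline UDF code.
has_line_break <- function(x) {
  grepl("[\r\n]", x)
}

code <- normalize_line_breaks("# my udf code\r\nrun_this()")
has_line_break(code)          # inline code
has_line_break("udfs/udf.r")  # a path or URL has no line break
```

Note that a one-line UDF without any line break would not be recognised as code by this criterion, which is one more argument for letting the user state the intent explicitly.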

Strictly speaking, you can simply pass through what the user provides, and I'd assume that this is already possible as these are just three strings. Everything on top is really just a convenience for the user: either uploading a file to the server, or reading it from disk and passing it as a string. This should likely be something the user needs to trigger explicitly (e.g. via a wrapper class or a new parameter), as otherwise you can't tell whether a path in a string refers to a local file or to a file on the server.

  1. URL (pass through) - run_udf(udf = "https://example.com/udf.r")
  2. File-Path on the server (pass through) - run_udf(udf = "udfs/udf.r")
  3. UDF-Code as string (pass through) - run_udf(udf = "# my udf code\r\nrun_this()")
  4. Local UDF-File (convenience) - e.g. run_udf(local_udf = "udfs/udf.r") or run_udf(udf = new(LocalUdf, "udfs/udf.r"))

Whether to inline the string or to upload, you can likely detect from the endpoints. If file uploads are implemented, upload it. If file uploads are not implemented, inline it? Or make it explicit via argument... run_udf(inline_udf = "udfs/udf.r") or run_udf(upload_udf = "udfs/udf.r")
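
A rough sketch of the wrapper idea, assuming a lightweight S3 marker class instead of the `new(LocalUdf, ...)` form mentioned above. Everything here is hypothetical: `local_udf`, `udf_transfer_mode` and the `can_upload` flag are illustrative names, and how the client would actually learn whether the back-end supports file uploads is left open.

```r
# Hypothetical marker for "this string is a path to a LOCAL file".
# An S3 class is used for brevity; an S4 class would work the same way.
local_udf <- function(path) {
  structure(list(path = path), class = "LocalUdf")
}

# Decide how a UDF argument should reach the back-end.
# `can_upload` is an assumed flag: TRUE if the back-end advertises
# a file upload endpoint, FALSE otherwise.
udf_transfer_mode <- function(udf, can_upload = FALSE) {
  if (!inherits(udf, "LocalUdf")) {
    # URL, server-side path or inline code: send the string unchanged
    return("pass-through")
  }
  if (can_upload) "upload" else "inline"
}

udf_transfer_mode("https://example.com/udf.r")
udf_transfer_mode(local_udf("udfs/udf.r"), can_upload = FALSE)
udf_transfer_mode(local_udf("udfs/udf.r"), can_upload = TRUE)
```

With a marker like this, plain strings are never reinterpreted, and only an explicitly wrapped path triggers reading or uploading a local file.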

Just some ideas :-)

flahn commented

Thanks Matthias, most of it was already working, but needed some refinement. However, in the past I sometimes had difficulties with text files and their encoding (UTF-8, ISO-8859-1, etc.). Is it set to UTF-8 by default?

Oh, yeah, we don't really specify that. Good point for a clarification in the API.
I'd clearly go with UTF-8 by default. Otherwise, content negotiation is available in HTTP.
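
For the client side, reading a local UDF file into a UTF-8 string could look roughly like this. It is a sketch in base R only; `read_udf_file` is a hypothetical name, and the `encoding` argument is an assumption about how a user might declare a non-UTF-8 file.

```r
# Hypothetical helper: read a local UDF file into one UTF-8 string.
# `encoding` names the encoding the file is stored in (default UTF-8).
read_udf_file <- function(path, encoding = "UTF-8") {
  # Read the raw bytes so the result does not depend on the session locale.
  bytes <- readBin(path, what = "raw", n = file.size(path))
  text <- rawToChar(bytes)
  # Convert from the declared encoding to UTF-8; iconv() returns NA
  # if the bytes are not valid in the declared encoding.
  text <- iconv(text, from = encoding, to = "UTF-8")
  if (is.na(text)) {
    stop("File '", path, "' is not valid ", encoding)
  }
  text
}

# Usage: a file stored as ISO-8859-1 (latin1) is converted on the way in.
tmp <- tempfile(fileext = ".r")
writeBin(as.raw(c(0x78, 0xFC)), tmp)  # "x" followed by u-umlaut in latin1
udf_code <- read_udf_file(tmp, encoding = "latin1")
```

Line breaks are kept exactly as they are stored in the file, so any normalisation would be a separate step.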

flahn commented

Currently, the UDF code can be given as a function, a local file, a URL or code as text. File paths on the back-end are essentially indistinguishable from local file paths (especially on local UNIX instances), so a string is treated as a back-end file path only if no file with that name is found locally.
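
The resolution order described here might be sketched as follows. This is not the package's actual implementation: `resolve_udf` is a hypothetical name, converting a function with `deparse()` is an assumption, and the file is assumed to be UTF-8.

```r
# Hypothetical sketch of the resolution order, not the real client code.
resolve_udf <- function(udf) {
  if (is.function(udf)) {
    # Assumption: a function is sent as its deparsed source text
    return(paste(deparse(udf), collapse = "\n"))
  }
  stopifnot(is.character(udf), length(udf) == 1)

  # URL: pass through unchanged
  if (grepl("^https?://", udf)) return(udf)

  # Contains a line break: treat as inline code
  if (grepl("[\r\n]", udf)) return(udf)

  # Existing local file: read it and inline the content (UTF-8 assumed)
  if (file.exists(udf)) {
    lines <- readLines(udf, warn = FALSE, encoding = "UTF-8")
    return(paste(lines, collapse = "\n"))
  }

  # Not found locally: assume it is a file path on the back-end
  udf
}

resolve_udf("https://example.com/udf.r")
resolve_udf("# my udf code\nrun_this()")
resolve_udf("udfs/udf.r")  # passed through unless this file exists locally
```

One consequence of this order is that a back-end path that happens to match a local file is always inlined from the local copy.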