erlang/otp

Add is_external_fun/1 and is_local_fun/1 guards

starbelly opened this issue ยท 22 comments

Is your feature request related to a problem? Please describe.

No problems.

Describe the solution you'd like

The code exists in beam utilities to discern whether a fun is an external fun ref or a local fun. It would be quite nice to have these as guards. Said guards would mostly be convenience since we can get this information from erlang:fun_info/1 today. Though, there may be performance benefits to having a simple guard vs what erlang:fun_info/1 returns.

Describe alternatives you've considered

Simply use erlang:fun_info/1 in the body of a function required to check whether a function is external or local.

Additional context

In some situations one must validate that function is not a local function in order to ensure this is not slung around between nodes, as this can be a problem in a mixed cluster with different versions of OTP. In my case, we need to ensure callback functions in configuration are only ever external funs for this reason.

Additionally, I'd be happy to do the PR, but I figured an issue regarding the feature was appropriate first.

Creating only the guard function is_function_external/1 would avoid using "local" for use that doesn't relate to Distributed Erlang nodes.

I still have hope that Erlang/OTP might fix the fact that node/0 should not be a guard function, due to it not being a referentially transparent function (creating an error in the draft language specification). If that language bug gets fixed, that would likely lead to guards like is_local_pid/1, is_local_port/1, is_local_reference/1, is_remote_pid/1, is_remote_port/1, is_remote_reference/1 (as described at #5568).

Additionally note, while I put is_external_fun/1 in the issue name, this may be too broad, my original thought was is_ext_fun_ref/1 it could also be is_external_fun_ref/1. Also, If we just had guard to to indicate whether a term was a ext fun ref or not, the need for is_local_fun/1 becomes less useful (at least I have no use case for this).

I agree that this would be useful. I'd like to change the names however (and not just here, though the ship might have sailed already) as "local fun" and "external fun"/"remote fun" might make people think about distribution, when it's more about how the fun relates to its module: the former being a fun tied to a specific module and the latter a fun tied to an MFA.

Yes, I agree this would be very helpful to have, and I also agree that we could likely do better with the naming.

I also thing this should be coupled with having more places in the system use such funs instead of MFA tuples - with the recent changes they are smaller in memory, and also integrate far better with various analysis tools - typecheckers, dialyzer, static analysis, IDE tooling (go to definition navigation, etc), linting...
Using actual closures though in some places is not desirable since they prevent code upgrades and cross-node evaluation to work transparently (unlike MFA tuples), so being able to easily verify and specify what type of function is desirable would be useful in those cases.

What about is_export_function/1,2 (arity 2 for the fun's arity, compare to is_function/1,2) since it relates to an module's export table and not the Erlang Distribution protocol.

There are, as said, important cases where you need the restriction that a fun() has to be an export entry fun, but rarely the opposite. Therefore I think that the module local variant isn't needed, since it in those rare cases be accomplished by combining is_function/1,2 and is_export_function/1,2.

@RaimoNiskanen is_function_exported/1,2 seems more natural to me, but I also always try to use common prefixes to group ideas (when I am allowed to). Either way, it sounds like a good approach.

I don't think it's common here. A "local" fun is not tied to a particular instance of a node, and works just fine everywhere as long as the required code is available. "Local" refers to the fun's relation to a specific module, which is relevant in non-distributed scenarios (code upgrades, funs persisted to disk or similar).

I'd float some ideas is_free_function or some sort of super-verbose is_safely_shareable_function; or opposite is_module_bound_function or just is_bound_function

Floating on...: is_function(F, export), is_function(F, export, Arity).
Or promote fun_info/2 into a guard so you can write fun_info(F, type) =:= external.

I like the idea of promoting fun_info/2 into a guard, as it covers cases in the future that might not be thought of today. I do rather like is_bound_function/1 more or something along those lines as it is succinct and can be documented simply (i.e., takes a function reference, returns true if it is bound to a module, otherwise false).

Coming back to this, I think is_bound_function/1,2 doesn't tell the whole truth here, I believe we can technically say funs evaluated via erl_eval are bound to that module, as such is_export_function/1,2 or is_exported_function/1,2 makes more sense. Maybe there's more ideas...

I was thinking about this a bit more. I think using the term "exported" will be problematic given the semantics of the already-existing erlang:function_exported/3. In particular, it verifies the function actually exists (and the module is loaded). In this case (I presume) we want a guard that succeeds even if the module is not loaded and doesn't actually check the function exists - only that it's the "external" format of the fun syntax.

@michalmuskala I agree with that and your presumption at least as far as what I had in mind was definitely not to check if the code was loaded.

I think using the term "exported" is valid because it is describing a future expectation that the function must be exported to be used, otherwise it will be considered undefined (i.e., an undefined function error occurs). The existence of the erlang:function_exported/3 does make the situation a little unusual if a guard function like is_function_exported/1,2 exists too, but the documentation can describe how the is_function_... guard function is only checking the function type and not whether the function's module is loaded.

@okeuday I definitely don't think you are wrong here, especially coupled with what @RaimoNiskanen stated, that this is related to the export table on a module. I think part of it for me is what's easier on the eyes and would what result in less ambiguity, the head of clause should be enough to disambiguate.

What about a happy medium between promoting fun_info/2 to a guard and is_exported_fun/1 (or some variation of), such is is_fun_type/2 ? I'd rather have a single arity guard, but that would be nicer than having to do a abs equal comparison in the case fun_info/2 was promoted to a guard.

@starbelly People haven't liked fun_info/1 in the past. The documentation says it is only meant for debugging and the function returns more than you normally need, in proplist format due to its age while most people would likely prefer a map now. Either way, the return value of fun_info/1 isn't friendly for guard expressions. The fun_info/1,2 type value as local or external isn't common and isn't represented in a type (it shouldn't be used more than it is meant to be used, for debugging). That makes fun_info/2 a bad choice as a guard, partially due to its association to fun_info/1.

That is why I saw this as a new guard function to associate with the existing is_function/1,2 guard functions, except providing true for a subset of the types that are true for is_function/1,2. That would make it a variation on the is_function/1,2 guard functions, so to communicate that variation I think it is best to use a suffix describing the variation. The simplest name using common terminology seems to be is_function_exported/1,2 because -export([...]). is always present in a module (is_function_export/1,2 is possibly not treating export as a verb and it is easier to understand export in the past tense, in the is_function_... use).

Using suffixes to describe variation is a little odd for people that want to write function names as English sentences because using suffixes for the variation should be equivalent to a subject-object-verb word order. Using subject-object-verb word order is most common among natural languages though, so it should be how most people naturally think.