tidyverse/magrittr

Feature req: huge potential - hijackability of pipe operator like any other method

stemangiola opened this issue · 9 comments

For any S3 generic we can write

method.my_class = do something

method.default = do what the original method would do
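As a concrete reference point, here is a minimal sketch of the S3 pattern described above. The generic `describe` and the class `my_class` are made-up names for illustration only.

```r
# Generic: dispatches on the class of its first argument.
describe <- function(x, ...) UseMethod("describe")

# Fallback used when no class-specific method exists.
describe.default <- function(x, ...) "a plain object"

# Method picked for objects whose class vector contains "my_class".
describe.my_class <- function(x, ...) "a my_class object"

obj <- structure(list(a = 1:3), class = "my_class")
describe(obj)   # dispatches to describe.my_class
describe(1:3)   # falls back to describe.default
```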

If the pipe operator could be hijacked in the same way (rather than being protected inside an unexported function), there would be huge potential for converting classes/objects on the fly, depending on the destination function. This matters, for example, in my effort to port many genomics packages to the tidyverse. Instead of rewriting every tidyverse function to handle a new, incompatible class, I could check whether the destination function belongs to the tidyverse and, if so, first pass the object through a parser that converts it into a tibble.

This would open huge, unimaginable opportunities for integrating the R universe.

For example


%>%.default = do the usual

%>%.my_class =
    if (destination_function %in% c("tidyverse list")) convert_into_tibble(my_object) %>% destination_function()
    else my_object %>% destination_function()
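To make the proposal concrete without touching magrittr, the idea can be sketched as a separate, user-defined operator that is an S3 generic. Everything here is hypothetical: the operator name `%>>%`, the helper `pipe_call`, the list `table_verbs` (standing in for the "tidyverse list"), and the conversion to a base data frame rather than a tibble. It only handles the simple case where the left-hand side becomes the first argument.

```r
# Hypothetical S3-generic pipe. `%>>%` is an illustrative name; magrittr's
# `%>%` itself is not generic and is left untouched.
`%>>%` <- function(lhs, rhs) UseMethod("%>>%")

# Helper (hypothetical): call the right-hand side with `lhs` as first argument.
pipe_call <- function(lhs, rhs_expr, env) {
  if (!is.call(rhs_expr)) rhs_expr <- as.call(list(rhs_expr))  # f -> f()
  parts <- as.list(rhs_expr)
  call <- as.call(c(parts[1], list(quote(.lhs)), parts[-1]))
  eval(call, list(.lhs = lhs), env)
}

# "Do the usual": plain piping for any class without its own method.
`%>>%.default` <- function(lhs, rhs) {
  rhs_expr <- substitute(rhs)
  env <- parent.frame()
  pipe_call(lhs, rhs_expr, env)
}

# Stand-in for the "tidyverse list": functions that expect tabular input.
table_verbs <- c("nrow", "ncol", "head")

`%>>%.my_class` <- function(lhs, rhs) {
  rhs_expr <- substitute(rhs)
  env <- parent.frame()
  target <- if (is.call(rhs_expr)) rhs_expr[[1]] else rhs_expr
  # Convert only when the destination function is on the list.
  if (is.name(target) && as.character(target) %in% table_verbs) {
    lhs <- as.data.frame(unclass(lhs))
  }
  pipe_call(lhs, rhs_expr, env)
}

obj <- structure(list(a = 1:3, b = 4:6), class = "my_class")
obj %>>% nrow()       # converted to a data frame first
obj %>>% length()     # passed through unchanged
c(4, 9) %>>% sqrt()   # default method: plain piping
```

Note that the conversion is decided by matching the destination function's name, so the behaviour of a pipeline depends on a list maintained somewhere else.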

Have a look at %$%, which uses the S3 generic with() under the hood. Example:

# S3 method for with(): data is the plot so far, expr is the layer to add
with.gg <- function(data, expr, ...) data + expr

ggplot(...) %$%
  geom_this(...) %$%
  facet_that(...)
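The mechanism can be tried in base R alone. The sketch below defines a with() method for a toy class and calls with() directly, which is the call that %$% is said above to make under the hood. The class `layer_stack` and the constructor `new_stack` are hypothetical stand-ins for a plot object, not part of ggplot2 or magrittr.

```r
# Toy class standing in for a plot object (hypothetical; not ggplot2).
new_stack <- function(...) structure(list(...), class = "layer_stack")

# S3 method for base R's with(): append the evaluated right-hand side
# as a new layer instead of evaluating it inside `data`.
with.layer_stack <- function(data, expr, ...) {
  structure(c(unclass(data), list(expr)), class = "layer_stack")
}

p <- new_stack("base")
p <- with(p, "points")   # what `p %$% "points"` would be expected to call
p <- with(p, "facets")
unlist(unclass(p))       # the accumulated layers, in order
```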

Interesting. Probably the biggest limitation is community-related. It is really hard to say "use my package but change all pipes to %$% from now on".

%>% is so widespread.

Good to know anyway. I would seriously consider using an analogous system for %>%. It would open up a new world: the readability benefit would be almost secondary to the integration potential (seamless adaptation of "any" function to "any" class).

By the way, am I wrong, or would a hijackable %>% also solve the incompatibility between + and %>% in ggplot?

%>%.ggplot = {
    if (destination_function %in% c("geom_", "facet", "theme", etc.)) function(p1, p2) p1 + p2
    else function(p1, p2) p1 %>% p2
}

Allowing something like

ggplot(...) %>%
  geom_this(...) %>%
  facet_that(...) %>%
  ggsave()

As you say, %>% is so widespread that changing it this fundamentally is, I'd say, not a great idea.

The %$% is already available and signals that something is different from standard behaviour (which is a good thing).

If you want a DSL in a package (as in ggplot, say) where %>% is not sufficient, all you have to do is implement with.your_class and re-export %$%, and all is well.

Users will get the functionality and will be able to explicitly see that something is a bit different from the usual %>% and be informed.

I see your point. I don't want to keep banging on about this, but I'll give one last opinion.

Functions that take one class as input and return another are everywhere. But that is not a good reason to hardcode something like function_itChangesTheClass into the function's name (if you know what I mean). Users just don't care in practical terms.

The same goes for %>%. It is no big deal if it magically makes the destination function operate on a class it was not designed for. Win-win for developers and users.

For the majority of users, %>% is just a thing that passes output to input. If it also changes the class in some special cases, that would not change how they perceive it. It's up to the developer to make that clear (if it is worth it at all, see above).

Putting it in an extreme way (just to make the point), I don't think %$% would ever replace %>% in practice for users, with the result that the power of %$% will not be exploited by the community.

Making %>% flexible will be a game changer for developers, and cost nothing for the users.

My 2 cents :)

I think it would be a mistake to make %>% generic. It should be optional syntax sugar so that you know what to do when switching from pipe to nested form. Developers should design pipable APIs.

> Developers should design pipable APIs.

Can you please elaborate? (I might learn something new)

Unless I am missing something, the communication between pipeable APIs can be "broken" by incompatibility of input/output formats, which conversion could be included in the %>% itself. This would apply to really field-specific cases (e.g., genomics and tibble formats), which are the result of heavy abstraction of data containers away from the UI.

But anyway thanks guys for taking time to reply, feel free to close.

> Can you please elaborate?

Just that the functions should take data first.

> the communication between pipeable APIs can be "broken" by incompatibility of input/output formats, which conversion could be included in the %>% itself

Either there should be internal coercion in the functions themselves, or external conversion by the user. It's not the place of the pipe to convert data. It should only pass data.
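A minimal sketch of the two options, using base R only. The class `my_class` and the function `count_rows` are hypothetical, and a base data frame stands in for a tibble.

```r
obj <- structure(list(a = 1:3, b = 4:6), class = "my_class")

# Option 1: internal coercion. The data-first function converts its input
# itself, relying on an as.data.frame() method supplied by the class author.
as.data.frame.my_class <- function(x, ...) as.data.frame(unclass(x), ...)

count_rows <- function(data) {
  data <- as.data.frame(data)  # leaves data frames as they are, converts my_class
  nrow(data)
}
count_rows(obj)

# Option 2: external conversion. The user converts explicitly before calling.
nrow(as.data.frame(obj))
```

Because `count_rows` takes the data as its first argument, it can be piped or nested without the pipe knowing anything about the class.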

Thanks for the suggestion!