gdemin/excel.link

Use active bindings

stefanfritsch opened this issue · 8 comments

Hi,

great package. I know a lot of people that might try R if it was integrated into Excel.

An issue I have is that currently the output of xl.connect.table don't really behave like other R objects and I wanted to ask if you had looked into using active bindings instead:

DT<-data.frame(x=1:10, y=2:11)

makeActiveBinding(
  "DF",
  function(.new_value=NULL)
  {
    if (is.null(.new_value)) {
      return(DT)
    } else {
      DT<<-.new_value
    }
  },
  parent.frame())

DF
DF[1,1]<-400
DT

This means that the value of xl.connect.table can't simply be assigned to a variable but you could create something akin to pryr's %<a-% operator:

DF %<xl-% a1
DF %<xlc-% a1:b24

Thx
Stefan

This would be an example version:

`%<xl-%`<-function(.x, .value)
{
  ExcelObject<-xl.connect.table(.value)

  assignment<-function(.new_value=NULL)
  {
    if (is.null(.new_value)) {
      return(ExcelObject[])
    } else {
      dims<-dim(ExcelObject[])
      ExcelObject[seq_len(dims[1]), seq_len(dims[2])]<<-.new_value
    }
  }

  makeActiveBinding(
    deparse(substitute(.x)),
    assignment,
    parent.frame()
  )
}

It's very slow because replacing the whole region is very slow, I hope that can be fixed. :)

Hi!
Thank you for you suggestions.
Do I correctly understand that you want to have possibility to bind excel range to symbols in R?
Something like this?

binding_to_range  %<xl-% a1:a5 # bind to a1:a5
binding_to_range # print values of Excel range
binding_to_range <- 45 # fill range with value

I've modified your code for better performance:

`%<xl-%`<-function(.x, .value)
{

  .value = deparse(substitute(.value))
  assignment<-function(.new_value=NULL)
  {
    if (is.null(.new_value)) {
      return(xl[[.value]])
    } else {
      xl[[.value]] <-.new_value
    }
  }

  makeActiveBinding(
    deparse(substitute(.x)),
    assignment,
    parent.frame()
  )
}

# bind a1:c3 excel range
binding %<xl-% a1:c3 
# read range
binding
# write to range
binding <- matrix(1:9,3)

Hi,

yeah, that is a lot faster. :)

I did a slight modification to allow for the four xl* functions. It would be nice if there was a way to select the current region the way the first function did it, though.

But xl.current.region() doesn't work the way the xl* functions do, afaik?

AssignmentGenerator<-function(.xlfct, .env)
{
  # evaluate to ensure it's in the local environment.
  # Otherwise lazy eval would prevent that.
  .xlfct

  env<-parent.frame()

  function(.x, .value)
  {
    .x<-deparse(substitute(.x))

    # allow specification of rhs in quotes, "a1:b100", or not, a1:b100.
    .value<-substitute(.value)
    if (!is.character(.value))
      .value<-deparse(.value)

    # makeActiveBinding doesn't overwrite non-active bindings
    if (exists(.x, env))
      remove(list=.x, envir=env)

    # assign active binding:
    assignment<-function(.new_value=NULL)
    {
      if (is.null(.new_value)) {
        return(.xlfct[[.value]])
      } else {
        stopifnot(is.data.frame(.new_value),
                  dim(.xlfct[[.value]]) == dim(.new_value))
        .xlfct[[.value]]<-.new_value
      }
    }

    makeActiveBinding(
      .x,
      assignment,
      env
    )
  }
}

`%<xl-%`<-AssignmentGenerator(xl)
`%<xlc-%`<-AssignmentGenerator(xlc)
`%<xlr-%`<-AssignmentGenerator(xlr)
`%<xlrc-%`<-AssignmentGenerator(xlrc)

For current region:

CurrentRegionAssignmentGenerator<-function(row.names = FALSE, col.names = FALSE, na="", drop = TRUE)
{

  env<-parent.frame()

  function(.x, .value)
  {

    .x<-deparse(substitute(.x))

    # allow specification of rhs in quotes, "a1:b100", or not, a1:b100.
    .value<-substitute(.value)
    if (!is.character(.value))
      .value<-deparse(.value)

    # makeActiveBinding doesn't overwrite non-active bindings
    if (exists(.x, env))
      remove(list=.x, envir=env)

    if (row.names){
      if(col.names){
        .xlfct <- excel.link::xlrc
      } else {
        .xlfct <- excel.link::xlr
      }

    } else {
      if(col.names){
        .xlfct <- excel.link::xlc
      } else {
        .xlfct <- excel.link::xl
      }        
    }

    # assign active binding:
    assignment<-function(.new_value=NULL)
    {
      if (is.null(.new_value)) {
        xl.current.region(.value, row.names = row.names, col.names = col.names, na = na, drop = drop)
      } else {

        xl()$Range(.value)$CurrentRegion()$Clear() # to prevent overlapping with old values
        .xlfct[[.value]] <- .new_value

      }
    }

    makeActiveBinding(
      .x,
      assignment,
      env
    )
  }
}


`%<cr-%`<-CurrentRegionAssignmentGenerator(row.names = FALSE, col.names = FALSE)
`%<crc-%`<-CurrentRegionAssignmentGenerator(row.names = FALSE, col.names = TRUE)
`%<crr-%`<-CurrentRegionAssignmentGenerator(row.names =TRUE, col.names = FALSE)
`%<crrc-%`<-CurrentRegionAssignmentGenerator(row.names =TRUE, col.names = TRUE)

Hi,

(1.) as far as I see it your last function only works if .value is the "upper left corner" of the region.

e.g.

xl[a1]<-iris
xl.current.region("b4")

would return the full table but it would result in an incorrect result at .xlfct[[.value]] <- .new_value.

Would it be hard to write a function that returns the current region as a string? e.g.

> GetCurrentRegion("b4")

[1] "a1:e151"

Then you wouldn't even need a different generator function but just a parameter that checks whether .value should be taken as is or if it should be translated with GetCurrentRegion.

(2.) On a different note, the dimension check in the other function, dim(.xlfct[[.value]]) == dim(.new_value) probably creates more problems than it solves. I put it there because removing rows leads to strange errors (because those rows will still exist in Excel). A better solution is probably to store the size of the data.frame in the environment of the generator (i.e. where env<-parent.frame() sits now and to update that size if necessary.

For that it would be nice if there was a function to translate excel strings to rows and columns and back (if there isn't one already :)

> TranslateExcel("a1:e151")

$rows
[1]   1 151

$columns
[1] 1 5

> TranslateExcel(.to.excel = list(c(1, 151), c(1,5)))

[1] "a1:e151"

Hi,

as far as I see it your last function only works if .value is the "upper left corner" of the region.

Yes, it's a bug. Moreover, there is another serious bug in the above functions (or one may consider it as bad feature). They bind to R symbol character range name so if we bind, for example, range 'A1' then after changing active sheet binding will point to 'A1' on another sheet. To rewrite these functions in appropriate manner I need to use internal package functions. I will try to release next package version with all fixes (including TranslateExcel) at the beginning of July.

A better solution is probably to store the size of the data.frame in the environment of the generator (i.e. where env<-parent.frame() sits now and to update that size if necessary.

Yes, it seems it is most consistent behavior. Also I think one should clear Excel range before resizing so there will be no garbage from old data.

Updated version (0.9.1) on CRAN now.
Added the following functions:

xl.bind.current.region
xl.bind.range
%<cr-%
%<crc-%
%<crr-%
%<crrc-%
%<xl-%
%<xlc-%
%<xlr-%
%<xlrc-%
xl.binding.address
xl.index2address
xl.address2index

And some other improvements.