Gradient Descent

  • This is a method for approximating functions
  • It uses small movements in the opposite direction to the gradient.
  • The gradient is the derivative of the function
  • So, in order to do this, we need to be able to differentiate the function.
  • We’ll give this a shot in R.

Calculus

where_number <- function(expr) {
    pos <- grep("[0-9]+", x=expr)
}

where_variable <- function(expr) {
    pos <- grep("[A-Za-z]", expr)
}

where_operator <- function(expr) {
    pos <- grep("[+-^/]", expr)
}
##' Convert an unquoted expression to a vector of text
##'
##' Interestingly enough, R seems to add spaces around + and -, but not around exponentiation i.e. ^. I have no idea why this is. 
##' @title expression_to_text
##' @param expr an unquoted expression
##' @return a vector of characters representing the expression
##' @author richie
expression_to_text <- function(expr) {
    exprtext <- deparse(substitute(expr))
    expression_vec <- unlist(strsplit(exprtext, ""))
}

deriv <- function(expr) {
 #todo
    
}

So, above we handle the different kinds of values we can expect in our text. Right now, I only recognise addition, subtraction, multiplication, division and exponentiation. This is good enough for the time being (though clearly I’d want to at least be able to evaluate sin and cos at some point).

What I was going for here was that I could just pass R an unquoted expression of the form 20x^2+30x+3, but that didn’t work, because R couldn’t handle that kind of form without a global definition of x (and even then, the syntax wouldn’t work).

So this is needlessly complicated for something that didn’t work. A better implementation is below.

##' Convert mathematical expression stored as string  into its component parts
##'
##' Right now just splits on + and -
##' @title expression_to_text
##' @param string 
##' @return a vector of characters representing an equation
##' @author richie
##' @export
expression_to_text <- function(string) {
    res <- stringr::str_split(string, "\\+|-")
}

Next, I need to implement the rules of differentiation for these kinds of functions.

So all I really need is the exponentiation one, the variable one and the constant one.

diff_constant <- function(expr) {
    res <-  0
}
diff_variable <- function(expr) {
    
}

I ended up not using these, and representing each element as a polynomial.

So, I need to rethink my design. I need to split the function into powers, variables (x^1) and constants.

20x^2+30x+1

So, conceptually, that’s three expressions, each handled by a different rule. I feel like I could just handle the x to the power of 1 case along with the more general case.

So, the polynomial part is a transform of x^n -> nx^n-1

diff_poly <- function(base, exp) {
    newexp <- exp - 1
    res <- paste0(exp, "*", base, "^", newexp)
    }

So, the function above neatly returns a string with the result of the evaluated expression. We cheated though, and just manipulated strings. We don’t actually have strings of that form available, which is our next step.

This means that I should be representing every part of the expression in the form constant, varable and exponent.

I think that I went too general with my early expression function, it seems much better to split on + and - The variables are kinda annoying. I think that I’ll create a simple class to hold them

setwd('~/Dropbox/Code/Stats/polynomial/')
devtools::setup(".", rstudio=FALSE)
devtools::use_build_ignore("^#")
devtools::use_build_ignore("build_poly.R")
devtools::use_testthat()
devtools::use_package("stringr")
devtools::document()
devtools::check()
devtools::build()
devtools::install()

Polynomials

##' An S4 class for Expression objects 
##' @slot coefficient - an integer representing the coefficient
##' @slot variable - the indeterminate in the equation
##' @slot exponent the exponent portion of the object
##' See above
##' @export
setClass("Exp", slots=list(coefficient="numeric",
                                           variable="character",
                                 exponent="numeric"))
##' Convert a string to a polynomial
##'
##' Code assumes strings of the form 20x, 10^2 or 2xy^2
##' @title to_expression
##' @param string 
##' @return a polynomial object
##' @author richie
##' @export
to_expression <- function(string) {
    var <- stringr::str_extract(string, "([A-Za-z]+)")
    coeff <- stringr::str_extract(string, "([0-9]+)")
    message("var is: ", var, "\n", "coeff is: ", coeff)
    if(grepl("\\^", x=string)) {
        exp <- stringr::str_extract(string, "([0-9]+)$")
    }
    else {
        exp <- 0
    }
    exp <- methods::new("Exp", coefficient=as.integer(coeff),
               variable=var,
               exponent=as.integer(exp))
}

So we can just call eveything a polynomial and just implement a method for this kind of object right now.

Let’s define some getters, to avoid errors.

##' @export
setGeneric("exponent", function(object, ...) {
    standardGeneric("exponent")
})
##' @export
setGeneric("variable", function(object, ...) {
    standardGeneric("variable")
})

##' @export
exponent_expression <- function(object, ...) {
    exp <- object@exponent

}
##' @export
exponent <- function(object, ...) {
    standardGeneric("exponent")
}

setMethod("exponent", signature(object="Exp"),
          definition=exponent_expression)

##' @export
variable_expression <- function(object, ...) {
    object@variable
}
setMethod("variable", signature(object="Exp"),
          definition=variable_expression)

##'Get the coefficients of an expression
##'
##'
##' @param object an expression object
##' @param ...
##' @export
coef_expression <- function(object, ...) {
    object@coefficient
}
##' A coefficient method for Expression objects
##'
##' As above
##' @export
setMethod("coef",
    signature(object = "Exp"),
    definition=coef_expression
)
derive_polynomial <- function(polynomial) {
    base <- polynomial@coefficient
    exp <- polynomial@exponent
    res <- diff_poly(base, exp)
}
setGeneric("differentiate", def=derive_polynomial)

So, the code works, but it produces weird looking expressions because it doesn’t automatically simplify the expressions.

That’s presumably what we should do next.

I also need to be able to actually provide numerical results.

Maybe give the differentiate function an argument to either be symbolic or not. I should just return the polynomial, and defer the printing done by the current function.

##' differentiate a expression object
##'
##' returns a new expression object
##' @title diff_expression
##' @param expression 
##' @return a new expression
##' @author richie
##' @export
diff_expression <- function(object, ...) {
    newxp <- object@exponent - 1
    newcoeff <- object@exponent * coef(object)
    var <- variable(object)
    res <-  methods::new("Exp",
                         coefficient=newcoeff,
                         variable=var,
                         exponent = newxp)

}
##' A generic to perform differentiation
##'
##' Works for expression objects right now
##' @export
setGeneric("differentiate", function(object, ...) {
    standardGeneric("differentiate")
})
##' @export
setMethod("differentiate", signature(object="Exp"),
          definition=diff_expression)
provide <- function(package) {
    if(!require(package)) {install.packages(deparse(substitute(package)))}
    else {
        library(package)
    }
}

This is just a utility that I often need. It generates warnings if put in a package though.

So, next I need to represent an equation, which consists of one or more Polynomial objects.

Equation

##' An S4 class representing an Polynomial object
##' @slot text a character object containing an equation
##' @slot members a list of polynomial objects
##'
##' See above
##' @export
setClass("Polynomial", representation = list(text="character", members="list"))
##' convert a string in polynomial form to an Equation object
##'
##' I really need to rename some of this stuff
##' @title as_polynomial
##' @param string an equation of the form cx^n+/-cx^n.., c
##' @return an equation object representing the 
##' @author richie
##' @export
polynomial <- function(string) {
    textlist <- unlist(expression_to_text(string))
    ops <- unlist(stringr::str_extract_all(string, "\\+|\\-"))
    polylist <- sapply(textlist, to_expression)
    eq <- methods::new("Polynomial", text=string, members=polylist,operators=ops)
    return(eq)
}
diff_polynomial <- function(eq) {
    #todo
}

So, the trouble with my equation class is that it loses the addition and subtraction operators. Not entirely sure how to handle this.

There are some options:

  • create operators which represent addition/subtraction
  • Add the information to the end of each polynomial.

Or I could just punt on it and hack together some gradient descent.

Gradient Descent ()

gradient_descent <-
    function(f, data, alpha=0.01, iterations) {
        reslist <- vector(mode="list", length=iterations)
        #this is magic
        gradient <- differentiate(f)
        for(i in seq_along(iterations)) {
            message("iteration: ", i)
            x <- x - alpha*gradient(x)
            reslist[[i]] <- x
        }
        
}

So, this looks nice. The only problem is that right now, my differentiate function isn’t going to work. So, now I need to handle the stuff I said I’d ignore above. Note: code may or may not have been shamelessly copied from Wikipedia.

Functions returning functions, oh my

So, right now we have an equation object, which consists of a text string describing the function, and the constituent polynomials. We need to convert this into a function which can be applied to the input data (i.e. guess).

expression_to_function <- function(expression) {
    return(function(x) {
        res <-   polynomial@coefficient  * x ^(polynomial@exponent)
    })}

So, that was easier than expected. It’s going to break unless I make some changes to my code though. I need to set the exponent value to 1, where it doesn’t exist. Currently, I believe it will take zero, which will cause incorrect answers.

Let’s make sure it doesn’t work.

poly_wrong <- to_polynomial("20x")
wrong_func <- polynomial_to_function(poly_wrong)
r <- wrong_func(1) #should be 20, will be zero
print(r)
var is: x
coeff is: 20
[1] 20

Hmmm, it appears that I was incorrect. Weird.

equation_to_function <- function(equation) {
    string <- equation@text
    diff <- lapply(equation@members, differentiate)
}

So, I should write some utility methods. S4 is strict and everything, but you can completely ignore all of the validity checks just by using @ [fn:1]. Luckily, I would never do that, and hence why I’m writing some extractor functions.

Everytime I write this boilerplate, I die a little.

Functions, more generally

Let’s re-write the polynomial_to_function a little more cleanly.

##' convert an expression object to a function over the variable(s)
##'
##' Right now only works for one variable functions
##' @title polynomial_to_function
##' @param polynomial 
##' @return a function which takes an argument x and computes the value of the function
##' @author richie
##' @export
expression_to_function <- function(expression) {
    if(exponent(expression)>=1) {
        return(function(x) {
            res <-   coef(expression)  * x ^(exponent(expression))
        })}
    else {
            return(function() {
                res <-  coef(expression) * 1 ^(exponent(expression))
            })

        }
}

We can now tangle this version into the package we’ve been building.

Tests

These are essential. I’m pretty sure that I’ve silently broken a bunch of functionality with all my casual renaming and changing of stuff. We should write some tests that actually specify behaviour so that we notice such breakages immediately.

require(testthat)
test_that("20x can be converted to an expression",
          expect_is(to_expression("20x"), "Exp"))
test_that("20x can be differentiated",{
    expect_is(differentiate(to_expression("20x")), "Exp")})

Cool, found a breakage. It appears that differentiate believes that the arguments of the expression are coefficient, when they are not.

Handling Operators for Polynomials

So, we have two major classes defined here. The first Expression represents a component of a Polynomial which consists of one or more expression objects expressed as an equation.

Currently, Polynomials look like this.

print(getClass ("Polynomial"))
Class "Polynomial" [package "polynomial"]

Slots:
                          
Name:       text   members
Class: character      list

What it needs is a third component, operators which contain a vector of addition/subtraction operators. These can then be used to recombine the elements (for printing), or to arrange the functions so that we can finally finish our gradient descent.

Polynomials, Redux

##' An S4 class representing an Polynomial object
##' @slot text a character object containing an equation
##' @slot members a list of polynomial objects
##'@slot operators a vector of additions/subtractions
##' See above
##' @export
setClass("Polynomial", representation = list(text="character", members="list", operators="character"))
##' convert a string in polynomial form to an Equation object
##'
##' I really need to rename some of this stuff
##' @title polynomial
##' @param string an equation of the form cx^n+/-cx^n.., c
##' @return an equation object representing the string
##' @author richie
##' @export
polynomial <- function(string) {
    textlist <- unlist(expression_to_text(string))
    ops <- unlist(stringr::str_extract_all(string, "\\+|\\-"))
    polylist <- sapply(textlist, to_expression)
    eq <- methods::new("Polynomial",
                       text=string,
                       members=polylist,
                       operators=ops)
    return(eq)
}
operators <- function(polynomial) {
    ops <- polynomial@operators
}
##' A function to differentiate polynomial objects
##'
##' See above
##' @title diff_polynomial
##' @param polynomial 
##' @return a function
##' @author richie
##' @export
diff_polynomial <- function(polynomial, value) {
    ops <- polynomial@operators
    members <- polynomial@members
    diff_members <- lapply(members, differentiate)
    diffed_func <- lapply(diff_members, expression_to_function)
    if(missing(value)) {
        value <-  0
    }
    myenv <- new.env
    myenv$x <- value
    res <- lapply(diffed_func, function (x) {eval(x, envir=myenv)}(value))
}
add_func <- function(f, g) {
    return(function() {
        f + g
    })
}
subtract_func <- function(f, g) {
    return(function()
           f - g)
}
combine_fns <- function(funcs, ops) {
    for(i in 1:length(ops)) {
        if(ops[i]=='+') {
            f <- add_func(funcs[i], funcs[i+1])
        }
        if (ops[i]=='-') {
            f <- subtract_func(f, funcs[i+1])
        }
    }
    return(f)
}

Combining functions is really hard.

I’m pretty sure the whole thing above isn’t going to work. However, given precedence rules, it actually doesn’t need to, as all of the sums can be done on the results of the expressions, rather than on the expressions themselves. This means I’ll need to keep around the operators along with the functions.

Footnotes

[fn:1] Of course. R’s not a monster, you know