bmschmidt/wordVectors

Using formulas in `closest_to()` may result in an error (R 4.2.0)

Closed this issue · 4 comments

Using formulas in closest_to() can result in the following error:

Error in if (class(object) == "VectorSpaceModel") { : 
  the condition has length > 1

This is an error starting in R v4.2.0 (currently the latest version): “Calling if() or while() with a condition of length greater than one gives an error rather than a warning.”
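To see the R 4.2.0 behavior in isolation, here is a minimal sketch in base R that does not use wordVectors at all. It assumes the object being tested is a plain matrix, which is only a guess at what reaches the failing `if()`. The variable names are illustrative.

```r
# A plain matrix; in R >= 4.0 its class vector has two elements.
m <- diag(2)
cls <- class(m)

# Comparing a length-2 class vector to a string gives a length-2 logical.
cond <- cls == "VectorSpaceModel"

# Feed that condition to if() and record what happens:
#   R >= 4.2       : error ("the condition has length > 1")
#   R 4.0 and 4.1  : warning, only the first element is used
#   R < 4.0        : class(m) has length one, so if() runs normally
result <- tryCatch(
  {
    if (cond) "yes" else "no"
  },
  error = function(e) "error",
  warning = function(w) "warning"
)

print(cls)
print(result)
```

The outcome depends on the R version, so no single expected output is given here.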

Here’s a short script that demonstrates the problem:

library(magrittr)
library(wordVectors)

w2vModel <- read.vectors('https://github.com/NEU-DSG/word-vector-interface/raw/main/data/eebo.bin')

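# Works: single word
w2vModel %>% closest_to('conduct')
# Works: formula with subtraction
w2vModel %>% closest_to(~'conduct'-'manners')
# Doesn't work: formula with addition triggers the error
w2vModel %>% closest_to(~'conduct'+'manners', 20)
# Works: passing vectors directly instead of a formula
w2vModel %>% closest_to(project(w2vModel[['conduct']], w2vModel[['manners']]), 20)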

The error description suggests that square_magnitudes() in matrixFunctions.R might be the problem, but I’m not sure.

Here’s the error with traceback:

Error in if (class(object) == "VectorSpaceModel") { : 
  the condition has length > 1
5. square_magnitudes(y)
4. tcrossprod(square_magnitudes(x), square_magnitudes(y))
3. cosineSimilarity(matrix, vector)
2. closest_to(., ~"conduct" + "manners")
1. w2vModel %>% closest_to(~"conduct" + "manners")

Since the problem occurs when using the formula shorthand, I’m thinking that something’s gone wrong in sub_out_formula() (called by cosineSimilarity()). It’s not failing, but its output ultimately causes the error in square_magnitudes().

Taking the error at face value, it looks like I may just have used a shorthand for testing class inheritance that is no longer allowed. Let me take a quick look.

My guess is that the object is a matrix, and that this change in 4.2 is colliding with a separate change from the R 4.0 release notes, since this ran without warnings in R 3.x versions:

“matrix objects now also inherit from class "array", so e.g., class(diag(1)) is c("matrix", "array"). This invalidates code incorrectly assuming that class(matrix_obj) has length one.”

I have pushed a patch change that looks only at the first class element.
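For anyone hitting the same pattern elsewhere, here is a rough sketch of the idea behind that kind of fix. It is not the actual patch. The helper names `is_vsm_first_class` and `is_vsm_inherits` are hypothetical, and `fake_vsm` is an S3 stand-in made with `structure()`, not a real object from the package. The second helper uses base R's `inherits()`, a common alternative to comparing `class()` directly.

```r
# Hypothetical helpers for illustration; not the package's code.

# Option 1: compare only the first class element (the approach the patch describes).
is_vsm_first_class <- function(object) {
  class(object)[1] == "VectorSpaceModel"
}

# Option 2: let inherits() check the whole class vector.
is_vsm_inherits <- function(object) {
  inherits(object, "VectorSpaceModel")
}

# A plain matrix and an S3 stand-in that carries the class name first.
plain <- diag(2)
fake_vsm <- structure(diag(2), class = c("VectorSpaceModel", "matrix", "array"))

# Both helpers always return a single TRUE/FALSE, so they are safe inside if().
checks <- c(
  first_plain    = is_vsm_first_class(plain),
  first_fake     = is_vsm_first_class(fake_vsm),
  inherits_plain = is_vsm_inherits(plain),
  inherits_fake  = is_vsm_inherits(fake_vsm)
)
print(checks)
```

The first-element check is the smaller change, but it depends on the class of interest being listed first. `inherits()` does not depend on that ordering.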

I only have R 4.1 on my local machine. If you have the free time to reinstall and check the patch on 4.2, that would be much appreciated.

I'm also happy to give anyone at Northeastern DSG push access to this repo because I haven't had occasion to use it myself in a couple years.

@bmschmidt, reinstalled and yep, you got it! Thanks so much!

The WWP might be able to take on some maintenance work in the future; we use and teach with this package a lot. I’ll talk to Julia and Sarah.