Error when trying to view session (or other methods)
I am trying to run the example from the rvest page below (as well as other examples and my own code). I am using chromote 0.2.0 and rvest 1.0.4 on a Mac.
library(rvest)
sess <- read_html_live("https://www.forbes.com/top-colleges/")
sess$session$is_active()
sess$view()
rows <- sess %>% html_elements(".TopColleges2023_tableRow__BYOSU")
rows %>% html_element(".TopColleges2023_organizationName__J1lEV") %>% html_text()
rows %>% html_element(".grant-aid") %>% html_text()
The sess object seems to be created just fine and the session is active. However, sess$view() (and the next line as well) errors out with the following message:
Error in new_chromote && !self$session$is_active() :
invalid 'x' type in 'x && y'
2. private$check_active()
1. sess$view()
Looking through live.R, there is the following code, which I think means new_chromote is the invalid x in the error.
check_active = function() {
  if (new_chromote && !self$session$is_active()) {
    suppressMessages({
      self$session <- self$session$respawn()
      private$root_id <- self$session$DOM$getDocument(0)$root$nodeId
    })
  }
}
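A minimal sketch of how that message can arise, assuming the flag is still NULL when the check runs (an assumption, not something confirmed here). The name flag is a hypothetical stand-in for new_chromote:

```r
# Hypothetical stand-in for the package-level flag; NULL is an assumption.
flag <- NULL

# `&&` needs a logical (or coercible) left-hand side, so NULL raises an error.
msg <- tryCatch(
  flag && TRUE,
  error = function(e) conditionMessage(e)
)
print(msg)

# isTRUE() is one defensive alternative: it returns FALSE for NULL
# instead of raising an error.
safe <- isTRUE(flag) && TRUE
print(safe)
```

The isTRUE() guard is shown only as one possible way to make such a check tolerant of NULL; it is not taken from rvest.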
In zzz.R there is
new_chromote <- NULL
.onLoad <- function(...) {
  if (is_installed("chromote")) {
    new_chromote <<- utils::packageVersion("chromote") >= "0.1.2.9000"
  } else {
    # If chromote is not installed yet, assume it's not new to be safe.
    new_chromote <- FALSE
  }
  invisible()
}
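One detail in the snippet above is that the else branch uses <- while the if branch uses <<-. A minimal sketch of the difference, using hypothetical names (flag, on_load_sketch) rather than rvest's own code:

```r
# Hypothetical stand-in for a package-level variable.
flag <- NULL

# Hypothetical stand-in for a load hook with the same branch structure.
on_load_sketch <- function(installed) {
  if (installed) {
    flag <<- TRUE   # super-assignment: updates the outer `flag`
  } else {
    flag <- FALSE   # local assignment: the outer `flag` is left untouched
  }
  invisible()
}

on_load_sketch(installed = FALSE)
after_else <- flag   # still NULL, because the else branch assigned locally
print(is.null(after_else))

on_load_sketch(installed = TRUE)
after_if <- flag     # now TRUE, because `<<-` reached the outer variable
print(after_if)
```

If something like this is what happened, it would only apply when chromote was not installed at the time rvest was loaded, which may or may not match this setup.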
If I run new_chromote <<- utils::packageVersion("chromote") >= "0.1.2.9000" in the console, it returns TRUE.
If I run sess$session$is_active() in the code chunk, it is TRUE.
What am I missing that causes this error, or is there a bug?
Are you still experiencing this problem? I can't see how it could come about 😞
Thanks so much for checking. I have not looked at this in a while, since the chromote session was working. However, as of today, the problem has gone away. Not sure why.
Thanks!
OBTW, the CSS in the example in the vignette is out of date, but that was not the issue. The two examples below work (as of today...)
library(rvest)
sess <- read_html_live("https://www.forbes.com/top-colleges/")
sess$session$is_active()
# sess$view()
my_table <- html_element(sess,".ListTable_listTable__-N5U5") |> html_table()
rows <- sess %>% html_elements(".ListTable_tableRow__P838D") |> html_text()
# To get data on all 100 movies on the IMDB web page
sess <- read_html_live("https://www.imdb.com/list/ls055592025/")
sess$session$is_active()
# sess$view()
sess$scroll_by(10000,0)
sess$scroll_by(10000,0)
sess$scroll_by(10000,0)
sess$scroll_by(10000,0)
sess$get_scroll_position()
sess |>
html_elements(css = ".dli-parent") ->
imdb_100_elements
For completeness, the second example with IMDB works fine interactively, but when rendering, one has to first use httr::user_agent() to set a user agent string other than NULL to avoid being blocked by IMDB as a bot. It does not really matter which agent string is used.
# Define User-Agent string
user_agent_str <- "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.3"
#or
user_agent_str <- "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Safari/605.1.15"
# Set global User-Agent
ua <- httr::user_agent(user_agent_str)