Problem with chromote package
Dear community,
I have created a Shiny app that scrapes the page title from the Google home page ("https://google.com") using the R chromote package. The app works fine when run on my desktop. However, once it is hosted on shinyapps.io, two users cannot use it concurrently. The code for the app is below:
library(shiny)
library(curl)
library(chromote)
library(pagedown)
ui <- fluidPage(
  textOutput("result")
)
server <- function(session, input, output) {
  driver <- ChromoteSession$new()
  driver$Page$navigate("https://google.com") # open Google page
  Sys.sleep(7)
  output$result <- renderText(
    # scrape Google title
    driver$Runtime$evaluate('document.querySelector("title").innerText')$result$value
  )
}
shinyApp(ui = ui, server = server)
Output:
- Click on: https://sale4cast.shinyapps.io/findGoogleTitle/
- Wait 5 seconds.
- Get the google title "Google"
Question: How can two users access the app concurrently via shinyapps.io?
Best Regards,
SaleForecast
I think the problem is the use of Sys.sleep(). That will block the entire R process.
You should do something like this:
library(shiny)
library(chromote)
ui <- fluidPage(
  textOutput("result")
)
server <- function(session, input, output) {
  driver <- ChromoteSession$new()
  p <- driver$Page$loadEventFired(wait_ = FALSE)
  driver$Page$navigate("https://google.com", wait_ = FALSE)
  output$result <- renderText({
    p$then(function(value) {
      # scrape Google title
      driver$Runtime$evaluate('document.querySelector("title").innerText')$result$value
    })
  })
}
shinyApp(ui, server)
To properly navigate to a page and wait for it to load without blocking the R process, see this section of the README:
https://github.com/rstudio/chromote?tab=readme-ov-file#taking-a-screenshot-of-a-web-page
The example above also makes use of Promises in Shiny. See here for more information:
https://rstudio.github.io/promises/articles/promises_06_shiny.html
@saleforecast1 Your app currently does work on shinyapps.io. Maybe your example doesn't completely reproduce your issue or I don't understand what you mean by "work" or "two users can not use it concurrently". But if I open https://sale4cast.shinyapps.io/findGoogleTitle/ in two different tabs or browsers, they both eventually (after about 7 seconds) show me the word "Google".
This question was also cross-posted to https://forum.posit.co/t/problem-with-chromote-package/186346
Have you tried the code that I provided? The problem is that your Sys.sleep() blocks the entire process.
Oh, in that case, what Winston said is exactly right:
I think the problem is the use of Sys.sleep(). That will block the entire R process.
If you put Sys.sleep(7) in your app, it causes your app to wait 7 seconds. Sys.sleep() blocks R from doing anything until it finishes. If you open a second tab with the app while the first tab is processing, the second tab has to wait for the first user's app to finish loading, and then has to wait 7 more seconds.
Here's a simple diagram outlining the interaction.
sequenceDiagram
User 1->>+Shiny: Opens app
User 2-->Shiny: Opens app
Shiny-->>-User 1: responds after 7s
activate Shiny
Note over Shiny: starts user 2 request
Shiny-->>-User 2: responds after 7+s
To fix it please follow Winston's guidance:
To properly navigate to a page and wait for it to load without blocking the R process, see this section of the README: rstudio/chromote#taking-a-screenshot-of-a-web-page
The example above also makes use of Promises in Shiny. See here for more information: rstudio.github.io/promises/articles/promises_06_shiny.html
library(shiny)
library(curl)
library(chromote)
library(pagedown)
ui <- fluidPage(
  textOutput("result")
)
server <- function(session, input, output) {
  driver <- ChromoteSession$new()
  p <- driver$Page$loadEventFired(wait_ = FALSE)
  driver$Page$navigate("https://google.com", wait_ = FALSE)
  p$then(function(value){
    googleSearchText <- "4 star hotel in barcelona"
    driver$Runtime$evaluate(paste0('document.querySelector("textarea").value = "', googleSearchText,'"'))
    driver$Runtime$evaluate('document.querySelector("input[aria-label=\'Google Search\']").click()')
  })$then(function(value){
    print(driver$Runtime$evaluate('document.querySelector("title").innerText'))
  })
}
shinyApp(ui, server)
@wch, can you please say why this code doesn't return the title? It returns the error "TypeError: Cannot read properties of null (reading 'innerText')\n at :1:32"
It sounds like document.querySelector('title') isn't returning anything.
I think the problem is that clicking on the search button causes another page load, and when you grab the <title> in the middle of that page load, it might be happening too early, before there is a <title> element.
I believe that you'll have to wait for another loadEventFired inside of the promise chain.
library(shiny)
library(chromote)
ui <- fluidPage(
  textOutput("result")
)
server <- function(session, input, output) {
  driver <- ChromoteSession$new()
  p <- driver$Page$loadEventFired(wait_ = FALSE)
  driver$Page$navigate("https://google.com", wait_ = FALSE)
  p$then(function(value){
    googleSearchText <- "4 star hotel in barcelona"
    # Register for the next load event before triggering the search
    p2 <- driver$Page$loadEventFired(wait_ = FALSE)
    driver$Runtime$evaluate(paste0('document.querySelector("textarea").value = "', googleSearchText,'"'))
    driver$Runtime$evaluate('document.querySelector("input[aria-label=\'Google Search\']").click()')
    p2
  })$then(function(value){
    v <- driver$Runtime$evaluate('document.querySelector("title").innerText')
    print(v)
  })
}
shinyApp(ui, server)
Note that p2 is created inside the first $then() function, and then it is returned from that function. The way that promises work, this means that the next function that's chained with $then() will wait until that promise resolves before it runs. See the docs for the promises package for more information on how promises work. The API is very similar to JavaScript promises.
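The chaining rule described above can be seen without a browser. The following is only a minimal sketch using the promises and later packages; `slow_step()` is a hypothetical stand-in for a slow event such as a page load and is not part of chromote.

```r
library(promises)
library(later)

# Hypothetical stand-in for a slow browser event:
# a promise that resolves to `label` after `secs` seconds.
slow_step <- function(label, secs) {
  promise(function(resolve, reject) {
    later::later(function() resolve(label), secs)
  })
}

steps <- character(0)

chain <- promise_resolve("page 1 loaded")$then(function(value) {
  steps <<- c(steps, value)
  # Returning a promise here makes the next $then() wait for it
  slow_step("page 2 loaded", 0.2)
})$then(function(value) {
  # `value` is what the returned promise resolved to
  steps <<- c(steps, value)
})

# Outside of Shiny nothing drives the event loop, so run it by hand
while (!later::loop_empty()) later::run_now(0.1)

print(steps)
```

The second callback only runs once the promise returned by the first one has resolved, so `steps` should end up holding both labels in order.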
One other thing I want to mention: the code you started with uses a mix of sync and async programming, and calls to synchronous Chromote functions inside of asynchronous functions. It works in this case but might do unexpected things for more complicated code. It's probably best to stick to just async code for complex tasks, but that will require a good understanding of how these promises work.
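As a rough illustration of what "just async code" could look like for the title-scraping case, here is a sketch in which every chromote call passes `wait_ = FALSE` and therefore returns a promise. The function name `scrape_title_async()` is made up for this example, and `driver` is assumed to be a ChromoteSession.

```r
library(promises)

# `driver` is assumed to be a chromote ChromoteSession.
# Returns a promise that resolves to the page title.
scrape_title_async <- function(driver, url = "https://google.com") {
  # Register for the load event before navigating so it cannot be missed
  loaded <- driver$Page$loadEventFired(wait_ = FALSE)
  driver$Page$navigate(url, wait_ = FALSE)

  loaded$then(function(value) {
    # wait_ = FALSE makes evaluate() return a promise instead of blocking
    driver$Runtime$evaluate(
      'document.querySelector("title").innerText',
      wait_ = FALSE
    )
  })$then(function(res) {
    # Extract the string from the evaluation result
    res$result$value
  })
}
```

In a server function this could be used as `output$result <- renderText(scrape_title_async(driver))`, since Shiny render functions accept promises. Adding a `$catch()` at the end of a chain is one way to handle rejections explicitly instead of leaving them as unhandled promise errors.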
Thanks for your response, @wch. I really appreciate your answer, and it works great. However, I still get an error when I run this app from multiple devices on shinyapps.io.
Error:
"Unhandled promise error: Chromote: timed out waiting for response to command Page.disable"
"Unhandled promise error: Chromote: timed out waiting for event Page.loadEventFired"
Code:
library(shiny)
library(curl)
library(chromote)
library(pagedown)
ui <- fluidPage(
  tableOutput("result")
)
server <- function(session, input, output) {
  driver <- ChromoteSession$new()
  p <- driver$Page$loadEventFired(wait_ = FALSE)
  driver$Page$navigate("https://google.com", wait_ = FALSE)
  p$then(function(value){
    googleSearchText <- "4 star hotel in barcelona"
    p2 <- driver$Page$loadEventFired(wait_ = FALSE)
    driver$Runtime$evaluate(paste0('document.querySelector("textarea").value = "', googleSearchText,'"'))
    driver$Runtime$evaluate('document.querySelector("input[aria-label=\'Google Search\']").click()')
    p2
  })$then(function(value){
    p3 <- driver$Page$loadEventFired(wait_ = FALSE)
    driver$Runtime$evaluate('document.querySelector("div.R2w7Jd").click()')
    driver$Runtime$evaluate('document.querySelector("div.JWXKNd").click()')
    p3
  })$then(function(value){
    priceElement <- driver$Runtime$evaluate(
      'var elements = document.querySelectorAll(".K1smNd > c-wiz[jsrenderer=\'hAbFdb\'] .PwV1Ac");
      var elementPrices = [];
      elements.forEach(function(element) {
        elementPrices.push(element.innerText);
      });
      elementPrices.join("@");'
    )
    print(priceElement)
  })
}
shinyApp(ui, server)
Can you please help me sort out the problem?
I don't know for sure, but my guess would be that there's not enough time between the two click() commands in the block with p3.
Thanks for your response, @wch. Can you please say how to ensure enough time between the two click() events?
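One possible approach, offered only as a sketch and not confirmed anywhere in this thread, is to put a non-blocking timer between the two clicks instead of Sys.sleep(). The helpers `delay()` and `click_with_pause()` below are hypothetical names built on the promises and later packages, `driver` is assumed to be a ChromoteSession, and the one-second default pause is a guess.

```r
library(promises)
library(later)

# Hypothetical helper: a promise that resolves after `secs` seconds
# without blocking the R process (unlike Sys.sleep()).
delay <- function(secs) {
  promise(function(resolve, reject) {
    later::later(function() resolve(NULL), secs)
  })
}

# Sketch: click the first element, pause, then click the second one.
# `driver` is assumed to be a ChromoteSession. The selectors must not
# contain single quotes, because they are pasted into a JS string.
click_with_pause <- function(driver, first_selector, second_selector, secs = 1) {
  click_js <- function(selector) {
    sprintf("document.querySelector('%s').click()", selector)
  }
  driver$Runtime$evaluate(click_js(first_selector), wait_ = FALSE)$then(function(value) {
    # Returning the timer promise postpones the next step
    delay(secs)
  })$then(function(value) {
    driver$Runtime$evaluate(click_js(second_selector), wait_ = FALSE)
  })
}
```

In the app above, the block with p3 could call `click_with_pause(driver, "div.R2w7Jd", "div.JWXKNd")` and chain `$then(function(value) p3)` onto it so that the following step still waits for the page load. A fixed pause is a blunt tool: whether one second is enough depends on the page.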