How to check for status code?
mjenczmyk opened this issue · 3 comments
Hi,
Is there any way to check the status code when making an HTTP request? For instance, when executing
browser.post(loginURL, Map(
  "email" -> email,
  "password" -> password,
  "op" -> "Login",
  "form_build_id" -> getLoginFormID,
  "form_id" -> "packt_user_login_form"
))
how can one check whether the status code is 200 (successful login) or not? One way would be to write a custom HtmlValidator to validate the returned Document, but can I do it explicitly by checking the status code?
Hi @mjenczmyk! The behavior of a Browser when a request returns a non-200 status code is implementation-dependent. In the case of jsoup, for example, an HttpStatusException is thrown:
scala> JsoupBrowser().get("https://example.com/non_existing_page")
org.jsoup.HttpStatusException: HTTP error fetching URL
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:760)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:706)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:299)
(...)
You can catch this exception to check the specific status code that was returned.
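A minimal sketch of that approach, assuming the jsoup-backed JsoupBrowser. The helper names withStatus and checkPage are hypothetical and not part of scala-scraper; getStatusCode is the accessor jsoup exposes on HttpStatusException.

```scala
import net.ruippeixotog.scalascraper.browser.JsoupBrowser
import net.ruippeixotog.scalascraper.model.Document
import org.jsoup.HttpStatusException

// Hypothetical helper (not part of scala-scraper): evaluates a request and
// returns Left(statusCode) if jsoup throws an HttpStatusException.
// Any other exception (e.g. a network failure) still propagates.
def withStatus(request: => Document): Either[Int, Document] =
  try Right(request)
  catch { case e: HttpStatusException => Left(e.getStatusCode) }

// Hypothetical usage: report the outcome of a GET request.
def checkPage(url: String): Unit =
  withStatus(JsoupBrowser().get(url)) match {
    case Right(doc) => println(s"OK: ${doc.title}")
    case Left(code) => println(s"Request failed with status $code")
  }
```

The same wrapper should work around browser.post(...), since the exception comes from the underlying jsoup connection rather than from a particular HTTP method.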
Note that scala-scraper does not intend to provide full-fledged HTTP clients. While the Browser class surely helps in the most common use cases, dealing with more complex HTTP responses may require using a proper external HTTP client and passing the response body to Browser#parseString manually.
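As a rough illustration of the external-client route, the sketch below uses the JDK's built-in java.net.http.HttpClient (Java 11+), which is an assumption chosen for this example; any other HTTP client would do. The helpers parseIfOk and fetch are hypothetical names.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import net.ruippeixotog.scalascraper.browser.JsoupBrowser
import net.ruippeixotog.scalascraper.model.Document

// Hypothetical helper: parse the body only when the status code is 200,
// otherwise hand the status code back to the caller.
def parseIfOk(status: Int, body: String): Either[Int, Document] =
  if (status == 200) Right(JsoupBrowser().parseString(body))
  else Left(status)

// Hypothetical helper: perform the request with the JDK client, then
// pass the response body to Browser#parseString via parseIfOk.
def fetch(url: String): Either[Int, Document] = {
  val client = HttpClient.newHttpClient()
  val request = HttpRequest.newBuilder(URI.create(url)).GET().build()
  val response = client.send(request, HttpResponse.BodyHandlers.ofString())
  parseIfOk(response.statusCode(), response.body())
}
```

With this split the status code is an ordinary value rather than an exception, so redirects, 4xx and 5xx responses can each be handled as the caller sees fit.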
Thanks, that'll help. I was using scalaj-http when making more complex requests (as you've suggested), but now I'll be able to use plain scala-scraper more. Thanks for the help!
Feel free to close the issue if you want to.
For me, changing the user agent to "Mozilla/5.0" alone fixed the issue.
Document doc = Jsoup.connect(url)
    .userAgent("Mozilla/5.0")
    .timeout(30000)
    .get();