MazamaScience/MazamaCoreUtils

html_getTables() function

Similar to the html_getLinks() function, we should have an html_getTables() function with the following signature:

html_getTables <- function(
  url = NULL
)

Code in MazamaSpatialUtils::convertWikipediaTimezoneTable.R has a good example:

  url <- "http://en.wikipedia.org/wiki/List_of_tz_database_time_zones"

  ...

  # Get the raw html from the url
  wikiDoc <- xml2::read_html(url)

  # Get a list of tables in the document
  tables <- rvest::html_nodes(wikiDoc, "table")

It may in fact be that easy! (Or perhaps not.)

This function should return a list of tables found on the web page.
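
As a starting point, here is a minimal sketch of what html_getTables() might look like. It builds on the two lines quoted above and adds one step: rvest::html_nodes() returns html nodes rather than data frames, so the sketch converts them with rvest::html_table(). Returning data frames, and the NULL check on url, are assumptions on my part and not settled requirements.

```r
# Sketch only: the argument check and the data frame return type are
# assumptions, not decided behavior.
html_getTables <- function(
  url = NULL
) {

  if ( is.null(url) )
    stop("Parameter 'url' must not be NULL.")

  # Get the raw html from the url
  doc <- xml2::read_html(url)

  # Get the <table> nodes in the document
  tableNodes <- rvest::html_nodes(doc, "table")

  # Convert the nodeset into a list with one data frame per table
  tables <- rvest::html_table(tableNodes)

  return(tables)

}
```

With this in place, html_getTables(url) on the Wikipedia page above would be expected to return one list element per table on the page. Tables with merged cells or nested tables may still need extra handling.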

You should also create a wrapper function for getting a particular table. This will live in the same source file and share a documentation page:

html_getTable <- function(
  url = NULL,
  index = 1
)

The documentation should have some examples that exercise these functions.
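
To make the shared documentation page concrete, here is a rough sketch using roxygen2, where @rdname puts html_getTable() on the same page as html_getTables(). The html_getTables() body is a compact stand-in, and the parameter descriptions, the index checks, the error messages and the use of \dontrun{} around the network examples are all assumptions.

```r
#' Find all tables in an html page
#'
#' @param url URL or file path of an html page.
#' @param index Index of the table to return.
#' @return \code{html_getTables()} returns a list of data frames.
#' \code{html_getTable()} returns a single data frame.
#' @examples
#' \dontrun{
#' url <- "http://en.wikipedia.org/wiki/List_of_tz_database_time_zones"
#' tables <- html_getTables(url)
#' length(tables)
#' firstTable <- html_getTable(url, index = 1)
#' head(firstTable)
#' }
#' @export
html_getTables <- function(
  url = NULL
) {
  # Stand-in body (assumption): read the page, convert every <table> node
  doc <- xml2::read_html(url)
  return(rvest::html_table(rvest::html_nodes(doc, "table")))
}

#' @rdname html_getTables
#' @export
html_getTable <- function(
  url = NULL,
  index = 1
) {

  tables <- html_getTables(url)

  if ( length(tables) == 0 )
    stop("No tables found.")

  # Hypothetical validation: index must be a single number within range
  if ( !is.numeric(index) || length(index) != 1 ||
       index < 1 || index > length(tables) )
    stop(sprintf("Parameter 'index' must be between 1 and %d.", length(tables)))

  return(tables[[index]])

}
```

Keeping the wrapper this thin means any later fix to table parsing only has to happen in html_getTables().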