ycphs/openxlsx

Saving HTML tables (rvest) as Excel files

Mkranj opened this issue · 0 comments

Mkranj commented

I'm downloading a certain HTML table using the rvest package. Currently, I'm transforming it to a regular dataframe and then saving it as .xlsx. However, the table in question has a lot of merged cells. When transforming to a dataframe, all the spaces a merged cell occupies get filled with its text, leading to many duplicates.
Is there a way to save an HTML table directly as an Excel file? Since both Excel and openxlsx support merged cells, this would produce output that is true to the original. I believe this would be a very useful feature :)
From what I've tried, the table returned by rvest is an xml_node object.
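
To make the request concrete, here is a rough sketch of what such a conversion might look like when done by hand. It assumes rvest 1.0.0 or later (for `html_elements()` and `html_text2()`), a table with no nested tables, and ordinary `rowspan`/`colspan` attributes. `html_table_layout()` and `write_html_table()` are hypothetical helper names, not functions in rvest or openxlsx; only the calls into those two packages are real.

```r
library(rvest)     # read_html(), html_element(s)(), html_attr(), html_text2()
library(openxlsx)  # createWorkbook(), writeData(), mergeCells(), saveWorkbook()

# Hypothetical helper: work out where each <th>/<td> lands on the sheet grid,
# honouring rowspan and colspan. Returns one row per HTML cell.
html_table_layout <- function(table_node) {
  rows <- html_elements(table_node, "tr")
  occupied <- character()  # "row,col" keys of grid positions already taken
  out <- list()
  for (r in seq_along(rows)) {
    cells <- html_elements(rows[[r]], "th, td")
    col <- 1L
    for (i in seq_along(cells)) {
      cell <- cells[[i]]
      # Skip positions covered by a rowspan coming from an earlier row.
      while (paste(r, col, sep = ",") %in% occupied) col <- col + 1L
      rs <- as.integer(html_attr(cell, "rowspan", default = "1"))
      cs <- as.integer(html_attr(cell, "colspan", default = "1"))
      if (is.na(rs) || rs < 1L) rs <- 1L
      if (is.na(cs) || cs < 1L) cs <- 1L
      span <- expand.grid(row = r:(r + rs - 1L), col = col:(col + cs - 1L))
      occupied <- c(occupied, paste(span$row, span$col, sep = ","))
      out[[length(out) + 1L]] <- data.frame(
        text = html_text2(cell), row = r, col = col,
        rowspan = rs, colspan = cs, stringsAsFactors = FALSE
      )
      col <- col + cs
    }
  }
  do.call(rbind, out)
}

# Hypothetical helper: write each cell once at its top-left position and
# merge the range it spans, so no text is duplicated.
write_html_table <- function(wb, sheet, table_node) {
  layout <- html_table_layout(table_node)
  if (is.null(layout)) return(invisible(NULL))
  for (k in seq_len(nrow(layout))) {
    cell <- layout[k, ]
    writeData(wb, sheet, cell$text, startCol = cell$col, startRow = cell$row,
              colNames = FALSE)
    if (cell$rowspan > 1L || cell$colspan > 1L) {
      mergeCells(wb, sheet,
                 cols = cell$col:(cell$col + cell$colspan - 1L),
                 rows = cell$row:(cell$row + cell$rowspan - 1L))
    }
  }
  invisible(layout)
}

# Made-up example table, used only for illustration.
html <- '<table>
  <tr><th colspan="2">Group</th><th rowspan="2">Total</th></tr>
  <tr><td>a</td><td>b</td></tr>
  <tr><td>1</td><td>2</td><td>3</td></tr>
</table>'
table_node <- html_element(read_html(html), "table")

wb <- createWorkbook()
addWorksheet(wb, "table")
layout <- write_html_table(wb, "table", table_node)
saveWorkbook(wb, tempfile(fileext = ".xlsx"), overwrite = TRUE)
```

Writing cell by cell is slow for large tables and every value ends up as text, so this is only meant to illustrate the behaviour I have in mind, not to suggest how it should be implemented in the package.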

Thanks for the great work!