antiword
Extract Text from Microsoft Word Documents
Wraps the AntiWord utility to extract text from Microsoft Word documents.
The utility only supports the old doc format, not the new XML-based docx format.
Installation
devtools::install_github("ropensci/antiword")
Hello World
The package has only a single function, antiword(). It takes either a local
file path or a URL to a Word document:
library(antiword)
text <- antiword("https://jeroen.github.io/files/UDHR-english.doc")
cat(text)
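Because only the old doc format is supported, it can help to check the file extension before calling antiword(). The sketch below is one possible way to do that. The helper name read_doc_text() is hypothetical and not part of the package, and the check assumes a single path or URL that ends in the file extension (a URL with a query string would be rejected).

```r
# Hypothetical helper (not part of the antiword package): stop early when the
# input is clearly not a legacy .doc file, otherwise pass it on to antiword().
read_doc_text <- function(path) {
  # tools::file_ext() returns the extension without the leading dot, e.g. "doc"
  ext <- tolower(tools::file_ext(path))
  if (ext != "doc") {
    stop("antiword only supports the old .doc format, got: '", ext, "'")
  }
  antiword::antiword(path)
}
```

With this helper, read_doc_text("report.docx") stops with an error before antiword() is called, while a path or URL ending in .doc is passed through unchanged.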