A set of examples showing how to use the Readr API to connect to Readr Cloud, along with scripts to download and process allenai's datasets.
The following has been tested on Mac OS X with Scala 2.10.4, sbt 0.13, and Spark 1.0.1.
You must have Apache Spark installed if you would like to process and push new datasets to Readr. Fetch Spark at https://spark.apache.org/downloads.html.
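A minimal setup sketch follows; the exact archive name and the use of SPARK_HOME are assumptions, so point the processing scripts at your Spark directory however they actually expect:

```
# Unpack a prebuilt Spark 1.0.1 package and record its location.
# SPARK_HOME is an assumed convention, not necessarily what the scripts read.
tar xzf spark-1.0.1-bin-hadoop2.tgz
export SPARK_HOME="$PWD/spark-1.0.1-bin-hadoop2"
```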
If you would like to run Kevin's preprocessing scripts for the Wikipedia corpora, you must also install xml and gsed.
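On Mac OS X these are easiest to get via Homebrew; the package names below are assumptions about which packages provide the gsed and xml binaries:

```
brew install gnu-sed     # installs GNU sed as gsed
brew install xmlstarlet  # provides a command-line XML toolkit
```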
In conf/application.conf, set the user and password fields.
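For example (a sketch only; the exact nesting of these fields in this project's application.conf may differ):

```
# conf/application.conf (HOCON). Field names follow the instructions above;
# any additional enclosing block in the real file is an assumption.
user = "your-readr-username"
password = "your-readr-password"
```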
The examples in src/main/scala/allenai/example show how to push patterns, fetch results, and so on. You can run them as follows:
sbt "runMain allenai.example.Example3CreateFrameWithPattern"
sbt "runMain allenai.example.Example4FetchPatternMatches"
sbt "runMain allenai.example.Example5FetchPatternAnnotations"
- Run `sbt compile` to see if you can fetch all dependencies. (Note: you must have your Allenai Nexus credentials set up for this to work; see the sketch after this list.)
- Run `./run.sh` to download and process the Barron's corpus and upload the resulting indices to Readr Cloud.
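sbt reads repository credentials through its standard Credentials mechanism; a common setup looks like the following, where the realm and host are placeholders rather than the actual Allenai Nexus values:

```scala
// In build.sbt (or a global ~/.sbt/0.13/credentials.sbt):
credentials += Credentials(Path.userHome / ".sbt" / "credentials")
```

with ~/.sbt/credentials containing:

```
realm=Sonatype Nexus Repository Manager
host=nexus.allenai.example
user=your-nexus-username
password=your-nexus-password
```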
Readr Cloud makes it easy to create, manipulate, and test extraction rules. When you are done, you can fetch the rules and annotations you have created and store them locally, and you can write them back to Readr Cloud (to the same or a different Readr project). The following scripts and example programs cover these options:
./fetch_frames.sh
./push_frames.sh
sbt "runMain allenai.example.Example6FetchPatternAnnotations"
sbt "runMain allenai.example.Example7PutPatternAnnotations"
sbt "runMain allenai.example.Example8FetchAllMeaning"
sbt "runMain allenai.example.Example9PutAllMeaning"