tribbloid/spookystuff
Scalable query engine for web scraping/data mashup/acceptance QA, powered by Apache Spark
Scala, Apache-2.0
Issues
Sample example which does not work
#53 opened by fabiofumarola - 0
Which one is the stable branch?
#61 opened by fahadsiddiqui - 5
spookystuff in lambda context
#58 opened by webroboteu - 5
scalor plugin
#57 opened by Andrei-Pozolotin - 3
NullPointerException generated
#55 opened by fahadsiddiqui - 1
Create a gitter channel
#52 opened by aparo - 1
xpath selector in page parsing and extraction
#15 opened by tribbloid - 2
Link Extractors
#47 opened by austinprete - 0
Downloads on github.io page don't work and SBT can't find SpookyStuff in Maven
#44 opened by austinprete - 0
css selector that contains double quotes cannot be resolved by jsoup properly in Zeppelin (and maybe other interpreters)
#43 opened by tribbloid - 4
kernel_spec deprecated in IPython 3.0.0
#39 opened by titipata - 1
assertion of 2 accumulators in Integration test throws exception sporadically
#35 opened by tribbloid - 0
Integration test - wget html, image & pdf file
#19 opened by tribbloid - 0
Integration test - wget - loop, paginate, loadmore
#21 opened by tribbloid - 0
Integration test - explore, paginate by explore
#23 opened by tribbloid - 4
Build failed with MaxPermSize=512MB
#8 opened by cartershanklin - 0
API freeze?
#30 opened by tribbloid - 1
Integration test - join, leftjoin
#22 opened by tribbloid - 1
Integration test - maven plugin & configuration
#18 opened by tribbloid - 2
Change SpookyContext root from s3 to user local
#26 opened by titipata - 10
Integration test - test website
#17 opened by tribbloid - 0
Spark Certificate: Fill out a general questionnaire about the product (e.g., integration dependencies, product information, versions of Spark supported)
#16 opened by tribbloid - 2
Cannot start REPL locally
#11 opened by feribg - 2
test trello connector
#12 opened by l2yao - 0
Poll: java pattern or scala pattern?
#5 opened by tribbloid - 0
Documentation, documentation, documentation
#7 opened by tribbloid - 1
PhantomJS binary won't quit if application is terminated when invoking it WebDriver
#4 opened by tribbloid - 0
ghostdriver.get followed by immediate .content will save unloaded/partially-loaded page
#2 opened by tribbloid