rsdoiel/extractor-js
Since I originally wrote this, a module called request has come on the scene. You might want to try that before mucking about with extractor-js. A small NodeJS package using jsDom to facilitate screen scraping and spidering. It scrapes single and multiple elements and includes support for many tag attributes.
HTML
BSD-2-Clause
Issues
- 1
  User agent request headers?
  #24 opened by danscan
- 1
  timer isn't released with making an http request.
  #22 opened by rsdoiel
- 1
  Spider should include basic beta data
  #21 opened by rsdoiel
- 1
  Turn clustered spider into an exported function.
  #17 opened by rsdoiel
- 1
  Missing Robots awareness
  #18 opened by rsdoiel
- 1
  Why not extractor on Mikael's request module?
  #20 opened by rsdoiel
- 0
  Add a timeout setting for fetchPage
  #9 opened by thom4parisot
- 0
  prep-list for npm update on next version
  #12 opened by rsdoiel
- 1
  Not passing automated tests
  #13 opened by rsdoiel
- 3
  selectors syntax is too narrow, needs to work like a real querySelector() or querySelectorAll()
  #6 opened by rsdoiel
- 0
  Spider function missing
  #4 opened by rsdoiel
- 1
  Documentation and Examples
  #3 opened by rsdoiel
- 2
  Removed dependency on jQuery
  #5 opened by rsdoiel
- 2
  Needs an npm manifest
  #1 opened by rsdoiel
- 1
  Test coverage
  #2 opened by rsdoiel