pythonhacker/harvestman-crawler
Automatically exported from code.google.com/p/harvestman-crawler
Python
Issues
import errors for module hashlib
#33 opened by GoogleCodeExporter - 0
Depth of any url
#34 opened by GoogleCodeExporter - 1
No _logger attribute
#31 opened by GoogleCodeExporter - 0
Implement -nd option of wget
#32 opened by GoogleCodeExporter - 1
ImportError: No module named _bsddb
#30 opened by GoogleCodeExporter - 2
Permaloop?
#25 opened by GoogleCodeExporter - 0
Use all functionality of setup.py
#26 opened by GoogleCodeExporter - 8
Install error in x86_64
#22 opened by GoogleCodeExporter - 4
Error crawling URLs containing non-Latin-1 characters: reported containing fatal errors
#21 opened by GoogleCodeExporter - 19
Error crawling sites containing characters in encodings other than Latin-1
#20 opened by GoogleCodeExporter - 7
Scale crawler to a client/server design aiming for full distributed system support
#18 opened by GoogleCodeExporter - 1
Error: "I/O operation on closed file" when running the crawler on the same site twice.
#19 opened by GoogleCodeExporter - 1
Design the crawler to run non-stop
#16 opened by GoogleCodeExporter - 2
Design the crawler to run non-stop
#17 opened by GoogleCodeExporter - 5
HTML code reconstruction library to be added optionally - BeautifulSoup, for example
#15 opened by GoogleCodeExporter - 12
Memory consumption optimization
#13 opened by GoogleCodeExporter - 1
Parallel crawl of projects
#14 opened by GoogleCodeExporter - 4
Modify logging to conform to standards
#12 opened by GoogleCodeExporter - 4
RSS Integration
#10 opened by GoogleCodeExporter - 2
Killing harvestman with "ctrl+C"
#11 opened by GoogleCodeExporter - 5
Scheduling options in command-line
#9 opened by GoogleCodeExporter - 11
Crawler strategy classes
#4 opened by GoogleCodeExporter - 1
virtualenv setup error
#2 opened by GoogleCodeExporter - 5