Issues
decrease log level in fail method of LPASpout
#86 opened by wowasa - 2
Number of unchecked links remains unchanged
#87 opened by wowasa - 1
ack and fail methods of LPASpout never called
#82 opened by wowasa - 2
Frequently log the number of unchecked links
#84 opened by wowasa - 2
Spout logs a far too high number of processed links
#81 opened by wowasa - 1
Maven test execution doesn't work
#80 opened by wowasa - 1
After storm crawler dependency upgrade the Clarin-specific http.user.agent is not accepted anymore
#78 opened by wowasa - 2
Upgrading apache storm and crawler dependency
#77 opened by wowasa - 1
FetcherThread never shutting down
#76 opened by wowasa - 3
Close Stream in LPASpout
#74 opened by wowasa - 2
Adding individual crawl delay
#69 opened by wowasa - 3
Save latest checking results in file
#66 opened by wowasa - 1
Purging configuration file from redundant settings
#67 opened by wowasa - 2
Many Links with message 'Crawl delay too long.'
#72 opened by wowasa - 1
make property http.robots.agents configurable
#46 opened by wowasa - 1
log "outsized" crawl delays
#49 opened by wowasa - 2
Tuples never acknowledged
#71 opened by wowasa - 1
Investigation on failures
#65 opened by wowasa - 1
Improving procedure on failure
#55 opened by wowasa - 1
Reviewing StatusUpdaterBolt
#68 opened by wowasa - 12
Investigation on SSLHandshakeException
#52 opened by wowasa - 1
Review processing chain for redundant steps
#60 opened by wowasa - 1
Some URLs are checked too often
#61 opened by wowasa - 1
Upgrade storm-crawler-core dependency
#53 opened by wowasa - 2
LPASpout is losing Url instances
#58 opened by wowasa - 2
Adding flag/unflag operation for URLs in process
#56 opened by wowasa - 5
SQLException in persistence of linkchecker status
#54 opened by wowasa - 1
Missing category and message in case of malformed URL
#50 opened by wowasa - 1
GET request on large downloads fails
#42 opened by wowasa - 1
Setting property HTTP_AGENT_VERSION automatically
#45 opened by wowasa - 1
Add unit tests to the project
#31 opened by wowasa - 1
Review MetricsFetcherBolt
#37 opened by wowasa - 1
Upgrade storm and storm crawler
#38 opened by wowasa - 1
content-type not set in database
#36 opened by wowasa - 1
NumberFormatException in StatusUpdaterBolt
#29 opened by wowasa - 4
Robots.txt not respected (for archive.mpi.nl)
#25 opened by twagoo - 1
HEAD/GET switch doesn't respect crawl delays
#26 opened by wowasa - 0
Use the checker for VCR
#24 opened by dietervu