Issues
- 3
Publish spider CLI binaries
#217 opened by alexkreidler - 1
Panic with non ASCII string
#216 opened by ronanM - 5
Help wanted: Reduce memory footprint
#204 opened by Falumpaset - 2
It's colly not crolly
#214 opened by melroy89 - 0
Retrieve crawled markdown via API
#211 opened by culda - 2
Broadcast never ends when scraping with limit
#210 opened by DimitriTimoz - 8
Memory leak caused by hashbrown
#207 opened by DimitriTimoz - 4
support file:// urls
#197 opened by jmikedupont2 - 0
Memory leak
#208 opened by DimitriTimoz - 1
Scrape with smart mode
#206 opened by DimitriTimoz - 1
Retrieve response cookies
#202 opened by viktorholk - 2
- 1
Store referring links
#199 opened by LeoDog896 - 1
Running the example code results in an error
#198 opened by haijd - 0
Command spider_cli: Short option names must be unique for each argument, but '-u' is in use by both 'url' and 'user_agent'
#195 opened by jmikedupont2 - 4
CLI: download files as they arrive?
#192 opened by gjtorikian - 1
- 6
robots.txt files are not being respected correctly
#184 opened by div72 - 1
Can transform work properly?
#190 opened by ybsun0215 - 1
Budget not respected
#187 opened by CrazyDubya - 3
Add DEPTH level next to each debug line [ENHANCEMENT]
#185 opened by Zabrane - 4
Support COOKIE during the crawl [ENHANCEMENT]
#186 opened by Zabrane - 11
Prebuilt binaries for Linux, macOS
#183 opened by Zabrane - 20
- 6
Is it possible to extract broken links from the crawl?
#175 opened by metsis - 3
Already crawled URL attempted as % encoded
#172 opened by apsaltis - 1
Running with decentralized feature
#171 opened by zmedelis - 7
Is it possible to dynamically add links to crawl?
#170 opened by oiwn - 1
Chrome flag chrome_intercept causes page hang.
#168 opened by j-mendez - 17
- 11
Some pages have 0 bytes from scraped page. After rerunning, different pages have 0 bytes
#165 opened by esemeniuc - 4
Support ignoring SSL errors
#162 opened by superkelvint - 8
Extracting all urls on a page
#160 opened by apsaltis - 2
Scraping timeout Issue
#158 opened by virajk31 - 4
- 3
- 2
`with_on_link_find_callback` doesn't exist
#145 opened by SamuelMarks - 1
Extract text from Html
#141 opened by MihirModi1421 - 4
only let me spider one url
#138 opened by sebs - 1
cli parameters
#139 opened by sebs - 2
cli tutorial store crawls result as json
#134 opened by sebs - 4
Getting URL after redirect
#127 opened by joksas - 1
error[E0061]: this function takes 2 arguments but 1 argument was supplied
#136 opened by roniemartinez - 1
Add the ability to download not only html, but also all site assets: css, js, imgs, etc
#132 opened by namen3645 - 6
full-resource feature seems to be missing Javascript
#130 opened by Byter09 - 2
Blacklist regex for CLI does not seem to work
#129 opened by Byter09 - 6
Change API to builder pattern
#115 opened by roniemartinez - 5
- 3
- 11