vezaynk/Sitemap-Generator-Crawler
PHP script to recursively crawl websites and generate a sitemap. Zero dependencies.
PHP, MIT
Issues
How to setup page I need ?
#101 opened by nethiker76 - 1
Add a log with links, contains url returns 404
#99 opened by oim37 - 8
[!] Reformatted site from https://example.com/folder/#with_index.php# to https://example.com/
#96 opened by martianmaikel - 6
Sitemap limits checking is missing
#43 opened - 3
Fails for Maximum allowed length
#91 opened by canuck-sailor - 2
"mmap failed... cannot allocate memory"
#88 opened by tomjennings - 1
What if my website has more then 50,000 pages
#95 opened by Jaydeep-01 - 2
Could not find files for the given pattern(s)
#93 opened by Jaydeep-01 - 1
Reports empty error after finishing
#89 opened by mylselgan - 4
Specify for IP local sitemap generation
#94 opened by vezaynk - 3
Option to not crawl Frames
#79 opened by kewh - 1
Pls help me
#92 opened by sirstevemedia - 1
[Feature Request] List all kinds of files & add an option to not try to curl some extensions
#87 opened by jean-christophe-manciot - 12
"noindex" URL are listed in sitemap
#82 opened by stephanros - 2
only creating one URL where I have 700+
#85 opened by tejalbaria5 - 4
sitemap.xml access denied can't open
#81 opened by Ganofins - 0
Empty sitemap close
#80 opened by MattMski - 0
Last modified date always set to current date/time
#76 opened by kewh - 1
<iframe> tags ignored
#78 opened by kewh - 2
Some things added
#73 opened by brunoabcn - 1
Little problems with &
#75 opened by webapteka - 1
Please move the project away from GitHub
#74 opened - 4
Crawling the root website instead of sub-root
#72 opened by yashitgarg - 1
Adds only the main page :(
#71 opened by webapteka - 4
Why not a plugin?
#50 opened by rsm23 - 9
Switch from arrays to hashtables
#68 opened by vezaynk - 1
Tracking deferred scans
#67 opened by vezaynk - 1
banned pagination url
#64 opened by jazuly1 - 9
Blacklist not working
#66 opened by Kristiansky - 2
Output file permissions
#60 opened by vezaynk - 18
Non-ascii urls fail to validate
#57 opened by mazux - 0
Whitelist GET arguments
#31 opened by vezaynk - 1
HTML base element
#46 opened by Thyra - 9
Dropping pound
#48 opened - 0
Index PDFs
#40 opened by vezaynk - 0
Canonical URL
#47 opened - 5
Entity escaping is missing
#42 opened - 9
Standard output
#36 opened - 15
Indefinite loop
#34 opened - 2
Double entry
#35 opened - 2
Follow frames
#39 opened by Thyra - 0
Domain root links append a needless /
#33 opened by vezaynk