spatie/robots-txt

Determine if a page may be crawled from robots.txt, robots meta tags and robot headers

PHP · MIT

Issues

Discussion: Would it make sense to replace file_get_contents with guzzle
#40 opened a year ago by ivangrozni
3
file_get_contents($source) throws an InvalidArgumentException on Websites with expired Certificates
#34 opened 2 years ago by osthafen
1
Not working properly with: x-robots-tag: none
#30 opened 3 years ago by nnerijuss
3
Several user-agent for one Disallow directive does not work
#28 opened 4 years ago by Kixell-NicolasJardillier
5
Can't create `Robots` from string, must use url or file
#24 opened 4 years ago by rezen
1
Custom UserAgent mismatches due to parseUserAgent()
#25 opened 4 years ago by muhci
4
Out of Memory exception on large pieces of HTML
#19 opened 5 years ago by mattiasgeniar
8
Implement Allow directive
#18 opened 5 years ago by BenMorel
3
False positives on bare domains with no trailing slash
#15 opened 5 years ago by mikemike
1
Fixes "case-insensitive"-Rule in X-robots-tag
#10 opened 5 years ago by RobinDev
1
Fix nofollow or noindex check for Headers
#11 opened 5 years ago by RobinDev
1
Fix Wildcard check in Robots.txt and Headers
#12 opened 5 years ago by RobinDev
1
Robots.txt fields should be case-insensitive
#8 opened 6 years ago by Redominus
2
Support robot headers
#1 opened 7 years ago by brendt
2