[Request] Add robots.txt parsing
joshua-bn opened this issue · 3 comments
joshua-bn commented
It would be nice to have the ability to parse robots.txt, similar to RSS feeds, e.g. `$web->robots`.
https://github.com/bopoda/robots-txt-parser is one such library. Not sure if it's the one to use here, but it seems to do the job.
spekulatius commented
Yeah, that's something to consider. I would opt for https://github.com/spatie/robots-txt instead as it's better maintained. What exactly do you want to achieve with the information?
joshua-bn commented
Personally, I am looking for sitemaps declared in robots.txt, but I think there's also value in checking the crawl rules.
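For illustration only, both use-cases (extracting declared sitemaps and checking crawl rules) can be sketched with Python's standard-library `urllib.robotparser` — this is not the library's own API, and the sample robots.txt content and URLs below are made up:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as fetched from a site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Use-case 1: sitemaps declared in robots.txt (None if absent).
print(parser.site_maps())  # ['https://example.com/sitemap.xml']

# Use-case 2: crawl rules for a given user agent.
print(parser.can_fetch("*", "https://example.com/admin/secret"))  # False
print(parser.can_fetch("*", "https://example.com/blog/post"))     # True
print(parser.crawl_delay("*"))  # 10
```

A PHP implementation via spatie/robots-txt would expose the same information, just through that package's own API.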
spekulatius commented
Fair enough, that's definitely another use-case. I'll see how we can get both working.