Robots.txt parser?
asg017 opened this issue · 0 comments
asg017 commented
Not sure if it makes sense in an HTTP library, maybe a separate extension/project...
https://en.wikipedia.org/wiki/Robots_exclusion_standard
https://github.com/google/robotstxt
https://pkg.go.dev/github.com/jimsmart/grobotstxt
```sql
select i, line, column, value from robotstxt_comments(readfile('robots.txt'));
select i, line, name from robotstxt_sitemaps(readfile('robots.txt'));
```
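To make the proposed row shapes concrete, here is a minimal Python sketch of what `robotstxt_comments` and `robotstxt_sitemaps` could return. The function names and column names mirror the SQL above; the exact semantics (1-based columns, what counts as a comment) are assumptions, not settled API.

```python
def robotstxt_comments(text):
    """One row per '#' comment: rowid, line number, 1-based column, text."""
    rows, i = [], 0
    for lineno, line in enumerate(text.splitlines(), start=1):
        col = line.find("#")
        if col != -1:
            rows.append({"i": i, "line": lineno, "column": col + 1,
                         "value": line[col + 1:].strip()})
            i += 1
    return rows

def robotstxt_sitemaps(text):
    """One row per 'Sitemap:' directive, matched case-insensitively."""
    rows, i = [], 0
    for lineno, line in enumerate(text.splitlines(), start=1):
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            rows.append({"i": i, "line": lineno, "name": value.strip()})
            i += 1
    return rows
```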
```sql
select
  user_agent, -- user-agent group the rule belongs to (naming TBD)
  type,       -- normalized directive: lowercased, common typos fixed
  key,        -- raw directive name: Allow, Disallow, etc.
  value
from robotstxt_each(readfile('robots.txt'));
```
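A rough sketch of the `robotstxt_each` iteration in Python: one row per directive, tracking the current user-agent group, keeping the raw key, and emitting a normalized `type`. The `TYPO_FIXES` entries are illustrative assumptions (google/robotstxt does similar typo tolerance, e.g. for misspelled `disallow`).

```python
# Assumed examples of typo normalization; the real list would be larger.
TYPO_FIXES = {"dissallow": "disallow", "useragent": "user-agent"}

def robotstxt_each(text):
    """Yield one row per directive, with the enclosing user-agent group."""
    agent = None
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        typ = TYPO_FIXES.get(key.lower(), key.lower())
        if typ == "user-agent":
            agent = value  # start (or extend) a user-agent group
        yield {"user_agent": agent, "type": typ, "key": key, "value": value}
```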
```sql
select robotstxt_agent_allowed(
  readfile('robots.txt'),
  'MyRobot',
  'http://example.net/members/index.html'
);
```
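The intended semantics of `robotstxt_agent_allowed` already exist in Python's stdlib `urllib.robotparser`, which can serve as a reference for the expected results (note it implements the original draft rules, not google/robotstxt's extensions):

```python
from urllib.robotparser import RobotFileParser

robots = """\
User-agent: MyRobot
Disallow: /members/
"""

rp = RobotFileParser()
rp.parse(robots.splitlines())  # parse() takes an iterable of lines

print(rp.can_fetch("MyRobot", "http://example.net/members/index.html"))  # False
print(rp.can_fetch("MyRobot", "http://example.net/public/"))             # True
```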