BuilderIO/gpt-crawler

Crawl a site to generate knowledge files to create your own custom GPT from a URL

TypeScript · ISC

Issues

Got
#391 opened 17 days ago by Bezo81
0
error TS2322
#151 opened a year ago by sc0h0
6
cookie example
#158 opened a year ago by cmbcbe
1
How do I keep links to pictures and videos in my web pages?
#156 opened a year ago by he1771129657
0
Scrape HTML tag
#183 opened 3 months ago by WebdevWebi
0
Cookies not accepted
#145 opened a year ago by stevenbaert
2
Hi
#181 opened 4 months ago by chihebhichri67
0
`exclude` Pattern in `config.ts` Not Working as Expected
#179 opened 5 months ago by ChronicleCoder
0
Proxy Bug https://github.com/BuilderIO/gpt-crawler/pull/170
#174 opened 7 months ago by frei-x
2
match mid part of path
#173 opened 7 months ago by SmallDryad
0
Multiple match patterns
#172 opened 7 months ago by qqaatw
1
Crawl a Github repo
#171 opened 8 months ago by mena234
1
GPT Crawler cli Drop-in config
#168 opened 9 months ago by zerofill
2
'Zod' package not found |
#123 opened a year ago by Daethyra
3
Does gpt-crawler server always return same site?
#147 opened a year ago by kaibadash
1
Output.json file not created
#152 opened a year ago by bourgeda
1
Script Not Crawling Subdirectories During Website Scraping
#166 opened 9 months ago by AhsanAk
0
how to crawler this site?match not work
#159 opened a year ago by tom6q6
1
Type Error
#165 opened 10 months ago by JoelWekesa
0
Doesnt Extract Texts present in dropboxes
#164 opened a year ago by KumarSampurn
0
memory usage need to be optimized
#163 opened a year ago by banditsmile
0
PlaywrightCrawler memory problem and errors
#162 opened a year ago by lipstk
0
sh: cross-env: command not found
#161 opened a year ago by Vickie-Liu
0
add a username and password？
#160 opened a year ago by GentleLemon
0
extracting text in hidden div blocks
#157 opened a year ago by udgithub
0
How to supply read-able code to the GPT?
#155 opened a year ago by lucastobrazil
0
Can i use Gemini model by google?
#150 opened a year ago by amrpyt
0
How to crawl Single Page Application(SPA)
#149 opened a year ago by ouyh1111
0
Multiple websites at once?
#135 opened a year ago by heyfletch
11
Multiple Selectors not Reflected in Output
#146 opened a year ago by mahdii0908
0
Crawl websites protected by username and password?
#136 opened a year ago by alzh666
1
How to crawl https://zod.dev/ ?
#124 opened a year ago by MontiL
2
WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 429 status code.
#137 opened a year ago by Voyager3D
2
Json too large for GPT
#113 opened a year ago by tristan-mcinnis
7
Trying to Crawl site nothing working
#139 opened a year ago by upup666
1
add a method to evaluate the quality of the retrieved context
#143 opened a year ago by shrjain1312
0
Only one tag html for all the page
#128 opened a year ago by Th3Heavy
2
Add for a selector to exclude elements in the site
#134 opened a year ago by razaanstha
1
npm start issue
#127 opened a year ago by diandian0420
3
Add support for concurrent invocations to crawl
#120 opened a year ago by adityak74
3
how to add userName & passwd to gpt-crawler
#141 opened a year ago by DorakuCN
0
ERROR PlaywrightCrawler: Request failed and reached maximum retries. Navigation timed out after 60 seconds
#140 opened a year ago by Mytraas
0
Disallowed Special Token
#130 opened a year ago by MrAshRhodes
2
Extracting Articles from Sequential Pages of a Website
#131 opened a year ago by nicofierrov
0
aisa gpt
#125 opened a year ago by AZURALIF06
0
How to paginate for large JSON files?
#119 opened a year ago by chnsh
0
Crawling more than max number of pages
#118 opened a year ago by dcgleason
0
Query on Configuring Multiple Web Pages with Unique config.ts Files
#117 opened a year ago by PipeDream941
0
some html like https://www.abc.com/file/abc.pdf couldn't be crawled
#115 opened a year ago by cuterv
0
How to limit the hierarchy of pages to be crawled?
#112 opened a year ago by wywywy1990
0