gosearcher is a small golang package which provides a number of functions that make it easy to scrape a number of popular search engines including Google, Yandex and soon Bing. The package supports the use proxies and multi-page scraping and allows users to scrape multiple pages of search results. The library relies on the popular Goquery package.
package main
import (
"fmt"
"github.com/EdmundMartin/gosearcher"
)
func main() {
res, err := gosearcher.GoogleScrape("Edmund Martin", "com", "en", nil, 1, 10, 10)
if err == nil {
for _, res := range res {
fmt.Println(res)
}
}
}
- searchTerm - string
- countryCode - string - Will return an error if country is not supported by Google. "com" - will use Google.com
- languageCode - string - The language used to search - in the format of an ISO 639-1 Code
- proxyString - empty interface - The proxy (string format) you wish to use for the particular scrape, or nil to scrape without a proxy
- pages - int - The number of pages you wish to scrape.
- count - int - The number of results per page - multiples of 10 up to 100.
- backoff - int - The time to wait in between scraping pages, if more than one page of results is being scraped.
package main
import (
"fmt"
"github.com/EdmundMartin/gosearcher"
)
func main() {
res, err := gosearcher.YandexScrape("Привет меня зовут", "com", "10393", nil, 1, 30, 20)
if err == nil {
for _, res := range res {
fmt.Println(res)
}
} else {
fmt.Println(err)
}
}
- searchTerm - string
- countryCode - string - Will return an error if country is not supported by Yandex. "com" - will use Yandex.com
- location - empty interface - Yandex's location code, can be a string or will use Moscow as base if nil is based. Full list can be found here.
- proxyString - empty interface - The proxy (string format) you wish to use for the particular scrape, or nil to scrape without a proxy
- pages - int - The number of pages you wish to scrape.
- count - int - The number of results per page - multiples of 10 up to 30.
- backoff - int - The time to wait in between scraping pages, if more than one page of results is being scraped.
package main
import (
"fmt"
"github.com/EdmundMartin/gosearcher"
)
func main() {
res, err := gosearcher.BingScrape("Edmund Martin", "com", nil, 2, 30, 30)
if err == nil {
for _, res := range res {
fmt.Println(res)
}
} else {
fmt.Println(err)
}
}
- searchTerm - string
- countryCode - string - Will return an error if country is not supported.
- proxyString - empty interface - The proxy (string format) you wish to use for the particular scrape, or nil to scrape without a proxy
- pages - int - The number of pages you wish to scrape.
- count - int - The number of results per page - multiples of 10 up to 50.
- backoff - int - The time to wait in between scraping pages, if more than one page of results is being scraped.
type SearchResult struct {
ResultRank int
ResultURL string
ResultTitle string
ResultDesc string
}
All supported search engines return a slice of SearchResult. This struct contains the rank, url, title and description of the particular result in question.