Primary language: Go. License: MIT.

go.dpb.io/sitemap

Stream decoders for sitemap.xml data and link feeds.

  • supports multiple file formats: traditional sitemap.xml files, syndication feeds (Atom, RSS), and plain text files
  • supports extensions: xhtml:link localizations and Google-specific extensions (image sitemaps and Google News sitemaps)
  • supports stream parsing, handing over each record as it is decoded instead of loading all records into memory
  • includes utilities for resolving relative URLs, ignoring invalid locations, and fetching directly from a server

Usage

Import the base packages...

import (
  "fmt"

  "go.dpb.io/sitemap"
  "go.dpb.io/sitemap/data"
)

Create an EntryCallback for processing each Entry as it is decoded...

callback := data.EntryCallbackFunc(func(entry data.Entry) error {
  // Only URL entries are printed; other entry types are skipped.
  if entryURL, ok := entry.(*data.URL); ok {
    fmt.Println(entryURL.Location)
  }

  return nil
})

Use the default decoder, which auto-detects the encoding and supports some common extensions...

err := sitemap.Decode(reader, callback)

Learn more from the examples directory, test files, and code documentation.

Alternatives

License

MIT License