This Node.js script fetches and parses sitemaps from a given URL. It supports nested sitemaps and outputs the sitemap hierarchy along with associated URLs. The result can be saved to a text file.
- axios makes the HTTP requests.
- cheerio parses and queries the HTML/XML documents.
- getSitemaps is a recursive function: it keeps following nested sitemaps until it reaches a sitemap that has none, then extracts the URLs listed in that sitemap.
- It checks if there are nested sitemaps by looking for elements.
- If there are no nested sitemaps, it extracts the URLs from the current sitemap using
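
The recursion described above can be sketched roughly as follows. This is a minimal illustration, not the code in sitemap.js. The source does not name the elements it looks for, so the sketch assumes the tag names from the sitemaps.org protocol (`<sitemap>` entries in an index, `<url>` entries in a leaf sitemap, each holding a `<loc>`). To stay dependency-free it swaps in a regular expression where the script uses cheerio, and it takes a hypothetical `fetchXml` callback where the script uses axios. The helper `extractLocs` and the returned object shape are also assumptions.

```javascript
// Hypothetical helper: return the <loc> text of every <parentTag> entry.
// Regex stand-in for cheerio; it only matches tags without attributes.
function extractLocs(xml, parentTag) {
  const blockRe = new RegExp(`<${parentTag}>([\\s\\S]*?)</${parentTag}>`, "g");
  const locs = [];
  for (const block of xml.matchAll(blockRe)) {
    const loc = /<loc>\s*([\s\S]*?)\s*<\/loc>/.exec(block[1]);
    if (loc) locs.push(loc[1]);
  }
  return locs;
}

// Recursively walk a sitemap. fetchXml(url) is an injected async function
// that resolves to the XML text (an axios call in a real script).
async function getSitemaps(url, fetchXml) {
  const xml = await fetchXml(url);
  const nested = extractLocs(xml, "sitemap");

  // Base case: no nested sitemaps, so collect the page URLs of this sitemap.
  if (nested.length === 0) {
    return { sitemap: url, urls: extractLocs(xml, "url"), children: [] };
  }

  // Recursive case: descend into each nested sitemap in order.
  const children = [];
  for (const childUrl of nested) {
    children.push(await getSitemaps(childUrl, fetchXml));
  }
  return { sitemap: url, urls: [], children };
}

// Usage with in-memory documents instead of real HTTP requests.
const fakePages = {
  "https://example.com/sitemap.xml":
    "<sitemapindex><sitemap><loc>https://example.com/posts.xml</loc></sitemap></sitemapindex>",
  "https://example.com/posts.xml":
    "<urlset><url><loc>https://example.com/a</loc></url><url><loc>https://example.com/b</loc></url></urlset>",
};
const fakeFetch = async (pageUrl) => fakePages[pageUrl];

getSitemaps("https://example.com/sitemap.xml", fakeFetch).then((tree) => {
  console.log(JSON.stringify(tree, null, 2));
});
```

Injecting the fetcher keeps the traversal logic testable offline. A real run would pass a function that performs the HTTP request and returns the response body.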
- Node.js installed on your machine
- cd sitemap
- npm install
- node sitemap.js
- sitemap_output.txt (sample output file) contains all the sitemaps and their associated URLs, grouped together.
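
One way to produce such a grouped file is sketched below. The actual layout of sitemap_output.txt is not shown here, so the indentation, the `Sitemap:` label, and the `{ sitemap, urls, children }` tree shape are all assumptions made for illustration, as are the function names `formatTree` and `saveTree`.

```javascript
const fs = require("fs");

// Hypothetical formatter: one line per sitemap, with its URLs indented
// beneath it and nested sitemaps indented one level deeper.
function formatTree(node, depth = 0) {
  const indent = "  ".repeat(depth);
  const lines = [`${indent}Sitemap: ${node.sitemap}`];
  for (const url of node.urls) {
    lines.push(`${indent}  - ${url}`);
  }
  for (const child of node.children) {
    lines.push(...formatTree(child, depth + 1));
  }
  return lines;
}

// Write the formatted hierarchy to a text file.
function saveTree(tree, filePath) {
  fs.writeFileSync(filePath, formatTree(tree).join("\n") + "\n", "utf8");
}
```

Calling `saveTree(tree, "sitemap_output.txt")` would then write the hierarchy to disk, keeping each sitemap's URLs directly under the sitemap they came from.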