This repository contains the code samples demonstrated at the Toronto Elastic Meetup on July 24, 2024. The primary focus is to showcase the Elasticsearch Go client by indexing and searching XKCD comics.
XKCD - a webcomic of romance, sarcasm, math, and language
In this repository, you will find two main code samples in the cmd directory:
- Indexer: A Go program that fetches XKCD comic metadata and indexes it into an Elasticsearch cluster.
- Searcher: A Go program that executes queries to retrieve stored XKCD comics from the Elasticsearch XKCD index.
These examples illustrate how to use the Elasticsearch Go client to efficiently index and search data.
To get started with the code samples, follow the instructions below.
If you have the time, I highly recommend watching The Go Language video that introduced Go to the world.
Each XKCD comic can be fetched by appending the comic number to the base URL:
https://xkcd.com/{comic_number}
To get a JSON description of that comic, append /info.0.json to the URL:
https://xkcd.com/{comic_number}/info.0.json
For example, this URL fetches the JSON for the 'Standards' comic:
https://xkcd.com/927/info.0.json
and returns the following:
{
  "month": "7",
  "num": 927,
  "link": "",
  "year": "2011",
  "news": "",
  "safe_title": "Standards",
  "transcript": "HOW STANDARDS PROLIFERATE\n(See: A\nC chargers, character encodings, instant messaging, etc.)\n\nSITUATION:\nThere are 14 competing standards.\n\nGeek: 14?! Ridiculous! We need to develop one universal standard that covers everyone's use cases.\nFellow Geek: Yeah!\n\nSoon:\nSITUATION:\nThere are 15 competing standards.\n\n{{Title text: Fortunately, the charging one has been solved now that we've all standardized on mini-USB. Or is it micro-USB? Shit.}}",
  "alt": "Fortunately, the charging one has been solved now that we've all standardized on mini-USB. Or is it micro-USB? Shit.",
  "img": "https://imgs.xkcd.com/comics/standards.png",
  "title": "Standards",
  "day": "20"
}
- Go (1.21+)
- An Elasticsearch instance (you can use the official Docker image or Elastic Cloud)
- An Elasticsearch index to store the XKCD comics, created with the following mapping:
{
  "mappings": {
    "properties": {
      "month": {
        "type": "keyword"
      },
      "num": {
        "type": "integer"
      },
      "link": {
        "type": "keyword"
      },
      "year": {
        "type": "keyword"
      },
      "news": {
        "type": "text"
      },
      "safe_title": {
        "type": "keyword"
      },
      "transcript": {
        "type": "text"
      },
      "alt": {
        "type": "text"
      },
      "img": {
        "type": "keyword"
      },
      "title": {
        "type": "text"
      },
      "day": {
        "type": "keyword"
      }
    }
  }
}
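One way to create the index is to send the mapping in a `PUT /<index>` request to the cluster. The sketch below only builds that request with the Go standard library, so it stays self-contained; the Go client offers an equivalent through its `Indices.Create` API. The index name `xkcd`, the address `http://localhost:9200`, the absence of authentication, and the helper name `newCreateIndexRequest` are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// mapping is an abbreviated copy of the mapping above;
// paste the full document when creating the real index.
const mapping = `{
  "mappings": {
    "properties": {
      "num":   {"type": "integer"},
      "title": {"type": "text"},
      "alt":   {"type": "text"}
    }
  }
}`

// newCreateIndexRequest builds the "PUT /<index>" request that creates
// an index with the given mapping body. (Hypothetical helper.)
func newCreateIndexRequest(baseURL, index, body string) (*http.Request, error) {
	url := strings.TrimRight(baseURL, "/") + "/" + index
	req, err := http.NewRequest(http.MethodPut, url, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Assumed values: a local, unsecured cluster and an index called "xkcd".
	req, err := newCreateIndexRequest("http://localhost:9200", "xkcd", mapping)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// To actually create the index, send the request:
	//   resp, err := http.DefaultClient.Do(req)
}
```

A secured cluster would additionally need credentials or an API key on the request, which this sketch leaves out.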
- Clone the repository:
git clone https://github.com/jeremyforan/bf-meetup-elastic-presentation-july24-2024.git
cd bf-meetup-elastic-presentation-july24-2024
- Install the required Go packages:
go mod tidy
The indexer fetches XKCD comics and indexes them into Elasticsearch. It configures an Elasticsearch client and uses goroutines to download comics asynchronously. The comics are stored in a thread-safe structure. After downloading, the comics are indexed into Elasticsearch using a bulk indexer, which flushes data to Elasticsearch periodically. The program logs progress and errors throughout.
The Searcher program queries an Elasticsearch index for XKCD comics and loops through the results. It configures an Elasticsearch client and executes a search query for XKCD comics. The response is decoded, and each comic is logged with details such as its ID, score, alt text, day, news, number, and title.
This project is licensed under the MIT License. See the LICENSE file for details.
Happy coding! If you have any questions or need further assistance, feel free to open an issue or contact the repository maintainers.
You can also join the '#meetup-toronto' channel on the Elasticsearch Slack.
Maintainers:
- Jeremy Foran (@jeremyforan)