Elasticsearch as a hard requirement
vorpalhex opened this issue · 17 comments
Currently ES is listed in the tech stack. ES is a great tool and it's absolutely best in class for its purpose of enabling powerful search, but as someone who has maintained a whole mess of ES clusters, I can say it also comes with a lot of cost and complexity.
ES requires multiple nodes, its authentication mechanisms are expensive and require enterprise licensing, and there are limited SaaS hosts available. In addition, it adds significant hosting complexity (shard rebalances, Kibana hosting, etc.) and makes development and tests very complex, even with Docker.
If one of the goals of chapter is to be easy and free-ish for small orgs to host, then ES shouldn't be a hard requirement. It can be an upgrade to enable more powerful search for dedicated hosts, but we shouldn't design against ES as a requirement. There are several other ways to enable meaningfully powerful search, including a solid full text search capability in Postgres.
By leveraging alternatives, we can significantly lower the barrier to getting involved in development and make small-scale hosting of instances much easier, without compromising MVP features.
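To make the Postgres full text search option a bit more concrete, here is a rough sketch of what a ranked search query could look like. The `chapters` table and its `name` and `description` columns are hypothetical (nothing in this thread defines a schema), and the `%s` placeholders assume a psycopg-style driver. The function only builds the SQL and parameters, so it can be read and checked without a database.

```python
from typing import Tuple


def build_search_query(term: str, limit: int = 20) -> Tuple[str, tuple]:
    """Build a parameterized Postgres full text search query.

    ASSUMPTION: a hypothetical `chapters` table with `id`, `name` and
    `description` columns. Adjust to the real schema.
    """
    # The same document expression is used for matching and for ranking.
    document = "to_tsvector('english', name || ' ' || coalesce(description, ''))"
    sql = (
        "SELECT id, name, "
        f"ts_rank({document}, plainto_tsquery('english', %s)) AS rank "
        "FROM chapters "
        f"WHERE {document} @@ plainto_tsquery('english', %s) "
        "ORDER BY rank DESC "
        "LIMIT %s"
    )
    # The search term is passed twice: once for ranking, once for matching.
    return sql, (term, term, limit)
```

With a driver such as psycopg this would be run as `cursor.execute(sql, params)`. On larger tables one would typically add a GIN index over the same `to_tsvector` expression, but for MVP-sized data the plain query is probably enough.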
@vorpalhex thanks for your input! This is definitely feedback to consider when making final decisions on the stack.
Here are a couple of good articles on postgres and search:
I'm sure people know, but to be explicit you can buy ES as a service from AWS. This way you pay with money, not with work:
https://aws.amazon.com/elasticsearch-service/
Here's a fun read on Amazon Elasticsearch - AWS Elasticsearch: a fundamentally-flawed offering
-1 on ES as a hard requirement as well, particularly for MVP. While it's easy enough to grab the Docker images for ES, include it in Compose, and get a dev setup up and running, doing the same in an environment that's visible from the Internet is a bit more fraught/easier to get wrong. Or you pay AWS to do it, hope you get IAM roles/VPCs right, and trade some problems for others.
Versus having the fuzzy searching etc. fun that ES provides as an optional add-on, where the base infrastructure requirement for the app is "an app server that runs node" and "a database server that runs Postgres"...AKA fewer things to break :)
github.com/valeriansaliou/sonic might actually be a decent alternative to elasticsearch, given its simplicity, the low bar for search (meetup.com pretty much only does tags and order-independent single word matching), and 'functions on a potato' resource requirements.
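To illustrate how low that bar is, here is a minimal sketch of order-independent single word matching using a plain in-memory inverted index. This is not how Sonic is implemented; it is only an illustration of the matching behaviour described above, and all names and sample data are made up.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each word to the set of document ids that contain it."""
    index: Dict[str, Set[str]] = {}
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return ids of documents containing every query word, in any order."""
    words = tokenize(query)
    if not words:
        return []
    matches = None
    for word in words:
        ids = index.get(word, set())
        # Intersect so that only documents with all words remain.
        matches = ids if matches is None else matches & ids
    return sorted(matches)
```

For example, with groups named "Austin JavaScript Meetup" and "Python Austin", the queries "javascript austin" and "austin javascript" both return only the first group.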
I agree with @jackbravo and @vorpalhex. Postgres is already a decided part of the stack and it often doesn't get enough credit for what it can handle. We should try to utilize it until we outgrow it. Adding something like ES just increases upfront complexity.
I think one of the express pros of ES is that it can be used to search across all instances of chapter (see #33) - I'm not experienced enough with search tools to know how this plays in, but I think we should consider this when choosing a search tool.
Even at the scale of tens of thousands of organizations I don't think ES buys that much for searching when compared to Postgres, esp. if the main concern is just searching organization name or location.
I'd also like to point out this comment about Postgres being overkill if it's a self-hosted solution: #54 (comment)
If the intention of the project is to bundle up a simple, one-click container install then is it even possible to include ES? The OP said it's costly and complicated. If the reason we're building a distributed app is because Meetup isn't free then it seems like anything besides a hosting cost is counter to the project.
I'm not experienced with Postgres, but folks I know in town that use it won't stop talking about all the magic it can do, so presumably it has a solid search capability.
I also think that ES is good for searching across chapters worldwide but not necessary for local meetup groups. Most of the time Postgres is enough to support a web app.
I also think using https://github.com/valeriansaliou/sonic could be a great alternative since it is lightweight and easier and cheaper to host.
I agree with others, and think it would probably be good to wait until there are enough things to search for using ES, before adding it.
I think ultimately having it is a good longer term goal, and I've been having similar ideas about having an ES instance that allows searching across all groups worldwide (like a Google for chapters almost). But (and keeping in mind I haven't checked how much data is in the app yet if any), I don't feel like ES really brings all that much if there are only like 20 records to search through.
a more simple stack
If a relational database is already being used, it might be worth reducing the stack by one component (no Elastic). I am not familiar with full text search on MySQL or PostgreSQL, so I can only compare them with Elastic based on the online docs.
free auth plugin for elastic
If needed, there is a free authentication plugin for Elastic (or you can write your own auth plugin, or any other plugin, in Java in any case). https://readonlyrest.com/free/
ease of use
We have been using Elastic for a year. I think the full ELK stack is not needed in this case if you only want Elastic itself (the search feature via HTTP REST), so no Kibana, no Logstash, etc.
We have also written our own management scripts (bash/sed) for Elastic. I like the fact that no driver is needed; you just talk to Elastic via HTTP REST.
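As a rough illustration of the "no driver needed" point, here is a sketch that builds a search request against the standard `_search` endpoint using only the Python standard library. The base URL, index name and field name are placeholders, not anything decided in this thread, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_search_request(
    base_url: str, index: str, field: str, text: str
) -> urllib.request.Request:
    """Build a POST request for a simple `match` query.

    ASSUMPTION: `base_url`, `index` and `field` are placeholders for
    whatever a real deployment would use.
    """
    body = json.dumps({"query": {"match": {field: text}}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it would be a matter of passing the request to `urllib.request.urlopen` and parsing the JSON response. Any authentication the cluster needs would have to be added as headers.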
I think everyone is on board with ES not being part of an MVP / hard requirement. I've posted on #47, and unless a good reason surfaces to keep this open, we'll close it.
Thanks for everything you all have shared here - especially @vorpalhex for broaching the subject and @jackbravo for the links.
You all have successfully swayed me to the "let's stick with Postgres for now" approach.
If it becomes clear that we need something more powerful later on, we can consider adding Elasticsearch. But for our MVP we shouldn't include it.
We'll want to remove Elasticsearch from the README. I'll try to submit a PR that updates the docs with the most recent changes.