amundsen-io/amundsen

Would like a guide for How-To deploy Amundsen in production

jornh opened this issue · 23 comments

jornh commented

Please add points on what you expect from such a guide in a comment below. I will then try to consolidate input and draft up an outline in this comment.

The guide can end up as /docs/deployment.md, or is /docs/owners_manual.md better?

Initial outline:

AWS could be common for deployment, possibly using https://aws.amazon.com/ecs/?

jornh commented

Neo4j backup and restore

Install the Neo4j APOC plugin (in a folder next to your example/docker/neo4j/conf/)

	mkdir example/docker/neo4j/plugins
	pushd example/docker/neo4j/plugins
	wget https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/3.3.0.4/apoc-3.3.0.4-all.jar
	popd
	mkdir example/backup

Add volumes for plugins + backup in docker-amundsen.yml:

	      volumes:
	          - ./example/docker/neo4j/conf:/conf
	          - ./example/docker/neo4j/plugins:/plugins
	          - ./example/backup:/backup  
	

Start containers:

docker-compose -f docker-amundsen.yml up

Ingest data via Databuilder.

In the Amundsen frontend web, change descriptions. Maybe add owners…

In the Neo4j web console

CALL apoc.export.cypher.schema('/backup/amundsen_schema.cypher')
CALL apoc.export.graphml.all('/backup/amundsen_data.graphml', {useTypes: true, readLabels: true})

Delete the Neo4j graph (still in the Neo4j web console):

MATCH (n)
DETACH DELETE n

Restore the backup (yep, you guessed it, still in the Neo4j console):

CALL apoc.import.graphml('/backup/amundsen_data.graphml', {useTypes: true, readLabels: true})
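
To run the same export on a schedule rather than by hand, the statements above could be generated with a timestamped file name so that earlier backups are not overwritten. A minimal Python sketch, assuming the /backup mount from the compose file; the function name and the timestamp format are made up for illustration and are not part of Amundsen or APOC, and sending the statements to Neo4j is left to whatever client you use:

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def backup_statements(backup_dir="/backup", stamp=None):
    """Return (schema_export, data_export, data_import) Cypher strings.

    Hypothetical helper: it only builds the APOC calls shown above with a
    timestamp added to the file names. It does not talk to Neo4j.
    """
    # Default to a UTC timestamp so every run gets its own pair of files.
    stamp = stamp or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    base = PurePosixPath(backup_dir)  # paths are inside the Linux container
    schema_file = base / f"amundsen_schema_{stamp}.cypher"
    data_file = base / f"amundsen_data_{stamp}.graphml"
    options = "{useTypes: true, readLabels: true}"
    return (
        f"CALL apoc.export.cypher.schema('{schema_file}')",
        f"CALL apoc.export.graphml.all('{data_file}', {options})",
        f"CALL apoc.import.graphml('{data_file}', {options})",
    )
```

The third statement is the matching restore call for the same file, so a backup and its restore command can be logged together.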

ToDo:

  • Figure out where the CLI/cron job should live: as part of metadata, as shell/cron (wrapped in Airflow), as Databuilder, or as an Airflow Operator
  • Test that adding the volumes works and does not break when the plugin/backup folders do not exist in the repo (or add a KeepFolder file)
  • Check under what circumstances a restore of the schema is needed
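
For the volume item in the list above, one option is to create both folders with a placeholder file before starting the containers, so the mounts always have a directory to bind to. A rough Python sketch; the `.gitkeep` name is only a common convention and the function name is hypothetical:

```python
from pathlib import Path


def ensure_volume_dirs(
    repo_root,
    relative_dirs=("example/docker/neo4j/plugins", "example/backup"),
    keep_name=".gitkeep",  # assumed placeholder name, any file name works
):
    """Create the plugin and backup folders plus an empty placeholder file.

    Returns the list of placeholder paths. Safe to call repeatedly.
    """
    keep_files = []
    for rel in relative_dirs:
        directory = Path(repo_root) / rel
        # parents=True also creates missing intermediate folders.
        directory.mkdir(parents=True, exist_ok=True)
        keep_file = directory / keep_name
        keep_file.touch(exist_ok=True)
        keep_files.append(keep_file)
    return keep_files
```

Committing the placeholder files would keep the folders in the repo, so a fresh clone already has them.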

Related: #196 and slack thread with some script snippets etc

jornh commented

@ttannis we lost access to the useful content of former FE issue https://github.com/lyft/amundsenfrontendlibrary/issues/186 referenced in the snippet shown below.

Can that content be salvaged somehow? E.g. will transferring https://github.com/lyft/amundsenfrontendlibrary/issues/186 in a closed state to here do it?

Basic install of services (in different environments)
Docker-compose “vanilla”, but with Gunicorn, data in volumes etc.
AWS (ECS PR): lyft/amundsenfrontendlibrary#216 (or EC2): lyft/amundsenfrontendlibrary#186
Kubernetes (convert from Compose using https://kompose.io?)

Transferred that closed issue over: #77

jornh commented

Thanks for the quick turnaround on this @ttannis - seems to work nicely!

Also, please extend my thanks to the other Lyft team members for the recent, more systematic focus on grooming PRs etc. I think that will really encourage more people to contribute going forward!

@jornh I have amundsen on aws eks + k8s + helm now; I will put up a PR next week with docs; I'm not sure if it will fully fulfill this story, or, if I should put up another one. wdyt?

jornh commented

Great @javamonkey79! I think it should definitely tick the Kubernetes box above (I edited a bit above).

Just push what you think is suitable to cover Kubernetes on its own and we'll figure the rest out later. Once there are some good pieces of content, it's easy to shuffle them around afterwards if needed.

Right now I'm thinking the list above should end up as just a jump list or "annotated ToC" for what the sys-admin would like/need to know. Haven't really figured out how much or little prose will be needed to glue it together... Thoughts are welcome! 😜

Ok @jornh I've got the PR up here; I've opted to not include the aws setup at this point, as it is tied to our org a bit. I might add it later, if there is enough interest. cc @markgrover @feng-tao

Great suggestions here! I'd like to emphasize the need for more explicit documentation on how to set up Airflow to handle ingestion into ES after edits in Neo4j. From an outsider's perspective, it remains quite a mystery, although Airflow (or something filling this function) is clearly a fourth microservice, indispensable for the other three to work.

jornh commented

@fBedecarrats Airflow has its own documentation. So we’ll probably just reference that.

But the gist of it is:

  1. Set up Airflow. Depending on how “serious business” this is for you, the options range from:
    • just pip install it on a box where it can run, probably in a Python virtual environment of its own for good measure.
    • install it in a container-based setup, including a separate Postgres/other database. A popular/easy Docker image until now has been “puckel” (just google “Airflow puckel” and you'll see it), but just recently the Apache Airflow project itself has started making its own official image. Not even sure if it’s still “beta”.
    • there is, by the way, also the option to get Airflow as a SaaS solution from Astronomer or Google.
  2. pip install amundsen-databuilder and other required dependencies (database drivers) on top of your Airflow.
  3. Add your DAGs (the databuilder PyPI package doesn’t include the examples folder from the repo).
  4. When you upgrade, make sure to keep databuilder and your Amundsen services in sync regarding compatible versions.
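
For the last step above, one way to notice drift is to write down the versions you intend to run and compare them with what is actually installed in the Airflow environment. A small sketch using only the standard library; the function name is hypothetical, and it says nothing about which versions are compatible with each other, it only reports differences from your own pins:

```python
from importlib.metadata import PackageNotFoundError, version


def version_drift(pinned):
    """Compare pinned versions with installed ones.

    `pinned` maps a distribution name to the version you expect.
    Returns {name: (expected, installed)} for every mismatch, where
    installed is None if the package is not installed at all.
    """
    drift = {}
    for package, expected in pinned.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            drift[package] = (expected, installed)
    return drift
```

An empty result means the environment matches the pins; anything else is worth a look before the next ingestion run.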

@jornh Thanks a ton for putting those instructions together - I'm currently investigating how to implement Amundsen and backup / restore was high on the list.

Is there a good way to have Elasticsearch re-index data that was restored into Neo4j? I'm getting search errors after a Neo4j database restore even though I see the expected data post-restore in the Neo4j console.

I found that I can re-run the amundsendatabuilder job on the same data source and my restored data appears on the FE again, but that seems like a hackjob.

jornh commented

It’s merely a wishlist 🙂 (with links to the “state of the union”, but luckily, bit by bit, I can tick boxes). Glad to hear the list is useful to someone. So, thanks for your comment.

To answer your question: Elasticsearch and Amundsen search don’t have a will of their own about what data to serve. So what you call a hack, reindexing by re-ingesting through Databuilder, is actually the way to update ES data. I think that’s okay for a, hopefully rare, restore scenario. Hope that clarifies...

Do you have ideas for a different way?

@jornh I have amundsen on aws eks + k8s + helm now; I will put up a PR next week with docs; I'm not sure if it will fully fulfill this story, or, if I should put up another one. wdyt?

I'd be interested in the helm chart.

jornh commented

@stewartbryson see https://www.amundsen.io/amundsen/k8s_install/ + the Amundsen Slack also has a #kube-helm channel for discussion

Thanks for the clarification, @jornh - if that's the best way to go about a restore scenario, then that works for us. :)

I honestly don't have any other ideas; I barely have the skillset to implement Amundsen, much less understand the inner workings 😅. Again, really appreciate the help and your documentation!

Hi, we have been trying to stand up Amundsen on Kubernetes but can't get the pod for Neo4j to deploy... Did anyone else have this problem?

I'm going to pick this up. I think this will be a nontrivial project, mostly in the form of soliciting feedback from the community. Part of the appeal of Amundsen is its flexibility: there's no one right way to install it. However, for a guide to be broadly useful, I believe it needs to have concrete steps. As a result, we'll need to make some opinionated decisions in order for the guide to be useful.

Here's how I'm planning on structuring this project:

  1. Create a skeleton of docs following @jornh's already-excellent outline. I will fill some of the "easier" details, and will leave anything nontrivial with as specific a TODO as I can. I will solicit community feedback on this doc in a PR. I'd like to land it into a feature branch.
  2. Based on feedback in (1), I will modify structure if needed. Additionally, I will fill in nearly all of the TODOs, including ready-to-run commands. I will open another PR and invite another round of community feedback (now that there is more substance to disagree with 😄 )
  3. Once I address the feedback from (2), I will ask for one final round of feedback. In particular, I'd like to get at least one community member to run through all of the instructions command-by-command to ensure that the guide actually does what it says on the tin. At this point, I would like to merge it to mainline and promote the guide on the main readme. It will not replace installation.md (that guide is appropriate for someone who is just trying to get the thing working without source control or customizations), but instead will supplement it.

If anyone has thoughts about this process, happy to hear.

There's some question as to which docs should be in the top repo vs service repos. My only strong feeling is that there be a single top-level doc that one can follow and find everything they need. Procedurally, it's much easier to make changes to the docs if they're all in one repo, rather than scattered between them. And given that the individual components aren't super useful when used independently, I default to just putting it into the larger repo. Open to feedback.

jornh commented

@dorianj that sounds like an awesome plan! I'll refrain from giving more feedback until you have passed step 1. 😉

hey -- we've packaged some of the learnings from this thread and other places into a recommended pathway https://medium.com/stemma/amundsen-deployment-best-practices-740a1800518e -- would love anyone who's worked through this stuff to try it out and give feedback, we'd like to eventually get this upstreamed into main repo once it's better battle tested

Hi, we finally decided to start working with Apache Atlas. I guess we'll consider later adopting Amundsen as an alternative front-end.

Does anyone use Ansible roles for deploying and managing Amundsen? I could share mine if that is of any interest (on-premise compose installation).

@dorianj, could you guide me on installing Amundsen without Docker? With Docker requiring a paid or enterprise license for commercial use, enterprises would lose the benefits of open-source usage.

Any suggestions? I'd appreciate support here.

A year has passed, and it is still not clear how to set up auth.