onthegomap/planetiler

CI on planet


I rent a Hetzner 12-core, 128 GB dedicated host. Sometimes I do planet rendering runs, but most of the time it sits idle.

We could use this machine for planet scale ci if you are interested, @msbarry.

Cool thanks! What do you think would make the most sense? Run it nightly against main? Or try to run against branches?

I think the instance could only handle one run at a time, and the openmaptiles profile would take roughly 6 hours to complete. So a nightly run based on the code in main could make sense. What do you think?

I will see if I can set it up on my planetiler fork, and if it works, I'll share the results here...

OK that sounds good to me! Feel free to submit a PR, or let me know if I need to add any credentials to the repo to get this to work.

Monaco on workflow_dispatch worked well. Now I am running it on the planet in my fork:

https://github.com/wipfli/planetiler/actions/runs/4450904601/jobs/7816916993

Let's see how that goes...

Ran out of disk space. Cleaned up some files, and here is the new run: https://github.com/wipfli/planetiler/actions/runs/4450904601/jobs/7817607121

After 4h36min, yesterday's planet run on my GitHub Actions self-hosted runner finished. See the link above for details. The workflow uploads log.txt, containing all the planetiler log output, as an artifact.

What should I do now? Should I check that the log file does not contain any warnings or errors?

Hmm I guess what I'd be interested in is if it finishes successfully, the summary of timings at the end, and any warnings/errors it emits (except for maybe tile size warnings). This prints all of the warnings for me:

cat logs.txt | grep -E '^[:0-9]+ [^DI]' | grep -v uncompressed

And this is what I use to create a file with the summary when I test each branch:

cat logs.txt | sed -n '/^.*Tile stats/,$p' > branchsummary.txt
cat logs.txt | sed -n '/^.*Exception in thread/,$p' >> branchsummary.txt
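If the same summary were also produced from main's log (mainsummary.txt here is a hypothetical file name, generated the same way), the two could be diffed to spot regressions, along these lines:

```shell
# mainsummary.txt / branchsummary.txt are hypothetical files produced by the
# sed commands above on main's and the branch's logs respectively.
# diff exits 0 when the summaries match, so the message only prints then.
if diff mainsummary.txt branchsummary.txt; then
  echo "no summary changes"
fi
```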

I'm not sure what it would make sense to do with those logs though? Since it's not attached to a PR that it can post them to or anything...

I am not sure what to do with them either. Since the compute is not sufficient for running on every push to main, let alone on every push to a pull request, I think running it periodically is the best solution.

We could define some criteria for failure such that the workflow would exit as failed and create an alert notification. Some criteria for failure which come to mind are:

  • produces an error
  • produces a warning, a new type of warning, or more warnings than usual
  • is significantly slower than usual

In general, I think it would be cool to have some automatic alerts for things like #511

Hmm, I wonder if it could create a new issue on warnings or failures, keep updating that issue each time it runs until you close it, and then open a new issue if it fails again later?
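If the runner has the GitHub CLI available, that could look roughly like the sketch below; the planet-ci label, the titles, and the report_planet_failure name are all hypothetical, and gh would need a token with permission to manage issues:

```shell
# report_planet_failure RUN_URL
# Comments on an existing open "planet-ci" issue, or opens a new one.
# Assumes gh is authenticated (e.g. GH_TOKEN in the workflow) and that a
# hypothetical "planet-ci" label exists on the repo.
report_planet_failure() {
  local run_url="$1" issue
  # Pick the first open issue with the label, if any.
  issue=$(gh issue list --label planet-ci --state open \
            --json number --jq '.[0].number // empty')
  if [ -n "$issue" ]; then
    gh issue comment "$issue" --body "Planet run failed again: $run_url"
  else
    gh issue create --label planet-ci \
      --title "Nightly planet run failed" \
      --body "Planet run failed: $run_url"
  fi
}
```

Closing the issue would then naturally cause the next failure to open a fresh one.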

Alternatively it could send an email or slack notification? Or if the action just fails will people subscribed to the repo get an email?

When actions fail on my repos, I get a notification at github.com/notifications. You can subscribe to those by email, but I find checking the GitHub notifications page easier.

I think it would be enough to get the notification from the failed workflow run.

Sounds good to me!