help debugging... tau was working great but suddenly stopped
rexroof opened this issue · 6 comments
hello! really love this software package. I've been using it since early March with a really simple golang client to execute scripts based on events.
I wasn't pulling any updates, but a week or two ago it stopped working. I was able to trigger local test events, but real events coming from Twitch weren't coming through.
I'm running the whole thing with a local docker-compose setup using your yaml file.
I usually start the software just before my stream and stop it when I'm finished.
after it stopped working I pulled down the latest updates. I started with a fresh .env file based on the example and updated the values. I also wiped my local db volume to start from scratch, and I registered a new Twitch chat application to see if that could be it.
is there a way to enable debug logs so that I can see the code register webhooks?
is it possible that I have too many webhooks registered on my account and need to de-register some?
any suggestions for debugging?
small aside, what is the /streamers section in the webui meant to be used for?
thanks for this project, it's really useful!
I'm glad you've found TAU useful thus far. Sorry to hear it suddenly stopped working. Because supervisord is used within the docker container to run both the worker service and the server, it is a little bit tricky to get at the logs. (This is definitely something I am looking to make easier; if there are any supervisord experts out there, please feel free to chime in.) You can try the following to get access to the wsworker logs:
- After running `docker-compose up`, open a new terminal window. If you run `docker container ls` you should see a `tau-app` container running.
- Next, exec into the `tau-app` container to get a bash shell prompt. This can be done with `docker exec -it tau-app bash`. You should see a bash prompt that looks something like `root@83bfad802c02:/code#`.
- The logs are currently in /tmp. You'll want to look at a set of log files named something like `server-stdout---supervisor-VFXCET.log`, `server-stderr---supervisor-7B64Vb.log`, `wsworker-stderr---supervisor-ytv8FB.log`, and `wsworker-stdout---supervisor-UXOdZi.log`. If you are having issues with the incoming connections from Twitch, the `wsworker` files are the ones you'll want to look at.
- Feel free to post anything that looks off in those files. (The whole sequence is condensed into a few commands below.)
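Condensed into a quick command sequence (the random suffixes in the log file names will differ on your machine, so the globs below are just a convenience):

```bash
# Assumes the compose stack is already up and the container is named tau-app.
docker container ls              # confirm tau-app is running
docker exec -it tau-app bash     # open a shell inside the container

# ...then, inside the container:
ls /tmp/*supervisor*.log                    # list the supervisord log files
tail -f /tmp/wsworker-*---supervisor-*.log  # follow the wsworker output
```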
To answer your question: yes, there is a limit to the number of webhooks and websocket connections you can have open. However, each time you fire up the TAU container, it should be clearing out all old webhooks (and the websocket connections are automatically closed when the TAU container shuts down).
The /streamers page is some new code intended for people who want to write things like Discord bots, where they want notifications of multiple streamers going live/offline. I'll be adding some documentation on how to use it soon.
Thanks for using TAU and bearing with me as some of these early bugs get ironed out! Let me know if you find anything in the logs (or if you don't find anything). The other possibility to look at would be whether your ngrok tunnel is opening properly; that should show up in those wsworker logs as well.
DOH! The logs led me down the path to fixing my setup. I hadn't logged into the web UI again after resetting the DB, so I needed to finish setting up my user and my auth token. Thanks!
Running multiple services in a container under supervisord is a bit of an anti-pattern. Ideally each of those processes would run in its own container instance, even if they were copies of the same container image; that would also expose all of your logs. Let your container executor manage the processes instead of supervisord (a rough sketch of what that could look like is below).
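For illustration, here is one way that split could look in the compose file. The image name and commands are placeholders, not TAU's actual entrypoints:

```yaml
# Hypothetical layout: one process per service, both running the same image.
services:
  app:
    image: tau-app-image      # placeholder for whatever image tau-app uses today
    command: start-server     # placeholder for the command that runs the web server
    env_file: .env
  worker:
    image: tau-app-image      # same image, separate process
    command: start-wsworker   # placeholder for the command that runs the wsworker
    env_file: .env
```

That way `docker-compose logs worker` gives you the wsworker output directly, with no supervisord in the middle.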
Also, I'd remove the container_name directives from your docker-compose file unless there is some reason they are required. They make it confusing to run docker-compose commands that target specific services.
Glad you got it sorted out!
Regarding multiple services in one container. I have struggled with its anti-pattern-ness a bit, and when run locally, it really isn't an issue to just add a separate worker container, and I likely will do so soon. However, where I struggle is for those who want to run this using a container hosting service, such as Render, Vultr, or Azure Containers. In this case, you are charged per container you spin up, so limiting the number of containers is more cost effective. Thus, I've got a separate single-container build, that uses sqlite instead of postgres, and runs Redis, the worker service, and the server all in a single container. Definitely not ideal, but it makes it possible to easily and cost effectively deploy to the cloud. If you have any ideas on how to do this in a less anti-pattern way, I am all ears.
There was a reason I used container_name directives, but at the moment I can't recall what it was. I'll definitely look into that a bit more as well! Thanks for the feedback, and please feel free to chime in if you have any more problems. (I'm going to close this issue since your problem seems to be resolved, but feel free to post any comments regarding the above.)
One other comment: thinking about it a bit, I believe I know why your original install stopped working, and it was likely too many webhooks. In the first couple of commits (which would have been around the time you gave TAU a try) I had a bug in the code that deletes webhooks, so they weren't all being removed. Eventually you would run out of available webhooks, and an issue like the one you were describing would happen. This is now fixed, so it shouldn't happen again in the future.
that's great to know. seriously, thanks a lot!
Here is one way to consolidate all of the supervisor logs so that they appear on the docker container's stdout:
https://stackoverflow.com/a/21371113
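If I'm reading that answer right, the gist is to point each supervisord program's logs at the container's stdout/stderr and disable rotation, so `docker logs` (and `docker-compose up`) picks them up. A minimal sketch, showing only the logging keys and using the program names implied by the log files above:

```ini
; Add these keys to the existing [program:...] sections in supervisord.conf.
[program:server]
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0

[program:wsworker]
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
```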
Thanks! I'll definitely implement that very soon!