Atom table limit hit if `riak admin` called regularly
Bob-The-Marauder opened this issue · 3 comments
One of our customers found an issue with KV 3.0.3 where the atom table slowly becomes exhausted if `riak admin` is called regularly, e.g. when polling `riak admin status` for monitoring purposes. This was traced to a problem with relx in pre-OTP 23 builds. We have filed the following PR: erlware/relx#868
Here is a brief example showing the atom count increasing:
```
[root@localhost riak]# riak start
[root@localhost riak]# riak attach
Attaching to /tmp/erl_pipes/riak@127.0.0.1/erlang.pipe.1 (^D to exit)
(riak@127.0.0.1)1> erlang:system_info(atom_count).
52654
(riak@127.0.0.1)2> [Quit]
[root@localhost riak]# riak admin cluster status
---- Cluster Status ----
Ring ready: true
+--------------------+------+-------+-----+-------+
|        node        |status| avail |ring |pending|
+--------------------+------+-------+-----+-------+
| (C) riak@127.0.0.1 |valid |  up   |100.0|  --   |
+--------------------+------+-------+-----+-------+
Key: (C) = Claimant; availability marked with '!' is unexpected
[root@localhost riak]# riak attach
Attaching to /tmp/erl_pipes/riak@127.0.0.1/erlang.pipe.1 (^D to exit)
(riak@127.0.0.1)2> erlang:system_info(atom_count).
52656
```
Although such a small increment should not cause any issues on its own, when `riak admin status` is polled regularly, 24 hours a day, it slowly adds up until you finally hit the 1 million atom limit (the default maximum atom table size is 1,048,576) and Riak crashes. The current workaround is to restart Riak before the atom count gets too high.
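A back-of-envelope sketch of how long the leak takes to crash a node, using the numbers observed in this issue and the default atom limit. The one-poll-per-minute rate is an assumption purely for illustration:

```python
# Back-of-envelope estimate of time until the atom table fills.
# Assumptions (illustrative, taken from the transcript above):
#   - each `riak admin` call leaks ~2 atoms (52654 -> 52656)
#   - monitoring polls once a minute, 24 hours/day (assumed rate)
#   - the default Erlang atom limit is 1,048,576 ("the 1 million mark")

ATOM_LIMIT = 1_048_576
BASELINE = 52_654          # atom count observed right after startup
LEAK_PER_CALL = 2          # atoms added per `riak admin` invocation
POLLS_PER_DAY = 24 * 60    # one poll per minute

headroom = ATOM_LIMIT - BASELINE
calls_until_crash = headroom // LEAK_PER_CALL
days = calls_until_crash / POLLS_PER_DAY
print(calls_until_crash)   # 497961 polls
print(round(days, 1))      # roughly 345.8 days
```

At a faster poll interval (say, every 10 seconds) the same arithmetic gives well under two months, which matches the "slowly adds up" behaviour described above.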
Ahh interesting. I'm sure I've heard this same problem talked about before.
Sounds like it might be something along the lines of using `list_to_atom/1` when creating a random maint shell name, which I think would occur every time `riak admin` is called.
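For illustration, here is a Python analogy (hypothetical names, not Riak's or relx's actual code) of why calling `list_to_atom/1` on a freshly generated name leaks: in Erlang, atoms are interned in a VM-global table and never garbage collected, whereas `list_to_existing_atom/1` refuses to create new entries:

```python
# Python analogy of Erlang's atom table (illustrative only).
# Erlang's list_to_atom/1 always interns the name; atoms are never
# garbage collected, so every unique name grows the table permanently.
# list_to_existing_atom/1 only reuses atoms that already exist.

ATOM_TABLE = set()  # stands in for the VM-global, never-GC'd atom table

def list_to_atom(name):
    # always interns: a fresh name grows the table permanently
    ATOM_TABLE.add(name)
    return name

def list_to_existing_atom(name):
    # safe variant: raises instead of creating a new atom
    if name not in ATOM_TABLE:
        raise ValueError("badarg: atom does not exist")
    return name

# A randomly named maint shell per admin call leaks one atom each time:
for n in range(3):
    list_to_atom(f"maint_{n}")
print(len(ATOM_TABLE))  # 3 -- grows by one on every call
```

This is why the usual advice is to avoid `list_to_atom/1` on dynamically generated strings in long-running systems, or to use a fixed name so the atom is only ever created once.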
Ignore me, didn't read properly - see that it's already been dug into and the guilty code found and fixed.
We made these changes locally and, although there does seem to be some improvement, they do not fully fix the issue. We're still trying to find the source of the problem.