emqx/emqx-auth-redis

query some data failed in cluster mode

sekfung opened this issue · 12 comments

Environment

  • OS: macOS 10.15.7
  • Erlang/OTP: Erlang/OTP 22
  • EMQ: v4.2.3

Description

I modified the emqx_auth_redis.conf file to use Redis Cluster. After I recompiled emqx-rel and loaded the plugin, I got this error:

[error] <<"1912426236|securemode=3,timestamp=1072916711,signmethod=hmacsha1|">>@127.0.0.1:54918 [Hooks] Failed to execute {fun emqx_auth_redis:check/3,[#{auth_cmd => "HMGET mqtt_user:%u password",hash_type => plain,pool => emqx_auth_redis,super_cmd => "HGET mqtt_user:%u is_superuser",timeout => infinity,type => cluster}]}: {badarg,[{erlang,element,[1921,undefined],[]},{eredis_cluster_monitor,get_pool_by_slot,2,[{file,"eredis_cluster_monitor.erl"},{line,70}]},{eredis_cluster,query,4,[{file,"eredis_cluster.erl"},{line,116}]},{emqx_auth_redis,check,3,[{file,"emqx_auth_redis.erl"},{line,40}]},{emqx_hooks,safe_execute,2,[{file,"emqx_hooks.erl"},{line,164}]},{emqx_hooks,do_run_fold,3,[{file,"emqx_hooks.erl"},{line,143}]},{emqx_access_control,authenticate,1,[{file,"emqx_access_control.erl"},{line,77}]},{emqx_channel,auth_connect,2,[{file,"emqx_channel.erl"},{line,1181}]}]}

Looking forward to your help. Thx.

Hi @sekfung, this looks like a bug in the plugin. Could you share the contents of your emqx_auth_redis.conf?

Of course, @HJianBo.
I only modified the values of auth.redis.type, auth.redis.server, and auth.redis.password:

##--------------------------------------------------------------------
## Redis Auth/ACL Plugin
##--------------------------------------------------------------------
## Redis Server cluster type
## single    Single redis server
## sentinel  Redis cluster through sentinel
## cluster   Redis through cluster
auth.redis.type = cluster

## Redis server address.
##
## Value: Port | IP:Port
##
## Single Redis Server: 127.0.0.1:6379, localhost:6379
## Redis Sentinel: 127.0.0.1:26379,127.0.0.2:26379,127.0.0.3:26379
## Redis Cluster: 127.0.0.1:6379,127.0.0.2:6379,127.0.0.3:6379
auth.redis.server = 192.168.1.64:7001,192.168.1.64:7002,192.168.1.64:7003,192.168.1.64:7004,192.168.1.64:7005,192.168.1.64:7006

## Redis sentinel cluster name.
##
## Value: String
## auth.redis.sentinel = mymaster

## Redis pool size.
##
## Value: Number
auth.redis.pool = 8

## Redis database no.
##
## Value: Number
auth.redis.database = 0

## Redis password.
##
## Value: String
auth.redis.password = 123456

## Redis query timeout
##
## Value: Duration
## auth.redis.query_timeout = 5s

## Authentication query command.
##
## Value: Redis cmd
##
## Variables:
##  - %u: username
##  - %c: clientid
##  - %C: common name of client TLS cert
##  - %d: subject of client TLS cert
##
## Examples:
##  - HGET mqtt_user:%u password
##  - HMGET mqtt_user:%u password
##  - HMGET mqtt_user:%u password salt
auth.redis.auth_cmd = HMGET mqtt_user:%u password

## Password hash.
##
## Value: plain | md5 | sha | sha256 | bcrypt
auth.redis.password_hash = plain

## sha256 with salt prefix
## auth.redis.password_hash = salt,sha256

## sha256 with salt suffix
## auth.redis.password_hash = sha256,salt

## bcrypt with salt prefix
## auth.redis.password_hash = salt,bcrypt

## pbkdf2 with macfun iterations dklen
## macfun: md4, md5, ripemd160, sha, sha224, sha256, sha384, sha512
## auth.redis.password_hash = pbkdf2,sha256,1000,20

## Superuser query command.
##
## Value: Redis cmd
##
## Variables:
##  - %u: username
##  - %c: clientid
##  - %C: common name of client TLS cert
##  - %d: subject of client TLS cert
auth.redis.super_cmd = HGET mqtt_user:%u is_superuser

## ACL query command.
##
## Value: Redis cmd
##
## Variables:
##  - %u: username
##  - %c: clientid
auth.redis.acl_cmd = HGETALL mqtt_acl:%u

## Redis ssl configuration.
##
## Value: on | off
#auth.redis.ssl = off

## CA certificate.
##
## Value: File
#auth.redis.cafile = path/to/your/cafile

## Client ssl certificate.
##
## Value: File
#auth.redis.certfile = path/to/your/certfile

## Client ssl keyfile.
##
## Value: File
#auth.redis.keyfile = path/to/your/keyfile

Maybe the list of servers is not reachable... Could you check it?
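
One way to check this from an Erlang shell is sketched below. It assumes the auth.redis.server value uses only the Host:Port form (the bare Port form is not handled), and the module name redis_reach_sketch and its functions are hypothetical, not part of EMQ X or eredis_cluster. A successful TCP connect only shows that the port is open; it says nothing about authentication or cluster health.

```erlang
-module(redis_reach_sketch).
-export([parse_servers/1, probe/2]).

%% Split "Host:Port,Host:Port" (the auth.redis.server format) into a
%% list of {Host, Port} tuples. Assumption: every entry has a host part.
parse_servers(Str) ->
    [begin
         [Host, Port] = string:split(string:trim(Entry), ":", trailing),
         {Host, list_to_integer(Port)}
     end || Entry <- string:split(Str, ",", all)].

%% Attempt a plain TCP connect to every listed node and report the result
%% per node as {Host, Port, ok | {error, Reason}}.
probe(Str, TimeoutMs) ->
    [{Host, Port, connect(Host, Port, TimeoutMs)}
     || {Host, Port} <- parse_servers(Str)].

connect(Host, Port, TimeoutMs) ->
    case gen_tcp:connect(Host, Port, [binary, {active, false}], TimeoutMs) of
        {ok, Sock} ->
            gen_tcp:close(Sock),
            ok;
        {error, Reason} ->
            {error, Reason}
    end.
```

After compiling the module, calling redis_reach_sketch:probe("192.168.1.64:7001,192.168.1.64:7002", 2000) would return one result tuple per node.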

@HJianBo The servers are reachable. The same cluster works in my other project.

I guess that if the servers were not reachable, the plugin could not be loaded at all.

Yes, that is how it should behave. But I just checked and found that unreachable servers do not cause the plugin to fail to start!

That is bad behavior. We will try to improve it in a later version.

OK. Maybe the issue title is misleading. The plugin starts successfully; the bug occurs when it queries data.
Anyway, thanks for your help.

@HJianBo Maybe I can provide more detail.

I added a print of the State to the get_pool_by_slot function of the eredis_cluster_monitor module, so it is logged whenever the plugin queries data:
https://github.com/emqx/eredis_cluster/blob/61eedf2eb66dbfaec9b2b4eb7605a252f3d44daa/src/eredis_cluster_monitor.erl#L70

get_pool_by_slot(Slot, State) when is_integer(Slot) ->
    io:format("get_pool_by_slot, Slot(~p)\nState(~p)\n", [Slot, State]),
    Index = element(Slot+1,State#state.slots),
    Cluster = element(Index,State#state.slots_maps),
    if
        Cluster#slots_map.node =/= undefined ->
            {Cluster#slots_map.node#node.pool, State#state.version};
        true ->
            {undefined, State#state.version}
    end;

Here is the log

get_pool_by_slot, Slot(1920)
State({state,[{node,"192.168.1.64",7001,undefined},
              {node,"192.168.1.64",7002,undefined},
              {node,"192.168.1.64",7003,undefined},
              {node,"192.168.1.64",7004,undefined},
              {node,"192.168.1.64",7005,undefined},
              {node,"192.168.1.64",7006,undefined}],
             undefined,{},1,emqx_auth_redis,0,"123456",8,0,[]})

I found that both the slots table and the pool of every node are undefined. get_pool_by_slot therefore calls element(Slot+1, undefined), which raises the badarg from the first error (element(1921, undefined) for slot 1920).

I don't know why the slots are undefined. I am not very familiar with Erlang, so I would appreciate any help with this problem.
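
The failure described above can be reproduced in isolation. The following is a minimal sketch, not code from eredis_cluster: the module name slot_lookup_sketch and the safe_lookup/2 function are hypothetical, and unsafe_lookup/2 only mirrors the indexing step of the get_pool_by_slot function quoted earlier.

```erlang
-module(slot_lookup_sketch).
-export([unsafe_lookup/2, safe_lookup/2]).

%% Mirrors the indexing step in get_pool_by_slot/2: slot numbers are
%% 0-based while tuple positions are 1-based, hence Slot + 1.
%% Crashes with badarg when Slots is not a tuple (for example undefined).
unsafe_lookup(Slot, Slots) ->
    element(Slot + 1, Slots).

%% Hypothetical guarded variant: returns an error tuple instead of
%% crashing when the slot table was never populated or is too short.
safe_lookup(Slot, Slots)
  when is_integer(Slot), is_tuple(Slots),
       Slot >= 0, Slot < tuple_size(Slots) ->
    {ok, element(Slot + 1, Slots)};
safe_lookup(_Slot, _Slots) ->
    {error, slots_not_initialized}.
```

With the logged state, unsafe_lookup(1920, undefined) fails the same way as the stack trace, while safe_lookup(1920, undefined) returns {error, slots_not_initialized}. A guard like this would only make the error easier to read; it would not explain why the slot table is never filled in.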

Perhaps we can trace the Redis connection and see whether it works. After starting emqx, you can run:

bin/emqx attach

1> dbg:tracer(),dbg:p(all,call).
2> dbg:tp(eredis_client,x).

Then try to restart the emqx_auth_redis plugin

Here is the log

(emqx@127.0.0.1)1> dbg:tracer(),dbg:p(all,call).
{ok,[{matched,'emqx@127.0.0.1',320}]}
(emqx@127.0.0.1)2> dbg:tp(eredis_client,x).
{ok,[{matched,'emqx@127.0.0.1',13},{saved,x}]}
(emqx@127.0.0.1)3> (<0.1790.0>) call eredis_client:module_info(attributes)
(<0.1790.0>) returned from eredis_client:module_info/1 -> [{behaviour,
[gen_server]},
{vsn,"4.2.3"}]
(<0.1790.0>) call eredis_client:module_info(attributes)
(<0.1790.0>) returned from eredis_client:module_info/1 -> [{behaviour,
[gen_server]},
{vsn,"4.2.3"}]
(<0.1790.0>) call eredis_client:module_info(attributes)
(<0.1790.0>) returned from eredis_client:module_info/1 -> [{behaviour,
[gen_server]},
{vsn,"4.2.3"}]
2020-11-27 15:13:44.277 [error] <<"1912426236|securemode=3,timestamp=1072916711,signmethod=hmacsha1|">>@127.0.0.1:49701 [Hooks] Failed to execute {fun emqx_auth_redis:check/3,[#{auth_cmd => "HMGET mqtt_user:%u password",hash_type => plain,pool => emqx_auth_redis,super_cmd => "HGET mqtt_user:%u is_superuser",timeout => infinity,type => cluster}]}: {{case_clause,[]},[{eredis_cluster_monitor,get_state,1,[{file,"eredis_cluster_monitor.erl"},{line,49}]},{eredis_cluster_monitor,get_pool_by_slot,2,[{file,"eredis_cluster_monitor.erl"},{line,79}]},{eredis_cluster,query,4,[{file,"eredis_cluster.erl"},{line,116}]},{emqx_auth_redis,check,3,[{file,"emqx_auth_redis.erl"},{line,40}]},{emqx_hooks,safe_execute,2,[{file,"emqx_hooks.erl"},{line,164}]},{emqx_hooks,do_run_fold,3,[{file,"emqx_hooks.erl"},{line,143}]},{emqx_access_control,authenticate,1,[{file,"emqx_access_control.erl"},{line,77}]},{emqx_channel,auth_connect,2,[{file,"emqx_channel.erl"},{line,1181}]}]}

And now it looks like a different problem... amazing!

Fine... I rolled back emqx to v4.2.0, and it works.