gluster/glusto-tests

ERROR level logs should reflect only real errors

BarakSason opened this issue · 2 comments

There are several cases in which logs are logged with ERROR level, but don't necessarily reflect real errors.
E.g:

2020-08-09 12:16:22,630 INFO (run) root@vm1 (cp): gluster volume info | egrep "^Brick[0-9]+" | grep -v "ss_brick"
2020-08-09 12:16:22,630 DEBUG (_get_ssh_connection) Retrieved connection from cache: root@vm1
2020-08-09 12:16:22,743 INFO (_log_results) �[34;1mRETCODE (root@vm1): 1�[0m
2020-08-09 12:16:22,744 INFO (_log_results) �[31;1mSTDERR (root@vm1)...
No volumes present
�[0m
2020-08-09 12:16:22,744 ERROR (get_servers_used_bricks_dict) error in getting bricklist using gluster v info
2020-08-09 12:16:22,744 INFO (run) root@vm1 (cp): gluster volume info testvol_distributed --xml
2020-08-09 12:16:22,744 DEBUG (_get_ssh_connection) Retrieved connection from cache: root@vm1
2020-08-09 12:16:22,917 INFO (_log_results) �[34;1mRETCODE (root@vm1): 0�[0m
2020-08-09 12:16:22,918 DEBUG (_log_results) �[47;30;1mSTDOUT (root@vm1)...

This log doesn't terminate the run, but will actually trigger volume creation (some lines later from the same log):

2020-08-09 12:16:22,918 INFO (run) root@vm1 (cp): gluster volume create testvol_distributed        vm1:/mnt/disk2/testvol_distributed_brick0 vm2:/mnt/disk2/testvol_distributed_brick1 vm3:/mnt/disk2/testvol_distributed_brick2 --mode=script force
...
2020-08-09 12:16:26,261 INFO (run) root@vm1 (cp): gluster vol status testvol_distributed  
2020-08-09 12:16:26,261 DEBUG (_get_ssh_connection) Retrieved connection from cache: root@vm1
2020-08-09 12:16:26,373 INFO (_log_results) �[34;1mRETCODE (root@vm1): 0�[0m
2020-08-09 12:16:26,373 INFO (_log_results) �[47;30;1mSTDOUT (root@vm1)...
Status of volume: testvol_distributed
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick vm1:/mnt/disk2/testvol_distributed_br
ick0                                        49152     0          Y       19881
Brick vm2:/mnt/disk2/testvol_distributed_br
ick1                                        49152     0          Y       15478
Brick vm3:/mnt/disk2/testvol_distributed_br
ick2                                        49152     0          Y       15516
 
Task Status of Volume testvol_distributed
------------------------------------------------------------------------------
There are no active volume tasks

Error logs should only reflect failed operation which causes termination of the current test that's being run or the framework itself.

Ideally this is a glusto limitation and not a glusto-tests limitation, for this to be implemented we need to figure out if a command failure is expected or not and if it's expected to fail then not mark it as a Error. This would require us to change everything which seems like a difficult fix and isn't really giving any value to use.

@aloganat Please share your views on this and feel free to close it as WONTFIX.

run() method in glusto calls python ssh_obj.popen(). In some cases the output of popen() is dumped in the stderr pipe instead of stdout pipe and run() directly returns returncode, standard output, standard error of popen method.

This is a limitation of glusto, hence marking this issue as WONTFIX.