ERROR level logs should reflect only real errors
BarakSason opened this issue · 2 comments
There are several cases in which logs are logged with ERROR level, but don't necessarily reflect real errors.
E.g:
2020-08-09 12:16:22,630 INFO (run) root@vm1 (cp): gluster volume info | egrep "^Brick[0-9]+" | grep -v "ss_brick"
2020-08-09 12:16:22,630 DEBUG (_get_ssh_connection) Retrieved connection from cache: root@vm1
2020-08-09 12:16:22,743 INFO (_log_results) �[34;1mRETCODE (root@vm1): 1�[0m
2020-08-09 12:16:22,744 INFO (_log_results) �[31;1mSTDERR (root@vm1)...
No volumes present
�[0m
2020-08-09 12:16:22,744 ERROR (get_servers_used_bricks_dict) error in getting bricklist using gluster v info
2020-08-09 12:16:22,744 INFO (run) root@vm1 (cp): gluster volume info testvol_distributed --xml
2020-08-09 12:16:22,744 DEBUG (_get_ssh_connection) Retrieved connection from cache: root@vm1
2020-08-09 12:16:22,917 INFO (_log_results) �[34;1mRETCODE (root@vm1): 0�[0m
2020-08-09 12:16:22,918 DEBUG (_log_results) �[47;30;1mSTDOUT (root@vm1)...
This log doesn't terminate the run, but will actually trigger volume creation (some lines later from the same log):
2020-08-09 12:16:22,918 INFO (run) root@vm1 (cp): gluster volume create testvol_distributed vm1:/mnt/disk2/testvol_distributed_brick0 vm2:/mnt/disk2/testvol_distributed_brick1 vm3:/mnt/disk2/testvol_distributed_brick2 --mode=script force
...
2020-08-09 12:16:26,261 INFO (run) root@vm1 (cp): gluster vol status testvol_distributed
2020-08-09 12:16:26,261 DEBUG (_get_ssh_connection) Retrieved connection from cache: root@vm1
2020-08-09 12:16:26,373 INFO (_log_results) �[34;1mRETCODE (root@vm1): 0�[0m
2020-08-09 12:16:26,373 INFO (_log_results) �[47;30;1mSTDOUT (root@vm1)...
Status of volume: testvol_distributed
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick vm1:/mnt/disk2/testvol_distributed_br
ick0 49152 0 Y 19881
Brick vm2:/mnt/disk2/testvol_distributed_br
ick1 49152 0 Y 15478
Brick vm3:/mnt/disk2/testvol_distributed_br
ick2 49152 0 Y 15516
Task Status of Volume testvol_distributed
------------------------------------------------------------------------------
There are no active volume tasks
Error logs should only reflect failed operation which causes termination of the current test that's being run or the framework itself.
Ideally this is a glusto limitation and not a glusto-tests limitation, for this to be implemented we need to figure out if a command failure is expected or not and if it's expected to fail then not mark it as a Error. This would require us to change everything which seems like a difficult fix and isn't really giving any value to use.
@aloganat Please share your views on this and feel free to close it as WONTFIX.
run() method in glusto calls python ssh_obj.popen(). In some cases the output of popen() is dumped in the stderr pipe instead of stdout pipe and run() directly returns returncode, standard output, standard error of popen method.
This is a limitation of glusto, hence marking this issue as WONTFIX.