tqdm/tqdm

Native support with cloud logging

hadim opened this issue · 10 comments

hadim commented

I would like to make tqdm work when executed in a cloud environment where logs are propagated as text to external services. I am mostly talking about Kubernetes here, but I guess it also applies to any other container orchestration tool.

Consider this simple snippet using tqdm 4.64:

import sys
from tqdm import tqdm
from time import sleep

for i in tqdm(range(10)):
    sleep(5)
    sys.stderr.write("\n")

When sys.stderr.write("\n") is removed, the tqdm progress bar is only displayed at the very end of the loop, in one shot at 100%. This obviously defeats the purpose of a progress bar.

Adding sys.stderr.write("\n") triggers a log flush (or propagation) to wherever the logs are sent (the Kubernetes console, for example), so the progress bar is displayed at regular intervals.

This simple snippet is actually a workaround to make tqdm work with Kubernetes, but it does not reflect the complexity of a real stack, where progress bars can be deeply nested in the codebase and so are not "tunable" when executing a workload.

I have two questions here:

  • Is there actually another, more "automatic" workaround than sys.stderr.write("\n")?
  • Would you be willing to add to tqdm(range(10), ...) a new arg that triggers this sys.stderr.write("\n") whenever the bar must be updated? We could eventually imagine auto-detecting Kubernetes (if possible) or simply setting a boolean flag.
    • On this point, it seems important to me to add a new tqdm arg rather than a more complex solution, since in a deeply nested codebase the tqdm args are often the only thing you have control over. Auto-detection would be even better, if that is easily doable (I can dig into this if needed).
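
One possible answer to the first question, sketched under assumptions: tqdm's existing file argument accepts any object with write() and flush() methods, so a small wrapper can turn each in-place refresh into a complete line. The class name NewlineTerminatedWriter is hypothetical and not part of tqdm; the sketch relies only on each refresh being written with a leading "\r".

```python
import io
import sys


class NewlineTerminatedWriter:
    """Hypothetical file-like wrapper (not part of tqdm).

    Rewrites each carriage-return refresh as a full line so that
    line-based log collectors forward it immediately.
    """

    def __init__(self, stream=None):
        self.stream = stream if stream is not None else sys.stderr

    def write(self, s):
        # Drop the "\r" used for in-place redraws and any padding spaces.
        text = s.replace("\r", "").rstrip()
        if text:
            # Terminate the refresh with a newline so it becomes one log line.
            self.stream.write(text + "\n")
        return len(s)

    def flush(self):
        # Forward flush() when the wrapped stream supports it.
        flush = getattr(self.stream, "flush", None)
        if flush is not None:
            flush()


# Demonstration with an in-memory stream standing in for stderr.
demo = io.StringIO()
writer = NewlineTerminatedWriter(demo)
writer.write("\r 10%|#         | 1/10")
writer.write("\r 20%|##        | 2/10")
```

With tqdm installed, this would be used as tqdm(range(10), file=NewlineTerminatedWriter(sys.stderr)), which only requires control over the tqdm arguments.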
hadim commented

A naive proposal could be instead of:

def fp_write(s):
    fp.write(_unicode(s))
    fp_flush()

replace with:

def fp_write(s):
    fp.write(_unicode(s))
    fp_flush()

    if new_arg is True or kubernetes_auto_detected:
        getattr(sys.stderr, 'write', lambda x: None)("\n")
hadim commented

For the record, here is the original issue I already opened against Argo Workflows on Kubernetes: argoproj/argo-workflows#7671 (but this is clearly not an Argo issue).

hadim commented

Here is a POC monkey patch that works well:

import os
import sys  # needed by new_status_printer below
import tqdm.std
import tqdm.utils

disp_len = tqdm.utils.disp_len
_unicode = tqdm.utils._unicode


def _should_printer_print_new_line():
    in_kubernetes_env = os.environ.get("KUBERNETES_SERVICE_HOST") is not None
    tqdm_printer_new_line_enabled = os.environ.get("TQDM_PRINTER_NEW_LINE") == "true"
    return in_kubernetes_env or tqdm_printer_new_line_enabled


def new_status_printer(file):
    """
    Manage the printing and in-place updating of a line of characters.
    Note that if the string is longer than a line, then in-place
    updating may not work (it will print a new line at each refresh).
    """
    fp = file
    fp_flush = getattr(fp, "flush", lambda: None)  # pragma: no cover
    if fp in (sys.stderr, sys.stdout):
        getattr(sys.stderr, "flush", lambda: None)()
        getattr(sys.stdout, "flush", lambda: None)()

    def fp_write(s):
        fp.write(_unicode(s))
        fp_flush()

        if _should_printer_print_new_line():
            getattr(fp, "write", lambda x: None)("\n")

    last_len = [0]

    def print_status(s):
        len_s = disp_len(s)
        fp_write("\r" + s + (" " * max(last_len[0] - len_s, 0)))
        last_len[0] = len_s

    return print_status


tqdm.std.tqdm.status_printer = staticmethod(new_status_printer)

import sys
from tqdm import tqdm
from time import sleep

for i in tqdm(range(10)):
    sleep(5)
hadim commented

In the same spirit as #1318, we could also add an environment variable to enable that feature globally: TQDM_PRINTER_NEW_LINE=true
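
A minimal sketch of how such a global switch could be read, assuming the two variables used in the POC above (KUBERNETES_SERVICE_HOST and TQDM_PRINTER_NEW_LINE). The function name newline_mode_enabled and the set of accepted truthy spellings are assumptions for illustration, not tqdm behavior; the POC itself only accepts the exact string "true".

```python
import os

# Assumed spellings that count as "enabled".
_TRUTHY = {"1", "true", "yes", "on"}


def newline_mode_enabled(environ=None):
    """Return True when newline-terminated refreshes should be used.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to test.
    """
    env = os.environ if environ is None else environ
    # Kubernetes sets KUBERNETES_SERVICE_HOST inside pods.
    in_kubernetes = "KUBERNETES_SERVICE_HOST" in env
    # Explicit opt-in through the proposed environment variable.
    flag = env.get("TQDM_PRINTER_NEW_LINE", "").strip().lower()
    return in_kubernetes or flag in _TRUTHY
```

Accepting a few case-insensitive spellings is a design choice that avoids silent misconfiguration when someone sets TQDM_PRINTER_NEW_LINE=True or =1.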

hadim commented

See #1320 for a POC.

hadim commented

Ping @casperdcl for visibility.

I've used something like #1172 to get around the very same problem before.
This would have the advantage that it doesn't require code changes to activate.
To me, the scope also includes anything that doesn't output to the console. Although the Kubernetes case is console output, it streams lines rather than characters.

I had also changed the default mininterval to 1 second to avoid too many lines.
(Additionally, there was an issue with a potential duplicate line at the end.)

Hello @de-code, do you know if there are any updates on this topic?

I have not heard anything regarding my PR (it has reached its anniversary). I was considering just putting it into a separate library.

I have released a package called tqdm-loggable that automatically converts progress bars to time-interval logged messages.

When the application is running in headless mode, progress bars are not displayed; instead you get a log message at regular intervals while the progress bar is running:

tqdm_logging.py     :139  2022-09-21 12:13:56,307 Progress on:Sample progress 21.0kit/60.0kit rate:1,982.9it/s remaining:00:19 elapsed:00:10 postfix:Current time=2022-09-21 10:13:55.801787
tqdm_logging.py     :139  2022-09-21 12:14:06,392 Progress on:Sample progress 41.0kit/60.0kit rate:1,984.1it/s remaining:00:09 elapsed:00:20 postfix:Current time=2022-09-21 10:14:05.890220
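
The general idea of interval-based progress logging can be sketched with the standard library alone. This is not tqdm-loggable's API or implementation; the function name log_progress, its parameters, and the message format are assumptions made for illustration.

```python
import logging
import time


def log_progress(iterable, total, logger, interval=10.0, clock=time.monotonic):
    """Yield items from `iterable`, logging progress at most once per `interval` seconds.

    A message is also emitted once the final item has been processed.
    `clock` is injectable so the timing can be tested deterministically.
    """
    start = last = clock()
    for done, item in enumerate(iterable, start=1):
        yield item
        # Control returns here after the consumer has processed the item.
        now = clock()
        if now - last >= interval or done == total:
            logger.info("progress %d/%d elapsed:%.0fs", done, total, now - start)
            last = now
```

A loop would wrap its iterable the same way it would with tqdm, for example for item in log_progress(items, len(items), logging.getLogger(__name__)). Because each update is a complete log record, line-based collectors forward it as soon as it is emitted.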